Arabidopsis Protocols

Methods in Molecular Biology 1062
Jose J. Sanchez-Serrano
Julio Salinas
Editors
Arabidopsis Protocols
Third Edition
METHODS IN MOLECULAR BIOLOGY™
Series Editor
John M. Walker
School of Life Sciences
University of Hertfordshire
Hatfield, Hertfordshire, AL10 9AB, UK
For further volumes:
http://www.springer.com/series/7651
Arabidopsis Protocols
Third Edition
Edited by
Jose J. Sanchez-Serrano
Centro Nacional de Biotecnología, CSIC, Madrid, Spain
Julio Salinas
Departamento Biologia de Plantas, Centro de Investigaciones Biologicas, CSIC, Madrid, Spain
ISSN 1064-3745
ISSN 1940-6029 (electronic)
ISBN 978-1-62703-579-8
ISBN 978-1-62703-580-4 (eBook)
DOI 10.1007/978-1-62703-580-4
Springer New York Heidelberg Dordrecht London
Library of Congress Control Number: 2013948230
© Springer Science+Business Media New York 2014
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction
on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation,
computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this
legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for
the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.
Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the
Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions
for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution
under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not
imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and
regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither
the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be
made. The publisher makes no warranty, express or implied, with respect to the material contained herein.
Printed on acid-free paper
Humana Press is a brand of Springer
Springer is part of Springer Science+Business Media (www.springer.com)
Preface
At present, the scientific community acknowledges Arabidopsis thaliana as the most important
plant model system. Over recent years, the continuous efforts of plant scientists
have led to the generation of a vast array of biological tools and to the development and
optimization of research methodology, which together have prompted the generation of a
massive amount of highly valuable experimental data. Both scientific information and
biological materials have been made efficiently accessible through shared public/private
resources such as TAIR and the various biological stock centers, in a praiseworthy example
of collaboration for the optimal use of scientific resources. These initiatives have fueled
investigation in essentially every aspect of plant biology.
Arabidopsis research has thus fundamentally influenced our understanding of the basic
biology and ecology of plants. Importantly, the knowledge gained from this model
species is already being translated to other plants, particularly crops, at an ever-faster
pace. It is expected that this transfer will continue to help satisfy the increasing demand for
improved agricultural products, including food, fiber, and biofuel. Moreover,
Arabidopsis is becoming an important model system for researchers studying other
multicellular organisms, who recognize the advantages of this experimental system for the
elucidation of basic, universal biological questions.
We have prepared this third edition of Arabidopsis Protocols in an effort to compile
some of the most recent methodology developed to exploit the Arabidopsis genome. To
this end, we have relied on the experience of a significant group of leading experts in the
methodologies described. These methodologies range from guided access to public resources
to genetic, cell biological, biochemical, and physiological techniques, including both those
that are widely used and novel ones likely to open new avenues of knowledge in the
near future. In addition, considering the recent unparalleled progress of the “omics” tools
in Arabidopsis, we include sections on genome, transcriptome, proteome, metabolome,
and other whole-system approaches.
As in previous editions, we have tried to present a collection of step-by-step protocols
described in sufficient detail to be followed by both experienced researchers and
beginners. Finally, we would like to thank all our contributing colleagues, whose expertise
and effort have been essential for attaining the highest scientific standard in this book.
Madrid, Spain
Jose J. Sanchez-Serrano
Julio Salinas
Contents
Preface
Contributors
PART I
GROWING ARABIDOPSIS
1 Handling Arabidopsis Plants: Growth, Preservation of Seeds, Transformation, and Genetic Crosses
Luz Rivero, Randy Scholl, Nicholas Holomuzki, Deborah Crist, Erich Grotewold, and Jelena Brkljacic
2 Using Arabidopsis-Related Model Species (ARMS): Growth, Genetic Transformation, and Comparative Genomics
Giorgia Batelli, Dong-Ha Oh, Matilde Paino D’Urzo, Francesco Orsini, Maheshi Dassanayake, Jian-Kang Zhu, Hans J. Bohnert, Ray A. Bressan, and Albino Maggio
3 Growing Arabidopsis In Vitro: Cell Suspensions, In Vitro Culture, and Regeneration
Bronwyn J. Barkla, Rosario Vera-Estrella, and Omar Pantoja
PART II
ARABIDOPSIS RESOURCES
4 Arabidopsis Database and Stock Resources
Donghui Li, Kate Dreher, Emma Knee, Jelena Brkljacic, Erich Grotewold, Tanya Z. Berardini, Philippe Lamesch, Margarita Garcia-Hernandez, Leonore Reiser, and Eva Huala
5 Bioinformatic Tools in Arabidopsis Research
Miguel de Lucas, Nicholas J. Provart, and Siobhan M. Brady
PART III
GENETIC TECHNIQUES
6 Exploiting Natural Variation in Arabidopsis
Johanna A. Molenaar and Joost J.B. Keurentjes
7 Grafting in Arabidopsis
Katherine Bainbridge, Tom Bennett, Peter Crisp, Ottoline Leyser, and Colin Turnbull
8 Agrobacterium tumefaciens-Mediated Transient Transformation of Arabidopsis thaliana Leaves
Silvina Mangano, Cintia Daniela Gonzalez, and Silvana Petruccelli
9 iTILLING: Personalized Mutation Screening
Susan M. Bush and Patrick J. Krysan
10 Tailor-Made Mutations in Arabidopsis Using Zinc Finger Nucleases
Yiping Qi, Colby G. Starker, Feng Zhang, Nicholas J. Baltes, and Daniel F. Voytas
11 The Use of Artificial MicroRNA Technology to Control Gene Expression in Arabidopsis thaliana
Andrew L. Eamens, Marcus McHale, and Peter M. Waterhouse
12 Generation and Identification of Arabidopsis EMS Mutants
Li-Jia Qu and Genji Qin
13 Generation and Characterization of Arabidopsis T-DNA Insertion Mutants
Li-Jia Qu and Genji Qin
14 Identification of EMS-Induced Causal Mutations in Arabidopsis thaliana by Next-Generation Sequencing
Naoyuki Uchida, Tomoaki Sakamoto, Masao Tasaka, and Tetsuya Kurata
15 Arabidopsis Transformation with Large Bacterial Artificial Chromosomes
Jose M. Alonso and Anna N. Stepanova
16 Global DNA Methylation Analysis Using Methyl-Sensitive Amplification Polymorphism (MSAP)
Mahmoud W. Yaish, Mingsheng Peng, and Steven J. Rothstein
PART IV
MOLECULAR BIOLOGICAL TECHNIQUES
17 Next-Generation Mapping of Genetic Mutations Using Bulk Population Sequencing
Ryan S. Austin, Steven P. Chatfield, Darrell Desveaux, and David S. Guttman
18 Chemical Fingerprinting of Arabidopsis Using Fourier Transform Infrared (FT-IR) Spectroscopic Approaches
András Gorzsás and Björn Sundberg
19 A Pipeline for 15N Metabolic Labeling and Phosphoproteome Analysis in Arabidopsis thaliana
Benjamin B. Minkoff, Heather L. Burch, and Michael R. Sussman
20 Gene Expression Profiling Using DNA Microarrays
Kyonoshin Maruyama, Kazuko Yamaguchi-Shinozaki, and Kazuo Shinozaki
21 Forward Chemical Genetic Screening
Hyunmo Choi, Jun-Young Kim, Young Tae Chang, and Hong Gil Nam
22 Highly Reproducible ChIP-on-Chip Analysis to Identify Genome-Wide Protein Binding and Chromatin Status in Arabidopsis thaliana
Jong-Myong Kim, Taiko Kim To, Maho Tanaka, Takaho A. Endo, Akihiro Matsui, Junko Ishida, Fiona C. Robertson, Tetsuro Toyoda, and Motoaki Seki
PART V
CELL BIOLOGICAL TECHNIQUES
23 Fluorescence Microscopy
Sébastien Peter, Klaus Harter, and Frank Schleifenbaum
24 Immunocytochemical Fluorescent In Situ Visualization of Proteins In Arabidopsis
Yohann Boutté and Markus Grebe
25 High-Pressure Freezing and Freeze Substitution of Arabidopsis for Electron Microscopy
Jotham R. Austin II
26 Applications of Fluorescent Marker Proteins in Plant Cell Biology
Michael R. Blatt and Christopher Grefen
27 Flow Cytometry and Sorting in Arabidopsis
David W. Galbraith
28 Live Imaging of Arabidopsis Development
Daniel von Wangenheim, Gabor Daum, Jan U. Lohmann, Ernst K. Stelzer, and Alexis Maizel
29 Arabidopsis Organelle Isolation and Characterization
Nicolas L. Taylor, Elke Ströher, and A. Harvey Millar
PART VI
BIOCHEMICAL AND PHYSIOLOGICAL TECHNIQUES
30 Analysis of Subcellular Metabolite Distributions Within Arabidopsis thaliana Leaf Tissue: A Primer for Subcellular Metabolomics
Stephan Krueger, Dirk Steinhauser, Jan Lisec, and Patrick Giavalisco
31 Hormone Profiling
Gaetan Glauser, Armelle Vallat, and Dirk Balmer
32 Purification of Protein Complexes and Characterization of Protein-Protein Interactions
Kirby N. Swatek, Chris B. Lee, and Jay J. Thelen
33 Protein Fragment Bimolecular Fluorescence Complementation Analyses for the In Vivo Study of Protein-Protein Interactions and Cellular Protein Complex Localizations
Rainer Waadt, Kathrin Schlücking, Julian I. Schroeder, and Jörg Kudla
34 The Split-Ubiquitin System for the Analysis of Three-Component Interactions
Christopher Grefen
35 RNA-Binding Protein Immunoprecipitation from Whole-Cell Extracts
Tino Köster and Dorothee Staiger
36 High-Throughput Analysis of Protein-DNA Binding Affinity
José M. Franco-Zorrilla and Roberto Solano
Index
Contributors
JOSE M. ALONSO • Department of Genetics, North Carolina State University, Raleigh,
NC, USA
JOTHAM R. AUSTIN II • Advanced Electron Microscopy Facility, Department of Molecular
Genetics and Cell Biology, University of Chicago, Chicago, IL, USA
RYAN S. AUSTIN • Southern Crop Protection and Food Research Centre,
Agriculture & Agri-Food Canada, London, ON, Canada; Department of Cell &
Systems Biology, University of Toronto, Toronto, ON, Canada
KATHERINE BAINBRIDGE • Department of Biology, University of York, York, UK
DIRK BALMER • Laboratory of Molecular and Cell Biology, Institute of Biology, University of
Neuchâtel, Neuchâtel, Switzerland
NICHOLAS J. BALTES • Department of Genetics, Cell Biology & Development and Center
for Genome Engineering, University of Minnesota, Minneapolis, MN, USA
BRONWYN J. BARKLA • Instituto de Biotecnología, Universidad Nacional Autónoma de
México, Cuernavaca, Morelos, Mexico
GIORGIA BATELLI • CNR-IGV Institute of Plant Genetics, Portici, Italy
TOM BENNETT • Department of Biology, University of York, York, UK
TANYA Z. BERARDINI • Department of Plant Biology, Carnegie Institution for Science,
Stanford, CA, USA
MICHAEL R. BLATT • Laboratory of Plant Physiology and Biophysics, University of Glasgow,
Glasgow, UK
HANS J. BOHNERT • Department of Plant Biology, University of Illinois at Urbana-Champaign,
Urbana, IL, USA; Division of Applied Science, Gyeongsang National University, Jinju,
South Korea; College of Science, King Abdulaziz University, Jeddah, Kingdom of
Saudi Arabia
YOHANN BOUTTÉ • Department of Forest Genetics and Plant Physiology, UPSC, Swedish
University of Agricultural Sciences, Umeå, Sweden; Membrane biogenesis laboratory,
CNRS, UMR5200, Victor Ségalen Bordeaux 2 University, Bordeaux, France
SIOBHAN M. BRADY • Department of Plant Biology and Genome Center, UC Davis, Davis,
CA, USA
RAY A. BRESSAN • Division of Applied Science, Gyeongsang National University,
Jinju, South Korea; Department of Horticulture and Landscape Architecture,
Purdue University, West Lafayette, IN, USA; College of Science, King Abdulaziz
University, Jeddah, Kingdom of Saudi Arabia
JELENA BRKLJACIC • Arabidopsis Biological Resource Center, The Ohio State University,
Columbus, OH, USA
HEATHER L. BURCH • Biotechnology Center, University of Wisconsin-Madison, Madison,
WI, USA
SUSAN M. BUSH • Department of Plant Biology, University of California-Davis, Davis,
CA, USA
YOUNG TAE CHANG • Department of Chemistry, National University of Singapore,
Singapore, Singapore
STEVEN P. CHATFIELD • Department of Cell & Systems Biology, University of Toronto,
Toronto, ON, Canada
HYUNMO CHOI • Department of Life Science, Pohang University of Science and Technology,
Pohang, Republic of Korea
PETER CRISP • Research School of Biology, Australian National University, Canberra,
ACT, Australia
DEBORAH CRIST • Arabidopsis Biological Resource Center, Center for Applied Plant
Sciences, Department of Molecular Genetics, The Ohio State University, Columbus,
OH, USA
MATILDE PAINO D’URZO • Department of Horticulture and Landscape Architecture,
Purdue University, West Lafayette, IN, USA
MAHESHI DASSANAYAKE • Department of Plant Biology, University of Illinois at
Urbana-Champaign, Urbana, IL, USA
GABOR DAUM • Department of Stem Cell Biology, University of Heidelberg,
Heidelberg, Germany; Centre for Organismal Studies, University of Heidelberg,
Heidelberg, Germany
MIGUEL DE LUCAS • Department of Plant Biology and Genome Center, UC Davis,
Davis, CA, USA
DARRELL DESVEAUX • Department of Cell & Systems Biology, University of Toronto,
Toronto, ON, Canada; Centre for the Analysis of Genome Evolution & Function,
University of Toronto, Toronto, ON, Canada
KATE DREHER • Department of Plant Biology, Carnegie Institution for Science, Stanford,
CA, USA
ANDREW L. EAMENS • School of Environmental and Life Sciences, University of Newcastle,
Callaghan, NSW, Australia
TAKAHO A. ENDO • RIKEN Bioinformatics and Systems Engineering Division,
Yokohama, Japan
JOSÉ M. FRANCO-ZORRILLA • Genomics Unit, Centro Nacional de Biotecnología-CSIC,
Madrid, Spain
DAVID W. GALBRAITH • School of Plant Sciences, University of Arizona, Tucson, AZ, USA
MARGARITA GARCIA-HERNANDEZ • Department of Plant Biology, Carnegie Institution for
Science, Stanford, CA, USA
PATRICK GIAVALISCO • Department of Molecular Physiology, Max Planck Institute of
Molecular Plant Physiology, Potsdam-Golm, Germany
GAETAN GLAUSER • Chemical Analytical Service of the Swiss Plant Science Web, Institute of
Biology, University of Neuchâtel, Neuchâtel, Switzerland
CINTIA DANIELA GONZALEZ • Departamento de Ciencias Biológicas, Facultad de Ciencias
Exactas, Centro de Investigación y Desarrollo en Criotecnología de Alimentos
(CIDCA)-CCT-La Plata-CONICET, Universidad de La Plata, La Plata, Argentina
ANDRÁS GORZSÁS • Department of Chemistry, Umeå University, Umeå, Sweden
MARKUS GREBE • Department of Plant Physiology, Umeå Plant Science Centre (UPSC),
Umeå University, Fysiologihuset Byggnad L, Umeå, Sweden
CHRISTOPHER GREFEN • Emmy Noether Research Group Leader, ZMBP, Developmental
Genetics, Tuebingen, Germany
ERICH GROTEWOLD • Arabidopsis Biological Resource Center, The Ohio State University,
Columbus, OH, USA
DAVID S. GUTTMAN • Department of Cell & Systems Biology, University of Toronto, Toronto,
ON, Canada; Centre for the Analysis of Genome Evolution & Function, University of
Toronto, Toronto, ON, Canada
KLAUS HARTER • Center for Plant Molecular Biology, University of Tuebingen,
Tuebingen, Germany
NICHOLAS HOLOMUZKI • Arabidopsis Biological Resource Center, Center for Applied Plant
Sciences, Department of Molecular Genetics, The Ohio State University, Columbus,
OH, USA
EVA HUALA • Department of Plant Biology, Carnegie Institution for Science, Stanford,
CA, USA
JUNKO ISHIDA • Plant Genomic Network Research Team, RIKEN Plant Science Center,
Yokohama, Japan
JOOST J.B. KEURENTJES • Laboratory of Genetics, Wageningen University, Wageningen,
The Netherlands
JONG-MYONG KIM • Plant Genomic Network Research Team, RIKEN Plant Science
Center, Yokohama, Japan
JUN-YOUNG KIM • Department of Chemistry, National University of Singapore,
Singapore, Singapore
EMMA KNEE • Arabidopsis Biological Resource Center, The Ohio State University,
Columbus, OH, USA
TINO KÖSTER • Department of Molecular Cell Physiology, Institute for Genome Research
and Systems Biology, University of Bielefeld, Bielefeld, Germany
STEPHAN KRUEGER • Botanical Institute II, University of Cologne, Cologne, Germany
PATRICK J. KRYSAN • Department of Horticulture and Genome Center of Wisconsin,
University of Wisconsin-Madison, Madison, WI, USA
JÖRG KUDLA • Molekulargenetik und Zellbiologie der Pflanzen, Institut für Biologie und
Biotechnologie der Pflanzen, Universität Münster, Münster, Germany
TETSUYA KURATA • Plant Global Education Project, Graduate School of Biological Sciences,
Nara Institute of Science and Technology, Ikoma, Japan
PHILIPPE LAMESCH • Department of Plant Biology, Carnegie Institution for Science,
Stanford, CA, USA
CHRIS B. LEE • Department of Biochemistry, Life Sciences Center, University of Missouri,
Columbia, MO, USA
OTTOLINE LEYSER • Department of Biology, University of York, York, UK; Sainsbury
Laboratory, University of Cambridge, Cambridge, UK
DONGHUI LI • Department of Plant Biology, Carnegie Institution for Science, Stanford,
CA, USA
JAN LISEC • Department of Molecular Physiology, Max Planck Institute of Molecular
Plant Physiology, Potsdam-Golm, Germany
JAN U. LOHMANN • Department of Stem Cell Biology, University of Heidelberg,
Heidelberg, Germany; Centre for Organismal Studies, University of Heidelberg,
Heidelberg, Germany
ALBINO MAGGIO • Department of Agricultural Engineering and Agronomy, University of
Naples Federico II, Portici, Italy
ALEXIS MAIZEL • Centre for Organismal Studies, University of Heidelberg,
Heidelberg, Germany
SILVINA MANGANO • Departamento de Ciencias Biológicas, Facultad de Ciencias Exactas,
Centro de Investigación y Desarrollo en Criotecnología de Alimentos (CIDCA)-CCT-La
Plata-CONICET, Universidad de La Plata, La Plata, Argentina
KYONOSHIN MARUYAMA • Biological Resources and Post-harvest Division, Japan
International Research Center for Agricultural Sciences, Tsukuba, Ibaraki, Japan
AKIHIRO MATSUI • Plant Genomic Network Research Team, RIKEN Plant Science Center,
Yokohama, Japan
MARCUS MCHALE • School of Molecular Sciences, University of Sydney, Sydney,
NSW, Australia
A. HARVEY MILLAR • ARC Centre of Excellence in Plant Energy Biology and Centre for
Comparative Analysis of Biomolecular Networks (CABiN), The University of Western
Australia, Crawley, WA, Australia
BENJAMIN B. MINKOFF • Department of Biochemistry, University of Wisconsin-Madison,
Madison, WI, USA
JOHANNA A. MOLENAAR • Laboratory of Plant Physiology, Wageningen University,
Wageningen, The Netherlands
HONG GIL NAM • Academy of New Biology for Plant Senescence and Life History, Institute
for Basic Science & Department of New Biology, Daegu Gyeongbuk Institute of Science
and Technology, Dalseong-Gun, Daegu, Republic of Korea
DONG-HA OH • Department of Plant Biology, University of Illinois at Urbana-Champaign,
Urbana, IL, USA; Division of Applied Science, Gyeongsang National University, Jinju,
South Korea
FRANCESCO ORSINI • Department of Agro-Environmental Sciences and Technology,
University of Bologna, Bologna, Italy
OMAR PANTOJA • Instituto de Biotecnología, Universidad Nacional Autónoma de México,
Cuernavaca, Morelos, Mexico
MINGSHENG PENG • Monsanto Company, Chesterfield, MO, USA
SÉBASTIEN PETER • Center for Plant Molecular Biology, University of Tuebingen,
Tuebingen, Germany
SILVANA PETRUCCELLI • Departamento de Ciencias Biológicas, Facultad de Ciencias
Exactas, Centro de Investigación y Desarrollo en Criotecnología de Alimentos
(CIDCA)-CCT-La Plata-CONICET, Universidad de La Plata, La Plata, Argentina
NICHOLAS J. PROVART • Department of Cell & Systems Biology, Centre for the Analysis of
Genome Evolution and Function, Toronto, ON, Canada
YIPING QI • Department of Genetics, Cell Biology & Development and Center for Genome
Engineering, University of Minnesota, Minneapolis, MN, USA
GENJI QIN • State Key Laboratory of Protein and Plant Gene Research, Center for Life
Sciences, College of Life Sciences, Peking University, Beijing, People’s Republic of China
LI-JIA QU • State Key Laboratory of Protein and Plant Gene Research, Center for Life
Sciences, College of Life Sciences, Peking University, Beijing, People’s Republic of China
LEONORE REISER • Department of Plant Biology, Carnegie Institution for Science, Stanford,
CA, USA
LUZ RIVERO • Arabidopsis Biological Resource Center, Center for Applied Plant Sciences,
Department of Molecular Genetics, The Ohio State University, Columbus, OH, USA
FIONA C. ROBERTSON • Plant Genomic Network Research Team, RIKEN Plant Science
Center, Yokohama, Japan
STEVEN J. ROTHSTEIN • Department of Molecular and Cellular Biology, University of
Guelph, Guelph, ON, Canada
TOMOAKI SAKAMOTO • Plant Global Education Project, Graduate School of Biological
Sciences, Nara Institute of Science and Technology, Ikoma, Japan
FRANK SCHLEIFENBAUM • Center for Plant Molecular Biology, University of Tuebingen,
Tuebingen, Germany; Berthold Technologies GmbH & Co KG, Bad Wildbad, Germany
KATHRIN SCHLÜCKING • Molekulargenetik und Zellbiologie der Pflanzen, Institut für
Biologie und Biotechnologie der Pflanzen, Universität Münster, Münster, Germany
RANDY SCHOLL • Arabidopsis Biological Resource Center, Center for Applied Plant Sciences,
Department of Molecular Genetics, The Ohio State University, Columbus, OH, USA
JULIAN I. SCHROEDER • Division of Biological Sciences, Cell and Developmental Biology
Section and Center for Food and Fuel for the 21st Century, University of California San
Diego, La Jolla, CA, USA
MOTOAKI SEKI • Plant Genomic Network Research Team, RIKEN Center for Sustainable
Resource Science, Yokohama, Japan; Kihara Institute for Biological Research, Yokohama
City University, Yokohama, Japan
KAZUO SHINOZAKI • RIKEN Center for Sustainable Resource Science, Suehiro-cho,
Tsurumi-ku, Yokohama, Japan
ROBERTO SOLANO • Department of Plant Molecular Genetics, Centro Nacional de
Biotecnología-CSIC, Madrid, Spain
DOROTHEE STAIGER • Department of Molecular Cell Physiology, Institute for Genome
Research and Systems Biology, University of Bielefeld, Bielefeld, Germany
COLBY G. STARKER • Department of Genetics, Cell Biology & Development and Center
for Genome Engineering, University of Minnesota, Minneapolis, MN, USA
DIRK STEINHAUSER • Department of Molecular Physiology, Max Planck Institute of
Molecular Plant Physiology, Potsdam-Golm, Germany
ERNST K. STELZER • Physical Biology, Frankfurt Institute for Molecular Life Sciences
(FMLS), Goethe Universität Frankfurt am Main, Frankfurt am Main, Germany
ANNA N. STEPANOVA • Department of Genetics, North Carolina State University, Raleigh,
NC, USA
ELKE STRÖHER • ARC Centre of Excellence in Plant Energy Biology and Centre for
Comparative Analysis of Biomolecular Networks (CABiN), The University of Western
Australia, Crawley, WA, Australia
BJÖRN SUNDBERG • Department of Forest Genetics and Plant Physiology, Swedish University
of Agricultural Sciences, Umeå, Sweden
MICHAEL R. SUSSMAN • Department of Biochemistry, University of Wisconsin-Madison,
Madison, WI, USA; Biotechnology Center, University of Wisconsin-Madison, Madison,
WI, USA
KIRBY N. SWATEK • Department of Biochemistry, Life Sciences Center, University of
Missouri, Columbia, MO, USA
MAHO TANAKA • Plant Genomic Network Research Team, RIKEN Plant Science Center,
Yokohama, Japan
MASAO TASAKA • Graduate School of Biological Sciences, Nara Institute of Science
and Technology, Ikoma, Japan
NICOLAS L. TAYLOR • ARC Centre of Excellence in Plant Energy Biology and Centre for
Comparative Analysis of Biomolecular Networks (CABiN), The University of Western
Australia, Crawley, WA, Australia
JAY J. THELEN • Department of Biochemistry, Life Sciences Center, University of Missouri,
Columbia, MO, USA
TAIKO KIM TO • Plant Genomic Network Research Team, RIKEN Plant Science Center,
Yokohama, Japan; Department of Integrated Genetics, National Institute of Genetics,
Mishima, Japan
TETSURO TOYODA • RIKEN Bioinformatics and Systems Engineering Division,
Yokohama, Japan
COLIN TURNBULL • Division of Cell & Molecular Biology, Imperial College of London,
London, UK
NAOYUKI UCHIDA • Graduate School of Biological Sciences, Nara Institute of Science
and Technology, Ikoma, Japan
ARMELLE VALLAT • Service Analytique Facultaire, Institute of Chemistry, University of
Neuchâtel, Neuchâtel, Switzerland
ROSARIO VERA-ESTRELLA • Instituto de Biotecnología, Universidad Nacional Autónoma de
México, Cuernavaca, Morelos, Mexico
DANIEL F. VOYTAS • Department of Genetics, Cell Biology and Development, Center for
Genome Engineering, University of Minnesota, Minneapolis, MN, USA
RAINER WAADT • Division of Biological Sciences, Cell and Developmental Biology Section and
Center for Food and Fuel for the 21st Century, University of California San Diego,
La Jolla, CA, USA
DANIEL VON WANGENHEIM • Physical Biology, Frankfurt Institute for Molecular Life
Sciences (FMLS), Goethe Universität Frankfurt am Main, Frankfurt am Main,
Germany
PETER M. WATERHOUSE • School of Molecular Sciences, University of Sydney, Sydney,
NSW, Australia
MAHMOUD W. YAISH • Department of Biology, College of Science, Sultan Qaboos University,
Muscat, Oman
KAZUKO YAMAGUCHI-SHINOZAKI • Biological Resources and Post-harvest Division, Japan
International Research Center for Agricultural Sciences, Tsukuba, Ibaraki, Japan;
Laboratory of Plant Molecular Physiology, Graduate School of Agricultural and Life
Sciences, The University of Tokyo, Tokyo, Japan
FENG ZHANG • Cellectis Plant Sciences, St. Paul, MN, USA
JIAN-KANG ZHU • Department of Horticulture and Landscape Architecture, Purdue
University, West Lafayette, IN, USA
Part I
Growing Arabidopsis
Chapter 1
Handling Arabidopsis Plants: Growth, Preservation
of Seeds, Transformation, and Genetic Crosses
Luz Rivero, Randy Scholl, Nicholas Holomuzki, Deborah Crist,
Erich Grotewold, and Jelena Brkljacic
Abstract
Growing healthy plants is essential for the advancement of Arabidopsis thaliana (Arabidopsis) research.
Over the last 20 years, the Arabidopsis Biological Resource Center (ABRC) has collected and developed a
series of best-practice protocols, some of which are presented in this chapter. Arabidopsis can be grown in
a variety of locations, growth media, and environmental conditions. Most laboratory accessions and their
mutant or transgenic derivatives flower after 4–5 weeks and set seeds after 7–8 weeks, under standard
growth conditions (soil, long day, 23 °C). Some mutant genotypes, natural accessions, and Arabidopsis
relatives require strict control of growth conditions best provided by growth rooms, chambers, or incubators. Other lines can be grown in less-controlled greenhouse settings. Although the majority of lines can
be grown in soil, certain experimental purposes require utilization of sterile solid or liquid growth media.
These include the selection of primary transformants, identification of homozygous lethal individuals in a
segregating population, or bulking of a large amount of plant material. The importance of controlling,
observing, and recording growth conditions is emphasized and appropriate equipment required to perform monitoring of these conditions is listed. Proper conditions for seed harvesting and preservation, as
well as seed quality control, are also described. Plant transformation and genetic crosses, two of the methods that revolutionized Arabidopsis genetics, are introduced as well.
Key words Arabidopsis, Growth conditions, Environmental conditions, Natural accession, Seed
germination, Seed quality, Plant transformation, Genetic crosses
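The timings quoted in the abstract (flowering after 4–5 weeks and seed set after 7–8 weeks under standard conditions) can be turned into a simple planning aid. The following is a minimal sketch, not part of the ABRC protocols: it assumes that both intervals are counted from the sowing date, and the names `Window`, `expected_window`, and `growth_schedule` are hypothetical helpers introduced only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta

# Ranges quoted in the abstract for most laboratory accessions under
# standard conditions (soil, long day, 23 °C).
# Assumption: both ranges are measured in weeks from the sowing date.
FLOWERING_WEEKS = (4, 5)
SEED_SET_WEEKS = (7, 8)


@dataclass(frozen=True)
class Window:
    """An inclusive range of calendar dates."""
    earliest: date
    latest: date


def expected_window(sown: date, weeks: tuple[int, int]) -> Window:
    """Return the date range lying `weeks` (min, max) after sowing."""
    low, high = weeks
    return Window(sown + timedelta(weeks=low), sown + timedelta(weeks=high))


def growth_schedule(sown: date) -> dict[str, Window]:
    """Map each milestone to its expected date window (illustrative only)."""
    return {
        "flowering": expected_window(sown, FLOWERING_WEEKS),
        "seed_set": expected_window(sown, SEED_SET_WEEKS),
    }


if __name__ == "__main__":
    # Example sowing date chosen arbitrarily for the demonstration.
    for stage, window in growth_schedule(date(2024, 3, 4)).items():
        print(f"{stage}: {window.earliest} to {window.latest}")
```

Such a schedule is only a rough guide: as the abstract notes, some mutant genotypes, natural accessions, and Arabidopsis relatives may deviate from these timings, so recorded observations should take precedence over the computed dates.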
1 Introduction
Healthy growth and development of plants is a prerequisite for
accurate and reproducible plant research and Arabidopsis thaliana
(Arabidopsis) is no exception. Proper handling and maintenance of
Arabidopsis plants also enables a high rate of seed production. In
this chapter, we describe basic, best-practice protocols needed for
handling Arabidopsis. The reader should be aware, however, that
most of the commonly used growth environmental conditions,
particularly in greenhouses, may not be similar to the ones in the
native habitats of some natural accessions. This is especially
important for interpreting phenotypic differences of traits that are
known to be strongly influenced by the natural habitat, such as
flowering time. Therefore, the protocols described here should be
taken only as a guide for the experimental setup and design.
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_1, © Springer Science+Business Media New York 2014
This chapter will address (1) the growth of Arabidopsis plants in
a variety of environmental settings including growth chambers and
greenhouses, as well as in vitro, (2) critical and optimal conditions
to grow healthy Arabidopsis plants, including quality control measures, (3) harvesting, seed preservation, and seed quality control,
(4) genetic crosses, and (5) transformation with Agrobacterium
tumefaciens (Agrobacterium). Significant emphasis is placed on the
equipment required for controlling and monitoring environmental
conditions during plant growth. The plant and seed management
protocols are given in chronological order.
2 Materials
2.1 Plant Growth and Seed Harvest
1. Arabidopsis seeds can be obtained from the public stock centers:
Arabidopsis Biological Resource Center (ABRC, abrc.osu.edu),
European Arabidopsis Stock Centre (NASC, arabidopsis.info),
RIKEN BioResource Center (RIKEN BRC, www.brc.riken.jp/inf/en),
French National Institute for Agricultural Research
(INRA, cnrgv.toulouse.inra.fr/en), and other laboratory
sources [1] and private sources such as Lehle Seeds (arabidopsis.com).
2. Sterile plastic Petri dishes (plates) (10 or 15 cm diameter).
3. Murashige and Skoog basal salt mixture (MS).
4. 2-(N-Morpholino) ethanesulfonic acid (MES).
5. Agar granulated.
6. Sucrose.
7. Gamborg’s Vitamin Solution.
8. KOH.
9. Distilled water.
10. Magnetic stirring device.
11. Beakers (1 L).
12. Glass bottles (1 L).
13. pH meter.
14. Microcentrifuge tubes.
15. Disposable Pasteur pipettes.
16. Pipetman and pipette tips.
17. Household bleach (5.25 % w/v sodium hypochlorite).
18. Tween® 20.
Handling Arabidopsis Plants
5
19. Labeling tape or printable labels.
20. Permanent marker.
21. 3M Micropore surgical paper tape.
22. Thiamine hydrochloride, plant cell culture tested.
23. Double distilled water (ddH2O).
24. 2,4-Dichlorophenoxyacetic acid (2,4-D), plant cell culture
tested, >98 %.
25. Ethanol, absolute, 200 proof, for molecular biology.
26. 0.45-μm filter sterilization unit.
27. Myoinositol, plant cell culture tested.
28. KH2PO4.
29. NaOH.
30. Soil mix, e.g., Sunshine® LC1 mix (Sun Gro Horticulture,
www.sungro.com) or other peat moss-based potting mix.
31. Fertilizer in slow release pellets, e.g., Osmocote® 14-14-14
(Hummert™ International, www.Hummert.com).
32. Plastic pots with holes in the bottom (e.g., 11 cm diameter, 5.5 cm
square) or plastic flats (e.g., 26 cm × 53 cm) with clear domes.
33. Trowel or large spoon.
34. 70 mm filter paper.
35. Pest Trap™ colored sticky cards (Hummert™ International).
36. Enstar® II (Hummert™ International).
37. Conserve® SC (Hummert™ International).
38. Marathon® 1G, granular systemic insecticide (Hummert™
International).
39. Sulfur vaporizer and bulk pelleted sulfur (HID Hut Inc., www.hidhut.com).
40. Spor-Klenz® Ready-To-Use Cold Sterilant (Steris, www.steris.com).
41. Tornado™/Flex cold fog ULV mist sprayer (Curtis Dyna-Fog
Ltd., www.dynafog.com).
42. Plastic transparent floral sleeves, e.g., straight sleeve BOPP
60 × 40 × 15 cm (www.zwapak.com) for 11-cm-diameter pots,
or other devices for plant isolation such as Aracons™ (Lehle
Seeds and Arasystem, www.arasystem.com) or lightweight
plastic bags (4–8 L).
43. Hand sieve, e.g., US Standard Stainless Steel Test Sieve No. 40
(Fisher Scientific).
44. Small manila envelopes (e.g., 6 cm × 9 cm) or small glass jars
(125 mL) or other containers.
2.2 Control of Environmental Growth Conditions for Optimal Plant Growth
1. Data loggers, e.g., HOBO® U14 LCD (Onset, www.onsetcomp.com).
2.3 Preparation of Seeds for Short- and Long-Term Storage
1. 2-mL polypropylene cryovials with threaded lids and gaskets
(e.g., screw cap micro tubes, manufactured by Sarstedt Inc.,
available from Fisher Scientific) or other sealed containers for
permanent seed storage.
2. Permanent marker or printed labels.
2.4 Seed Quality Control
1. Dissecting microscope or magnifying lenses.
2. Plastic Petri plates (10 cm diameter) or other similar
containers.
3. Absorbent paper, e.g., filter paper 10 cm diameter.
4. Permanent marker or printed labels.
5. Distilled water.
6. Parafilm or tape.
2.5 Genetic Crosses
1. DV-30 Precision Swiss clamping tweezers (Lehle Seeds).
2. Optical glass binocular magnifier, e.g., OptiVISOR® (Donegan
Optical Company, www.doneganoptical.com) or dissecting
microscope.
3. 1.5-mL microcentrifuge tubes.
4. Small scissors.
5. Laboratory tape in various colors and permanent marker.
2.6 Transformation of Arabidopsis with Agrobacterium tumefaciens
1. Agrobacterium transformed with a construct of interest.
2. LB medium.
3. Selection antibiotics.
4. Sucrose.
5. Silwet L-77®.
3 Methods
3.1 Growth of Arabidopsis Plants and Cultures
3.1.1 Growth of Plants in Sterile Conditions on Solid Media
Growth of Arabidopsis in experimental settings such as selection of
drug-resistant and transformed plants, examination of early root
and shoot phenotypes, and identification of homozygous lethal
mutants is typically conducted in sterile conditions on solid media.
Liquid bleach sterilization, described here, is a practical method to
sterilize a few seed lines at a time. Larger numbers of lines can be
sterilized easily and with less manipulation using chlorine gas.
Chlorine gas can also be utilized for seeds infested with powdery
mildew or other fungal diseases. Various containers such as Petri
plates, Magenta® boxes, or culture tubes are used, depending on
the purpose of the experiment. This section describes the use of the
most commonly employed medium for sterile growth conditions in
Petri plates (1× MS agar media). Adaptation to other sterile formats
is straightforward, and most experimental additives can be easily
incorporated in the preparation.
1. Add 4.31 g of MS basal salt mixture [2] and 0.5 g of MES to
a beaker containing 0.8 L of distilled water and stir to dissolve.
Add distilled water to final volume of 1 L. Check and adjust
pH to 5.7 using 1 M KOH.
2. Divide the media into two 1 L bottles, 500 mL in each. Add
5 g of agar per bottle. Keep the lid loose.
3. Autoclave for 20 min at 121 °C, 15 psi with a magnetic stir bar
in the bottle.
4. Place the bottles on a stir plate at low speed and allow the agar
medium to cool to 45–50 °C (until the container can be held
with bare hands).
5. Starting from this step, perform all the steps in sterile conditions
in a laminar flow hood. Add (optional) 1–2 % sucrose and
1 mL Gamborg’s Vitamin Solution, stirring to evenly dissolve
(see Notes 1 and 2).
6. Label the bottom of Petri plates with identification number or
name, including the date.
7. Pour enough media into plates to cover approximately half of
the depth of the plate.
8. Allow the plates to cool at room temperature for about an
hour to allow the agar to solidify. If the plates are not to be
used immediately, wrap them in plastic and store at 4 °C
(refrigerator temperature) (see Note 3).
9. Surface-sterilize seeds in microcentrifuge tubes by soaking for
20 min in 50 % bleach with the addition of 0.05 % Tween® 20
detergent.
10. Remove all bleach residue by rinsing five to seven times with
sterile distilled water.
11. For planting of individual seeds at low density, adhere one seed
to the tip of a pipette using suction, then release seed onto the
agar in desired location. For planting seeds at higher densities,
mix seeds in sterile distilled water (or 0.1 % cooled top agar),
pour onto plate, and immediately swirl to achieve even distribution. Use a sterile pipette tip to adjust the distribution and
remove excess water. Allow the water or top agar to dry slightly
before placing lid onto plate.
12. Seal with Micropore tape to prevent desiccation, while allowing
slight aeration.
13. Place the plates at 4 °C for 3 days (see Notes 4 and 5).
14. Transfer the plates to the growth environment. Illumination of
120–150 μmol m⁻² s⁻¹ of continuous light and a temperature of
22–23 °C are suitable growth conditions (see Notes 6–8).
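The dry-component amounts in steps 1 and 2 scale linearly with batch volume. The following is a minimal sketch of that arithmetic, not part of the published protocol. The function name `ms_agar_recipe` is hypothetical, and treating the optional sucrose percentage as weight per volume is an assumption. Gamborg's Vitamin Solution is left out because the text does not state whether the 1 mL refers to a bottle or to a liter.

```python
# Per-litre amounts taken from steps 1 and 2 of the protocol above.
MS_SALTS_G_PER_L = 4.31
MES_G_PER_L = 0.5
AGAR_G_PER_L = 10.0  # 5 g of agar per 500 mL bottle


def ms_agar_recipe(volume_l, sucrose_percent=0.0):
    """Return grams of each dry component for `volume_l` litres of 1x MS agar.

    Assumption: the optional 1-2 % sucrose is interpreted as w/v,
    i.e. 1 % corresponds to 10 g per litre.
    """
    if volume_l <= 0:
        raise ValueError("volume_l must be positive")
    if not 0.0 <= sucrose_percent <= 2.0:
        raise ValueError("the protocol uses 0-2 % sucrose")
    return {
        "MS basal salts (g)": round(MS_SALTS_G_PER_L * volume_l, 3),
        "MES (g)": round(MES_G_PER_L * volume_l, 3),
        "agar (g)": round(AGAR_G_PER_L * volume_l, 3),
        "sucrose (g)": round(sucrose_percent * 10.0 * volume_l, 3),
    }


if __name__ == "__main__":
    # One 500 mL bottle with 1 % sucrose.
    for name, grams in ms_agar_recipe(0.5, sucrose_percent=1.0).items():
        print(f"{name}: {grams}")
```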
3.1.2 Growth of Plants in Sterile Conditions in Liquid Media
Seedlings of Arabidopsis can also be grown in liquid growth media.
This method provides large amounts of plant tissue suitable for
proteomics and metabolomics or any study that requires a larger
amount of starting material. Liquid culture growth is also widely
used for high-throughput genomic studies. In this case, growth
protocols are adapted to 96-deep-well plates (or other formats)
with the MS media supplemented by gibberellic acid.
1. Prepare MS media, as described in Subheading 3.1.1. Do not
add agar.
2. After the media has been autoclaved and cooled to room temperature, distribute 75–100-mL MS media into previously
sterilized 250-mL Erlenmeyer flasks in a laminar flow hood.
3. Add bleach- or chlorine gas-sterilized seeds to the media (add
up to 10 μL of seeds to each flask, which corresponds to
approximately 250 seeds).
4. Grow seedlings under continuous light (120–150 μmol m⁻² s⁻¹)
with gentle rotation in an orbital shaker at 120 rpm for up to
2 weeks.
5. Remove seedlings from the flask. Growth of more than 200–
250 seedlings for more than 2 weeks may result in difficulty
removing plant material from the flask.
6. Remove excess media from the seedlings using filter paper.
Plant material is now ready for downstream applications.
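When a given number of seedlings is needed, the limits in steps 2 and 3 (at most 10 μL of seeds, roughly 250 seeds, and 75–100 mL of media per flask) determine how many flasks to prepare. The sketch below illustrates that calculation under the assumption that seeds are split evenly among flasks. The function name `plan_liquid_cultures` is hypothetical.

```python
import math

SEEDS_PER_UL = 25            # 10 uL of seeds is approximately 250 seeds
MAX_SEED_UL_PER_FLASK = 10   # upper limit given in step 3
MEDIA_ML_PER_FLASK = (75, 100)  # range given in step 2


def plan_liquid_cultures(total_seeds):
    """Estimate flask count, seed volume per flask, and total media range."""
    if total_seeds <= 0:
        raise ValueError("total_seeds must be positive")
    max_seeds_per_flask = SEEDS_PER_UL * MAX_SEED_UL_PER_FLASK
    flasks = math.ceil(total_seeds / max_seeds_per_flask)
    # Assumption: seeds are divided evenly among the flasks.
    seed_ul_per_flask = total_seeds / flasks / SEEDS_PER_UL
    low, high = MEDIA_ML_PER_FLASK
    return {
        "flasks": flasks,
        "seed_volume_ul_per_flask": round(seed_ul_per_flask, 1),
        "media_ml_total": (flasks * low, flasks * high),
    }


if __name__ == "__main__":
    print(plan_liquid_cultures(600))
```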
3.1.3 Growth of Arabidopsis Cells in Culture
Cell suspension cultures represent a source of nearly uniform cell
material for functional genomics and biochemical, physiological,
and metabolomic studies that can be performed under tightly
controlled environmental conditions. Several cell cultures derived
from Arabidopsis tissue explants have been described. Among
these, T87 and MM1/MM2d have been most widely used. The
T87 cell line originates from the Columbia accession seedlings and
can photosynthesize in light [3]. It has been utilized to analyze
gene expression changes under stress conditions, hormone signaling pathways, the circadian clock, and plant cell wall biosynthesis
[4–6]. Transient and stable transformation protocols for this line
have also been established [4, 7]. Unlike T87, MM1 (light grown)
and MM2d (dark grown) cell lines, derived from Landsberg erecta
accession, are synchronous and can therefore be used for cell-cycle
studies [8]. Due to limited space, only the protocol describing
maintenance of T87 cell culture will be described here.
1. Prepare 10 mg/mL thiamine stock solution by dissolving 0.1 g
of thiamine in 10 mL of ddH2O. Filter-sterilize, aliquot 1 mL
into microcentrifuge tubes, and store at −20 °C.
2. Prepare 2 mg/mL 2,4-D stock solution by dissolving 0.2 g of 2,4-D in
100 mL of 25 % ethanol. Filter-sterilize, aliquot 1 mL into
microcentrifuge tubes, and store at −20 °C.
3. Prepare 1 L of NT-1 media by adding 4.3 g of MS salt mixture,
30 g sucrose, 0.18 g KH2PO4, 100 μL of 10 mg/mL thiamine
stock, 220 μL of 2-mg/mL 2,4-D stock, and 100 mg myoinositol to a bottle containing 0.8 L of ddH2O and stir to dissolve
(see Note 9).
4. Adjust the pH to 5.8 using 5 M NaOH. Add ddH2O to final
volume of 1 L.
5. Distribute 75-mL media into 250-mL Erlenmeyer flasks.
Cover flasks with aluminum foil (see Note 10).
6. Autoclave for 20 min. Let the media cool to room temperature.
7. In a laminar flow hood, transfer 3 mL of 1-week-old T87 cell
suspension culture into a flask containing 75 mL of NT-1
medium (see Notes 11 and 12).
8. Grow the culture at 24 °C under continuous light (40–100 μmol m⁻² s⁻¹)
with gentle rotation in an orbital shaker at
120 rpm.
9. Subculture weekly by transferring cells into fresh NT-1 media,
as described in step 7 (see Note 13).
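The stock and working concentrations in this protocol can be cross-checked with simple dilution arithmetic. The sketch below does so using only the quantities stated in the steps above. The helper names are hypothetical, and the volume of the added stock is neglected relative to the 1 L final volume.

```python
def stock_mg_per_ml(mass_g, volume_ml):
    """Concentration of a stock solution in mg/mL."""
    return mass_g * 1000.0 / volume_ml


def final_mg_per_l(stock_conc_mg_per_ml, added_ul, final_volume_l=1.0):
    """Final concentration (mg/L) after adding `added_ul` of stock."""
    return stock_conc_mg_per_ml * (added_ul / 1000.0) / final_volume_l


def inoculum_fraction(inoculum_ml, fresh_media_ml):
    """Fraction of the new culture volume that comes from the old culture."""
    return inoculum_ml / (inoculum_ml + fresh_media_ml)


if __name__ == "__main__":
    thiamine_stock = stock_mg_per_ml(0.1, 10)    # 0.1 g in 10 mL
    two_4_d_stock = stock_mg_per_ml(0.2, 100)    # 0.2 g in 100 mL
    print("thiamine stock (mg/mL):", thiamine_stock)
    print("2,4-D stock (mg/mL):", two_4_d_stock)
    print("thiamine in NT-1 (mg/L):", final_mg_per_l(thiamine_stock, 100))
    print("2,4-D in NT-1 (mg/L):", final_mg_per_l(two_4_d_stock, 220))
    print("inoculum fraction:", round(inoculum_fraction(3, 75), 4))
```

Under these figures the stocks work out to 10 mg/mL thiamine and 2 mg/mL 2,4-D, matching the concentrations named in step 3, and the weekly subculture corresponds to a 1:26 dilution of the old culture.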
3.1.4 Planting Arabidopsis Seeds on Soil
Diverse mixes and media can be used for growing Arabidopsis.
The term “soil” will be used here for any mix or media utilized for
non-sterile growth of plants in pots or similar containers.
Commercial potting mixes are popular with Arabidopsis researchers due to their convenience and reliability. Potting media often
employ peat moss for moisture retention and perlite for aeration.
Mixes such as Sunshine® LC1 support healthy Arabidopsis growth
and include a starter nutrient charge, so that fertilization is not
necessary in early growth phases. Seeds can be planted by various
methods (see Note 14). Soil can be autoclaved to eliminate pests,
but this is usually not necessary. Preparation of soil for planting in
pots can be accomplished as follows:
1. Place soil in a clean container. Add Osmocote® 14-14-14
fertilizer (see Note 15). Wet thoroughly with tap water and
mix well with trowel, large spoon, or hands.
2. Label pots or trays with the stock number or name and date of
planting (see Note 16).
3. Place soil loosely in pots or other containers and level, without
compressing, to generate a uniform and soft bed. Pots are then
ready for planting (see Note 17).
4. When planting many seeds in a pot, scatter them carefully from
a folded piece of 70-mm filter or other paper; distribute them
evenly onto the surface of the soil (see Note 18). When planting individual seeds, adhere one seed to the tip of a pipette
using suction, then release onto the soil. Planted seeds should
not be covered with additional soil, since Arabidopsis seeds
require light for germination.
5. Place pot(s) in a tray, flat, or other container.
6. Cover with a plastic dome or with clear plastic wrap taped to
the container (see Note 19).
7. Place pots at 4 °C for 3 days (see Note 4).
8. Transfer pots into the growth area.
9. Remove plastic dome or wrap for growth in the greenhouse,
but leave them on until germinated seedlings are visible for
plants grown in a growth chamber.
3.2 Growth Conditions
The growth and development of Arabidopsis, including flowering
time, is influenced by a number of environmental conditions in
addition to the genetic background. Seeds of most lines germinate
3–5 days after planting under continuous light, 23 °C, adequate
watering, and good nutrition. Plants produce their first flowers
within 4–5 weeks, and seeds can be harvested 8–10 weeks after
planting. High-quality seeds can be produced if watering, light,
and temperature are carefully controlled.
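For planning purposes, the typical timings quoted above can be turned into calendar dates. The sketch below is illustrative only. It assumes that the quoted intervals are counted from the day the pots leave the 3-day cold treatment, which the text does not state explicitly, and the function name `growth_timeline` is hypothetical.

```python
from datetime import date, timedelta


def growth_timeline(sowing, stratification_days=3):
    """Return expected date windows for key stages after sowing.

    Assumption: germination, flowering, and harvest intervals are counted
    from the transfer to the growth area, after cold stratification.
    """
    start = sowing + timedelta(days=stratification_days)
    return {
        "transfer to growth area": start,
        "germination": (start + timedelta(days=3), start + timedelta(days=5)),
        "first flowers": (start + timedelta(weeks=4), start + timedelta(weeks=5)),
        "seed harvest": (start + timedelta(weeks=8), start + timedelta(weeks=10)),
    }


if __name__ == "__main__":
    for stage, when in growth_timeline(date(2024, 1, 1)).items():
        print(stage, when)
```

Actual timing depends on the line and on growth conditions, so these windows are best treated as rough scheduling aids.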
For vigorous plant growth, the optimum light intensity is 120–150 μmol m⁻² s⁻¹
(see Notes 6 and 7) and the optimum temperature
is 22–23 °C (see Notes 8 and 20). Water requirement is strongly
influenced by relative humidity. Plants tolerate low (20–30 %) relative humidity well, but depletion of soil moisture may occur in
these conditions. Plant sterility may result from very high (>90 %)
relative humidity. Mild humidity (50–60 %) is considered optimal
for plant growth; however, low humidity (<50 %) is recommended
for silique maturation.
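The optimal ranges given in this paragraph lend themselves to a simple automated check of recorded conditions. The following sketch compares one set of readings against those ranges. The dictionary keys and the function name `check_conditions` are hypothetical, and the check covers vegetative growth only (it does not apply the lower humidity recommended for silique maturation).

```python
# Optimal ranges for vigorous growth, as stated in the text above.
OPTIMAL = {
    "light_umol_m2_s": (120, 150),
    "temperature_c": (22, 23),
    "relative_humidity_pct": (50, 60),
}


def check_conditions(reading):
    """Return a list of human-readable issues for one set of readings."""
    issues = []
    for key, (low, high) in OPTIMAL.items():
        value = reading[key]
        if value < low:
            issues.append(f"{key} below optimum ({value} < {low})")
        elif value > high:
            issues.append(f"{key} above optimum ({value} > {high})")
    # Very high humidity is singled out in the text as a sterility risk.
    if reading["relative_humidity_pct"] > 90:
        issues.append("relative humidity above 90 %: risk of plant sterility")
    return issues


if __name__ == "__main__":
    sample = {"light_umol_m2_s": 130, "temperature_c": 25,
              "relative_humidity_pct": 55}
    print(check_conditions(sample))
```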
The following growth practices are useful for handling plants in
any growth context (greenhouse, growth chamber, or growth room):
1. Add water to trays containing pots with perforated bottoms.
2. Maintain approximately 2 cm of water around base of pots
during germination, to avoid any soil drying before the first
true leaves begin expanding.
3. Reduce the watering frequency to as low as once or twice per
week as needed after plants have developed true leaves and
until the plants flower, to avoid water stress, but allow proper
drainage of the soil (see Note 21).
4. Water daily during silique filling stage for good seed production.
The water requirement of plants increases dramatically during
this stage.
5. Keep plants spaced apart with good air circulation to prevent
the incidence of powdery mildew.
6. Place several yellow or blue sticky cards (e.g., Pest Trap™) in
the growth area to monitor insect populations. Inspect cards
and plants daily for pests. Change cards periodically to better
judge the pest populations and especially after a pesticide
application.
7. Prevent the introduction and spread of pests, which can be
transported to the growth area via the soil, seeds, plants, or by
humans. Wear a lab coat especially assigned to the growth area,
since insects and pathogens can readily be transported on
clothing. Plan to have plants of similar age in the growth area,
since mature plants are more susceptible to pests than very
young plants. Any person who has been in infested growth
areas should subsequently abstain from entering noninfested
areas; when entering multiple areas, entries should be from the
cleanest to the more infested. Keep the area clean and regularly
sweep the floors and/or shelves to eliminate or reduce potential sources of pest outbreaks. Mature and dry plants should be
harvested and old soil and nonviable dry plant debris should be
discarded immediately.
8. Avoid infestation of pests like thrips, aphids, fungus gnats, and
white flies by spraying plants with a preventive mixture of Enstar®
II and Conserve® SC. The insecticide mixture is prepared by adding
1.2 mL of each to 12 L of water. This mix can be sprayed lightly
on rosettes prior to bolting stage, before placement of any isolation devices (see Subheading 3.3.1). Marathon® 1G, a granular
insecticide, can also be applied as directed by the label to control
aphids, fungus gnat larvae, white flies, psyllids, and thrips (see
Notes 22 and 23).
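The preventive spray in step 8 is specified for a 12 L batch (1.2 mL of each product), which corresponds to 0.1 mL of each per liter of water. A minimal sketch for scaling the batch follows. The function name `spray_mix` is hypothetical, and the calculation assumes the same ratio holds at other batch sizes, so the product labels remain the reference for permitted rates.

```python
# Ratio from step 8: 1.2 mL of each product in 12 L of water.
ML_PER_LITRE = 1.2 / 12.0


def spray_mix(water_l):
    """Return the volume (mL) of each product for `water_l` litres of water."""
    if water_l <= 0:
        raise ValueError("water_l must be positive")
    dose_ml = round(ML_PER_LITRE * water_l, 3)
    return {"Enstar II (mL)": dose_ml, "Conserve SC (mL)": dose_ml}


if __name__ == "__main__":
    print(spray_mix(3))
```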
3.2.1 Maintenance of Plants in Greenhouses
Greenhouses with satisfactory cooling, heating, and supplemental
light are suitable for large-scale growth of lines that do not require
strict control of environmental conditions, which include most
natural accessions (e.g., Col, Ler, Cvi, Ws, Est, Kas, Sha, Kondara,
C24) as well as species related to A. thaliana. However, conditions
are often too hot in temperate climates for Arabidopsis growth in
greenhouses during the summer. Successful plant growth should
start with an empty room, cleaned and maintained as follows
(see Note 24):
1. Remove and properly discard all plants and other materials in
the room.
2. Sweep and hose down the entire room interior (benches,
floors, window ledges, and windows).
3. Increase the temperature in the room to 40 °C for 3–5 days.
The temperature setting may be higher, depending on the
outside environmental conditions and equipment specification.
Lights, fans, and cooling pads should be turned off and vents
closed during this period.
4. Do not place diseased or older plants in the clean room after
the high temperature treatment.
5. Provide supplemental evening and morning light during the
winter, since the plants generally require a long photoperiod
(at least 12 h) for flowering. In the greenhouse, 16-h photoperiods are typically employed (see Notes 6 and 7).
6. Use shade cloth during the summer, which helps reduce light
intensity and regulate temperature.
7. The recommended growth temperature in the greenhouse is
21–23 °C (see Note 8). Night temperatures should be maintained 2–4 °C lower than the day temperature.
3.2.2 Maintenance of Plants in Growth Chambers and Growth Rooms
Most of the commercial growth chambers precisely control light
intensity, photoperiod, temperature (typically ±1 °C), and often
humidity. Custom plant growth rooms provide environmental
control similar to that of reach-in chambers. Standard architectural
rooms, equipped with supplemental lighting and air conditioning,
are popular for reproducing Arabidopsis economically. Such rooms
must be designed with sufficient light, cooling, and ventilation,
but typically afford less rigorous control of growth conditions than
custom chambers. Such facilities usually allow better control of
temperature and light than is offered by a greenhouse, hence their
popularity among Arabidopsis researchers. Growth rooms can be
maintained within 2–3 °C of a set point, while greenhouse temperatures may spike to higher deviations with rapid changes in sunlight, unexpected hot days, etc. As is the case for greenhouses, it is
imperative to start a new planting in a growth facility that has been
previously emptied and properly cleaned. This minimizes both the use of chemicals to control pests and the loss of plants to pest infestation.
1. Remove and discard all plant residues and related materials
(see Note 24).
2. Sweep and wipe down the interior with wet paper towel.
3. Make sure the intake and exhaust vents are closed.
4. Apply a sterilizing agent, such as Spor-Klenz®, to kill fungal
spores if heavy infestation of powdery mildew was present,
using a fogger tank (e.g., Tornado™/Flex cold fog ULV mist
sprayer) through an external access port of the chamber
(see Note 25).
5. Leave chamber undisturbed overnight, and wipe down the
inside of the growth chamber with a wet paper towel the
next day.
6. Increase the temperature to 40–45 °C for a period of 3–5 days
to eradicate/minimize pests.
7. Do not place diseased or older plants in the cleaned chamber.
8. Use continuous light or a long-day photoperiod if you wish to
accelerate the reproductive cycle. Short days (less than 12 h)
favor growth of vegetative tissue and delay flowering.
3.2.3 Monitoring the Environmental Growth Conditions
The environmental control systems currently offered for greenhouses, growth chambers, and growth rooms allow for remote
monitoring, control adjustment, and alarm notification via Internet
connections. These features represent a vital tool for avoiding loss
of data during plant production and maintaining control of environmental experiments. Installation of remote sensing is recommended for new growth facilities of all types.
In addition to the control and logging systems in place at the
growth facilities, environmental growth conditions can be
monitored by placing portable data loggers (e.g., the HOBO®
U14) in growth areas. They can act as a complementary, backup,
or sole resource for recording environmental data. They can be
used to display and record temperature and relative humidity
conditions in greenhouses, growth chambers, growth rooms, cold
rooms, dry rooms, and laboratories. These data loggers offer reliability, accuracy, convenient monitoring, and documentation of
specific environmental conditions. They can be connected to a
computer to quickly display and analyze data.
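Once logger data have been exported, temperature excursions can be flagged automatically. The sketch below assumes a generic CSV export with hypothetical column names (`timestamp`, `temperature_c`, `relative_humidity_pct`). It does not reproduce the file format of any particular logger, and the sample rows are made-up illustrative values, not measurements.

```python
import csv
import io


def find_excursions(csv_text, set_point_c, tolerance_c):
    """Return (timestamp, temperature) pairs outside set point +/- tolerance."""
    reader = csv.DictReader(io.StringIO(csv_text))
    excursions = []
    for row in reader:
        temperature = float(row["temperature_c"])
        if abs(temperature - set_point_c) > tolerance_c:
            excursions.append((row["timestamp"], temperature))
    return excursions


# Illustrative data only; column names are assumptions, not a logger format.
SAMPLE = """timestamp,temperature_c,relative_humidity_pct
2024-01-01T00:00,22.4,55
2024-01-01T01:00,26.1,48
2024-01-01T02:00,21.9,57
"""

if __name__ == "__main__":
    # Growth rooms are described above as holding within 2-3 °C of set point.
    for stamp, temp in find_excursions(SAMPLE, set_point_c=22.5, tolerance_c=2.0):
        print(stamp, temp)
```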
3.3 Seed Handling
3.3.1 Plant Isolation, Harvesting, and Preparation for Storage
Prevention of cross-contamination among adjacent pots and avoiding
the loss of seeds due to shattering are equally important. Plants
must be isolated from their neighbors without compromising seed
quality. Various methods and devices exist to accomplish these
objectives, including Aracons™, plastic floral sleeves, plastic bags,
and isolation by space on the open bench. Details of each method
are described below:
1. Aracons™: Place Aracons™ over single plants soon after bolting.
2. Floral sleeves: Cut four equally spaced holes at the point where
the sleeve meets the top of the pot. This will increase aeration
and reduce water condensation that may encourage mold
growth. Place the sleeve on the pot near the time of bolting, so
that all plant inflorescences are maintained within the sleeve
(see Note 26). This method is very effective for achieving high
densities while maintaining productivity and purity of single
lines of different genetic backgrounds.
3. Plastic bags: If plastic bags are used, train inflorescences of
non-erecta lines into a 4–8-L transparent plastic bag before
siliques begin to brown. Bags should be kept open to avoid the
accumulation of moisture resulting from transpiration.
4. Open bench growth: Plants can be maintained on the open bench
for bulk seed production, keeping all lines separated by adequate
space. Avoid disturbance of maturing inflorescences. This method
is appropriate when growing natural accessions that are late
flowering and develop large and dense canopies (e.g., Sij-1,
Monte-1, Amel-1, Anholt-1, Appt-1, Bik-1, Bl-1, Do-0).
The simplest procedure is to wait until the entire inflorescence
has browned before harvesting. However, some siliques may shatter naturally and seed will be lost. Harvest seeds only after the soil
in pots or flats has been allowed to dry. It should be noted that
delays in harvesting following physiological maturation of the
plant result in seed deterioration, especially under nonoptimal
environmental conditions. Seeds from individual siliques can be
harvested after the fruits have turned completely yellow, if rapid
turnover is required. However, such seeds have high levels of germination inhibitors. Since formation and maturation of siliques
occur over time, early siliques can be harvested before later ones
mature. Harvest for each of the four isolation methods is as
follows:
1. Aracons™: Slide the plastic cylinder off, then cut off the
dry inflorescence above the cone device and place it in a threshing sieve.
2. Floral sleeves: While holding the pot, cut away and discard
plastic sleeve. Cut the dry inflorescences and place them in a
threshing sieve.
3. Plastic bags: Cut the entire plant off at its base. Shake the seeds
into the bag; inflorescences can be gently handpressed from
the outside, and the seeds will fall to the bottom of the bag.
Most of the dry inflorescences can be removed from the bag by
hand before seeds are sieved to separate them from chaff.
4. Open bench: Cut off the entire inflorescence at its base, and
carefully place into a 4–8 L or larger transparent plastic bag,
depending on the size of the bulk of plants.
The major factors influencing seed longevity are (1) genotype; (2) environmental conditions during seed maturation, harvesting, and seed handling; and (3) seed storage conditions.
Harvested seeds should be processed promptly (including threshing,
cleaning, drying, and packaging) and placed into storage.
Seeds should be threshed when the seed moisture content is
approximately 10 %, to minimize seed damage during threshing.
This seed moisture content will be reached when all plant material
appears to be dry. Hand, rather than machine threshing, is recommended mainly because threshing machines need rigorous cleaning between lines to avoid sample cross-contamination, require
very careful adjustment, and do not accommodate the variable size
of Arabidopsis seeds well. The hand method is performed as
follows:
1. Set a large, clean, white paper on a bench or table for collection
of the threshed seeds.
2. Place a clean threshing sieve on top of the paper.
3. Place dry plants directly onto the sieve. If plants are larger than
the sieve, they can be cut into pieces that fit the screen.
4. Crush plants using hands to remove all the seeds from siliques.
Discard plant material.
5. Sieve seeds through the mesh repeatedly until they are clean
and free of chaff. After sieving, the seeds are still likely to be
mixed with soil and plant residue. A combination of additional
sieving, gentle blowing, and visual inspection can be employed
to clean the seeds completely.
6. Clean small samples by hand with the aid of a pointed tool on
an opaque glass plate illuminated from below, if needed.
7. Place cleaned seed samples in small labeled manila envelopes or
open glass jars to allow seeds to air-dry. Do not use plastic due
to static effects.
The ideal moisture content of seeds for storage is 5–6 %.
Higher moisture content can cause seed deterioration. There are
many methods available for drying seeds. The recommended
method is to air-dry the seeds at room temperature and approximately 20 % relative humidity for 1–3 weeks (see Note 27). Low
relative humidity (20–30 %) is necessary for seeds to reach the
desired moisture content [9, 10]. Seed moisture content can be
determined by several methods [11]. Seed packaging for storage
can be accomplished as follows:
1. Use cryovials (with threaded lids and gaskets) for convenient
and safe storage. They hold large numbers of seeds, seal tightly,
are moisture proof, and can be resealed many times.
2. Label each vial with pertinent information including date of
storage.
3. Determine stored seed quantities (approximately 50 μL =
25 mg = 1,250 seeds).
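The equivalence in step 3 (approximately 50 μL = 25 mg = 1,250 seeds) implies roughly 25 seeds per microliter and 50 seeds per milligram. A minimal sketch for estimating seed numbers from either measurement is shown below. The function names are hypothetical, and the results are approximations because seed size varies among lines.

```python
# Conversion factors derived from: 50 uL = 25 mg = 1,250 seeds (approximate).
SEEDS_PER_UL = 1250 / 50   # 25 seeds per microlitre
SEEDS_PER_MG = 1250 / 25   # 50 seeds per milligram


def seeds_from_volume(volume_ul):
    """Approximate seed count for a measured seed volume in microlitres."""
    return round(volume_ul * SEEDS_PER_UL)


def seeds_from_mass(mass_mg):
    """Approximate seed count for a measured seed mass in milligrams."""
    return round(mass_mg * SEEDS_PER_MG)


if __name__ == "__main__":
    print(seeds_from_volume(200))
    print(seeds_from_mass(10))
```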
3.3.2 Seed Storage and Preservation
The general conditions for preserving optimal viability of seeds
have been well defined [9, 10, 12–14]. Seed storage principles for
Arabidopsis are similar to those for other plants, with the caveat
that the small seeds rehydrate very rapidly if exposed to high
humidity. When seeds deteriorate, they lose vigor and eventually
the ability to germinate. The rate of this “aging” is determined by
interactions of the temperature and moisture content at which
seeds are stored, and unknown cellular factors that affect the propensity for damage reactions [9].
Rapid deterioration of seeds has not been observed for the
diverse collections currently maintained at ABRC. However, experience regarding the effect of genotype is limited. A large number of
genes involved in embryogenesis, reserve accumulation, and seed
maturation have been identified. Conspicuously, seeds of the abscisic
acid-insensitive mutants fail to degrade chlorophyll during maturation and show no dormancy, leading to low desiccation tolerance and
poor longevity [15]. Arabidopsis seeds should retain high viability
for long storage periods, under proper conditions. With the
increase of storage temperature and seed moisture content, the life
span of the seeds decreases. Seeds left at room temperature and
ambient relative humidity lose viability within approximately
2 years. Seeds stored dry at 4 °C or −20 °C should last decades.
Below are three storage options for safe seed preservation:
1. For active collections which are accessed often, store seeds at
4 °C and 20–30 % relative humidity. Control of humidity is
typically achieved by a dehumidification system in the cold
room. Note that the control of relative humidity provides a
safety factor in case seed containers are not sealed properly.
2. For long-term or archival storage, the recommended temperature is subzero, preferably −20 °C and also preferably 20 %
relative humidity.
3. For open containers such as envelopes, seeds can be stored at
15–16 °C, with a relative humidity maintained very carefully at
15 %. Under this controlled environment, seeds will maintain
suitable low moisture content [16]. Storing seeds at relative
humidity <15 % will not increase shelf life and may actually
accelerate deterioration [10].
When vials are removed from cold storage, condensation of
moisture on the seeds and subsequent damage may occur. For vials
stored at 4 °C, sealed vials must always be warmed to room temperature before opening. For vials stored at −20 °C, rapid rewarming
(placing the sealed vial in a 37 °C water bath for 10 min) is a recognized method to minimize frost damage. If possible, working
with seed stocks should take place at low (20–30 %) relative humidity.
If accumulation of condensation is suspected, vials should be left
open in the dry room until seeds have equilibrated before returning
the vials to cold storage.
3.3.3 Seed Quality Control
The purity and physical integrity of seeds and the presence of pests
and seed-borne diseases (especially some fungal diseases) can be
detected by visual examination with the naked eye, magnifying
lenses, or using a dissecting microscope. For a rigorous assessment,
spread the seeds on white paper under a well-lit microscope.
Generally, gray or white coloration on the seed surface indicates fungal contamination. Discard seeds if possible; otherwise, sterilize
seeds with fungicides before planting. Do not discard shriveled,
small, irregular-shaped, and other colored seeds that might correspond to specific mutations, assuming that the seeds were produced
under optimal conditions.
Seed viability should be monitored at regular intervals by conducting germination tests under a standard set of conditions. It is
recommended that seeds in long-term storage under the optimal
preservation standards should be monitored at least every 10 years.
Seeds in short-term storage should be monitored at least every 5
years [12, 14].
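The monitoring intervals above can be turned into a simple due-date calculation. The following is a minimal Python sketch; the function name, the dictionary, and the storage labels are illustrative assumptions, and only the 10-year and 5-year intervals come from the text.

```python
from datetime import date

# Maximum interval (years) between germination tests, as stated in the text.
MONITORING_INTERVAL_YEARS = {"long-term": 10, "short-term": 5}


def next_viability_test(last_test: date, storage: str) -> date:
    """Return the latest date by which the next germination test is due."""
    years = MONITORING_INTERVAL_YEARS[storage]
    try:
        return last_test.replace(year=last_test.year + years)
    except ValueError:
        # last_test was 29 February and the target year is not a leap year.
        return last_test.replace(year=last_test.year + years, day=28)


print(next_viability_test(date(2014, 3, 1), "short-term"))  # 2019-03-01
```

Because the intervals are stated as "at least every" 5 or 10 years, the returned date is a deadline, not a target; testing earlier is always acceptable.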
A germination test for Arabidopsis can be conducted in 3–7
days to determine the proportion of seeds in a sample that will
produce normal seedlings. Tests should be carried out before seeds
are stored, so that poor quality samples can be recognized.
Arabidopsis seeds may fail to germinate because they are dormant
or because they are defective or nonviable. Dormant seeds can be
distinguished because they remain firm and in good condition,
while nonviable seeds soften and are attacked by fungi. Extending
stratification can usually break dormancy (see Note 4).
Initial germination rate should exceed 80 %, but may be lower
for some lines. Mutations in a significant number of genes, mostly
involved in biosynthesis and signaling pathways of certain hormones, affect seed germination and/or dormancy. A germination
test can be performed as follows:
1. Label the bottom of a 10-cm-diameter Petri plate with name
and date.
2. Place two layers of filter paper in the bottom of the plate and
moisten with distilled water. Remove excess water.
3. Distribute 100 seeds evenly on the surface of the paper. Seal
the plate with Parafilm or clear tape, to prevent drying.
4. Stratify seeds by placing the plates at 4 °C for 3 days.
5. Move the plates to an illuminated shelf or to a growth chamber
under standard light and temperature conditions (see Note 28).
6. Record germination percentage after 3–7 days by dividing the
number of seedlings by the total number of seeds and multiplying by 100.
Germination tests can also be performed on solid media, such
as MS, described in Subheading 3.1.1.
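The calculation in step 6 can be sketched in a few lines of Python. This is a minimal illustration, not part of the protocol: the function names are assumptions, the seed counts in the example are hypothetical, and the 80 % threshold and the use of two replicates of 100 seeds follow the text and Note 28.

```python
def germination_percentage(seedlings: int, seeds_sown: int) -> float:
    """Number of seedlings divided by total seeds, multiplied by 100."""
    if seeds_sown <= 0:
        raise ValueError("seeds_sown must be positive")
    if not 0 <= seedlings <= seeds_sown:
        raise ValueError("seedlings must be between 0 and seeds_sown")
    return 100.0 * seedlings / seeds_sown


def summarize_test(replicates, threshold=80.0):
    """Return (mean germination %, needs_follow_up) for (seedlings, seeds) pairs."""
    percentages = [germination_percentage(s, n) for s, n in replicates]
    mean = sum(percentages) / len(percentages)
    return mean, mean < threshold


# Hypothetical counts for two replicates of 100 seeds each.
print(summarize_test([(92, 100), (88, 100)]))  # (90.0, False)
```

A result flagged for follow-up does not by itself mean the seeds are nonviable; dormancy or a germination mutation may be responsible, as discussed above.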
3.4 Genetic Crosses
Some species of Arabidopsis, particularly A. thaliana, are mostly
self-pollinating, especially in a growth chamber or greenhouse
setting where insect populations are minimized [17]. It should be
noted that the pollen of Arabidopsis does not disperse through the
air. Therefore, crossing Arabidopsis is mainly conducted through
manual emasculation of flowers just prior to flower opening,
followed by hand transfer of pollen from the desired male parent
to the stigma of the emasculated flower. Although labor intensive,
the manual method remains a reliable technique for achieving
cross-pollination.
Species, such as Arabidopsis halleri and Arabidopsis lyrata, have
natural self-incompatibility mechanisms, which prevent the plant
from self-pollinating and result in obligate outcrossing [18]. For
such species, simple maintenance of a genetic stock cannot easily
be accomplished from a single plant, and it is most convenient to
start with a small population of founders and perform cross-pollination. The manual techniques for performing genetic crosses
of A. thaliana can be generalized to the related species. The use of a
magnifying visor or dissecting microscope is recommended to visualize floral parts and avoid damage to the pistil. Genetic crosses can
be performed as follows:
1. Select the appropriate parent plants. Choose young plants at
early stages of flowering. Avoid using the first flowers in the
inflorescence, which are usually less fertile, and the smaller
flowers produced by mature plants [19] (see Note 29).
2. Prepare the female parent:
(a) Select a stem with two to three flower buds, in which the
tips of the petals are barely visible and before the anthers
begin to deposit pollen on the stigma (see Note 30).
(b) Remove siliques, leaves, and any open flowers above and
below the selected buds on the chosen stem with a small
pair of scissors; avoid damaging the stem.
(c) Remove the sepals, petals, and all six stamens from the
selected flower buds using the precision clamping tweezers,
leaving the pistil intact (see Note 31).
3. Prepare the male parent: Select a newly opened flower with
anthers that are dehiscent. These flowers will contain fresh
pollen that will contribute to the success of the cross. Remove
the flower by squeezing near the pedicel with tweezers.
4. Pollinate the female parent by taking the fully open flower
from the male parent and brushing the anthers over the bare
stigma of the female parent. Visually confirm that pollen has
been deposited on the stigma.
5. Label the crosses, placing tape on the stem of the female plant,
noting the male and female parent and the date of the cross.
6. Inspect developing siliques over the next several days. Successful
crosses are visible after 3 days when the siliques start elongating.
Siliques are ready for harvest once they turn brown, but before
they shatter (see Note 32).
7. Harvest siliques by cutting them with scissors and placing
them into a microcentrifuge tube or a small paper envelope.
8. Air-dry seeds at room temperature, preferably at 20–30 % relative
humidity, for 1–3 weeks. Thresh seeds if necessary.
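Record keeping for steps 5 and 6 can be sketched as a small data structure. This is a hedged Python example: the class, its field names, and the label format are assumptions made for illustration; the 3-day check comes from step 6 and the 2 to 3 week harvest window from Note 32.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Tuple


@dataclass(frozen=True)
class Cross:
    female: str       # female (emasculated) parent
    male: str         # male (pollen donor) parent
    crossed_on: date  # date of pollination

    def label(self) -> str:
        """Text for the tape label placed on the stem of the female plant."""
        return f"{self.female} (F) x {self.male} (M) {self.crossed_on.isoformat()}"

    def first_check(self) -> date:
        """Successful crosses are visible after about 3 days."""
        return self.crossed_on + timedelta(days=3)

    def harvest_window(self) -> Tuple[date, date]:
        """Siliques are usually ready about 2-3 weeks after the cross."""
        return (self.crossed_on + timedelta(weeks=2),
                self.crossed_on + timedelta(weeks=3))


cross = Cross("Col-0", "Ler-0", date(2024, 5, 6))
print(cross.label())  # Col-0 (F) x Ler-0 (M) 2024-05-06
```

The harvest window is only a planning aid; siliques should still be inspected and collected once they turn brown but before they shatter.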
3.5 Floral Dip Transformation of Arabidopsis with Agrobacterium tumefaciens
The development of simple and highly efficient stable transformation protocols, without a need for plant regeneration in tissue culture, represented one of the milestones that enabled Arabidopsis to
become the model that it is today. Transformation of germinating
seeds with Agrobacterium tumefaciens represented the first breakthrough in this effort [20]. It was followed by a “vacuum infiltration” method, in which Agrobacterium was used to infect uprooted
flowering plants [21]. This protocol was simplified and streamlined
a few years later and became known as the “floral dip” method
[22]. In this method, the need for vacuum infiltration was
replaced by the use of Silwet L-77®, a surfactant that aids the
entry of bacteria into plant tissues. The use of this protocol revolutionized the field of Arabidopsis functional genomics, by enabling
high-throughput generation of T-DNA mutants and other
resources that show stable inheritance of the mutations and other
modifications caused by transformation events. Although other
methods are still in use in specific cases (e.g., transformation of
root explants for transforming sterile mutants [23] or vacuum
infiltration for Ler-0 [21]), floral dip has become the most widely
used protocol in most of the research labs and for most of the natural accessions (e.g., Col-0, Ws-0, Nd-0, No-0) and will be described
here in detail:
1. Grow plants in pots as described in Subheading 3.2 under
long-day conditions until bolting (see Note 33).
2. Remove the first inflorescence stems that bolt to induce growth
of secondary shoot inflorescences. Plants will be ready for dipping in 5–7 days.
3. Prepare the starter culture of Agrobacterium carrying the construct of interest, by growing the 5-mL culture in LB medium
supplemented with appropriate antibiotics at 28 °C for 2 days
(see Note 34). The culture should be started 2–4 days after the
first inflorescences have been removed (step 2).
4. Use 1 mL of the starter culture to inoculate 200 mL of LB
medium supplemented with appropriate antibiotics and grow
this large culture for 16–24 h, until the cell growth reaches the
stationary phase (see Notes 35 and 36).
5. Spin down Agrobacterium culture at 4,000 × g for 10 min at
room temperature and resuspend the pellet in 1–2 volumes
(200–400 mL) of 5 % sucrose solution (see Note 37).
6. Immediately before dipping, add the appropriate volume of
Silwet L-77® to the Agrobacterium cell suspension, to make a
final concentration of 0.02–0.05 %; pour the suspension in a
beaker.
7. Prepare the plants for dipping by removing the siliques that
have already been formed (see Note 38).
8. Dip the inflorescences for a few seconds by holding the pot
with one hand and gently bending the inflorescence shoots to
allow them to be completely submerged into the suspension
until a film of suspension can be observed on the plants.
9. Moisten paper towel and place it at the bottom of a tray. Lay the
pots on their sides in the tray and cover with a lid (see Note 39).
10. After 1 day, place the pots in their normal upright position and
continue growing the plants until they set seeds.
11. Screen the primary transformants on appropriate selection
plates.
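The reagent amounts in steps 5 and 6 follow from simple percentage arithmetic, sketched below in Python. The function names are illustrative, and two assumptions are made that the text does not state explicitly: the 5 % sucrose is taken as weight/volume (g per 100 mL) and the Silwet L-77 concentration as volume/volume.

```python
def sucrose_grams(volume_ml: float, percent_w_v: float = 5.0) -> float:
    """Grams of sucrose for a given volume, assuming % means g per 100 mL."""
    return volume_ml * percent_w_v / 100.0


def silwet_microliters(volume_ml: float, percent_v_v: float) -> float:
    """Microliters of Silwet L-77 for a given suspension volume.

    Assumes % means mL per 100 mL and rejects concentrations outside
    the 0.02-0.05 % range given in step 6.
    """
    if not 0.02 <= percent_v_v <= 0.05:
        raise ValueError("concentration outside the 0.02-0.05 % range")
    return volume_ml * percent_v_v / 100.0 * 1000.0


# 200 mL of suspension at the upper end of the range.
print(round(sucrose_grams(200), 2), "g sucrose")
print(round(silwet_microliters(200, 0.05), 1), "uL Silwet L-77")
```

Under these assumptions, 200 mL of suspension needs about 10 g of sucrose and about 100 µL of surfactant at 0.05 %.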
4 Notes
1. Optional sucrose and vitamins should be added after autoclaving and only after the agar media cools because vitamins are
thermolabile and 15–25 % of the sucrose may be hydrolyzed to
glucose and fructose at elevated temperatures [24].
2. Plants grow more vigorously and quickly on media containing
1–2 % of sucrose; however, fungal and bacterial contamination
must be rigorously avoided by seed sterilization. Note that
germination of some mutants might be delayed on sucrose-containing media.
3. Covered plates, boxes, or tubes with solidified agar can be
stored for several weeks at 4 °C in a container that prevents
desiccation.
4. Most widely used lines have moderate dormancy, and cold
treatment, also called stratification, may not be required for
germination when planting older seeds of these lines. However,
a cold treatment at 4 °C for 3 days will improve the rate and
synchrony of germination. The use of an extended cold treatment of approximately 7 days is especially important for freshly
harvested seeds, which have more pronounced dormancy. An
extended cold treatment is also necessary for certain natural
accessions (e.g., Dobra-1, Don-0, Altai-5, Anz-0, Cen-0,
WestKar-4). Cold treatment of dry seeds is usually not effective
in breaking dormancy.
5. Instead of stratification on plates, seeds suspended in sterile
water can also be stratified prior to planting on agar or soil
surface.
6. Optimum light intensity is in the range of 120–150 μmol m⁻² s⁻¹. Higher intensities may result in death of some seedlings,
but are tolerated by older plants; purpling of leaves is the first
symptom of high-light stress. Very low light intensities may
result in weak and chlorotic plants. Arabidopsis is a facultative
long-day plant. Plants flower rapidly under continuous light or
long-day (>12 h) photoperiods, while under short days
(<12 h), flowering is delayed, favoring vegetative growth.
Plants grow well under a cycle of 16-h light/8-h dark or under
continuous light.
7. Various light sources can be used for optimal plant growth,
such as cool-white fluorescent bulbs, incandescent bulbs, very
high-output (VHO) lamps, high-intensity discharge (HID)
lamps, and shaded sunlight. Cool-white fluorescent bulbs,
supplemented by incandescent lighting, are recommended in
growth chambers or growth rooms. HID lamps of 400–
1,000 W are conventional in greenhouses in temperate
climates to supplement the sunlight or prolong the natural
photoperiod.
8. The temperature range for Arabidopsis growth is 16–25 °C.
Lower temperatures are permissible, but higher temperatures
are not recommended, especially for germination through
early rosette development. Temperatures above 28 °C are
better tolerated by more mature plants (past early rosette
stage). In general, high temperatures result in a reduced number of leaves, flowers, and seeds. At lower temperatures, growth
is slow, favoring the vegetative phase, and flowering is delayed.
9. Thiamine and 2,4-D stock solutions must be added to the
media in a laminar flow hood to prevent contamination of
stock solutions.
10. Some investigators prefer to use the disposable 250-mL polycarbonate membrane vented-cap flasks that may provide better
aeration and result in a better cell growth.
11. Mix the culture well immediately before pipetting, since the
cells settle to the bottom of the flask shortly after the orbital
shaking has been stopped.
12. The density of cell culture is an important factor for its viability. Too high and too low density can cause cell death and the
cessation of cell division, respectively. Adjust the volume of
subcultured cells if necessary. If larger clumps of cells are
formed, pass the suspension through a sterile 1-mm stainless
steel sieve.
13. T87 cell culture can also be propagated and maintained as a
callus on solid NT-1 media (on plates or in Magenta® boxes to
avoid premature depletion of nutrients). Callus is subcultured
once a month by transferring a 1-mm piece to fresh media.
Note that growth of the callus after continued passage becomes
independent of cytokinin [25].
14. Square pots with a diameter of approximately 5.5 cm can be
used to grow one plant, 11-cm-diameter pots are suitable for
growing up to 60 plants, and rectangular flats that are
26 cm × 53 cm can accommodate as many as 200–600 plants
grown to maturity. Another option especially suitable for
genomic studies is 96-well insets. Higher densities, approximately 3,000 plants per 30 cm², can be used if plants are harvested at early stages.
15. Osmocote® 14-14-14 (14 % nitrogen, 14 % phosphate, 14 %
potassium) is an extended time-release fertilizer, feeding up to 3
months from planting. Apply in amounts according to the label.
Alternatively, nutrient solution can be used to wet the soil [26].
16. Always use clean growth supplies, especially new pots and trays
to avoid pest contamination.
17. Prepared pots can be stored in covered trays at 4 °C for several
days before planting, although pot preparation and planting
should be conducted on the same day if possible.
18. Various methods can be employed to plant seeds. The density of
plants varies with genetic circumstances and purpose of the
planting. High yields are achieved with 10–20 plants per 11-cm-diameter pot. Generally, low densities increase the yield/plant
and are suitable for pure lines. High densities reduce the yield/
plant, but are useful when it is necessary to maintain the genetic
representation in segregating populations.
19. The plastic wrap should not be allowed to contact the soil
surface and should be perforated to provide aeration. If clear
plastic domes are used, they should not be tightly sealed.
20. Some winter-annual natural accessions require a period of
cold to initiate flowering, a process known as vernalization
(e.g., Galdo-1, Monte-1, Cit-0, Dog-4, Istisu-1, Valsi-1, Mir-0, Tamm-2). Young rosettes (2–4 weeks old) of late-flowering
accessions should be placed at 4 °C for 4–7 weeks to accelerate
flowering.
21. Plants should not be overwatered to avoid development of
algae, fungi, fungus gnat larvae, and other pests that thrive on
overly wet soil. Algae can be manually scraped off and the soil
allowed to dry.
22. Preventive application of pesticides is very effective if local
regulations allow this and can avert heavy use of chemicals
after infestations have developed. Rotation of pesticides is recommended. Biological control agents can also be applied.
23. Marathon® 1G is effective as a preventive insecticide or as
treatment following infestation. It can be applied to the soil
surface or included in subirrigation watering regime, which
reduces damage to the plants.
24. Mature or diseased plants, plant debris, used soil, pots, and
other materials can shelter pathogen spores or insects from
former plantings. After removal of the pest and host materials
and the sterilization of the growth area, it is very improbable
that any pest or pathogen will survive.
25. Read and follow precautionary measures as suggested by the
manufacturer of the cold sterilant Spor-Klenz®.
26. Floral sleeves fit snugly around a pot, extend upward, and are
wider at the top allowing for expansion of the developing inflorescences. Sleeves made of biaxially oriented polypropylene
(BOPP) are very clear, maintain upright stiffness, and tear easily for harvesting. Fold down the tops of the sleeves about
2 cm to ensure they stay open and stable. If plants grow out
above the sleeves and are at high plant density, train the top of
the plants back down into the sleeve to avoid contamination.
27. The moisture content of Arabidopsis seeds stored in open containers corresponds to the room humidity. Arabidopsis seeds
behave in a similar way to crop seeds with similar chemical
composition [12, 13].
28. Environmental conditions for seed germination tests are the
same as for growing plants. Two replicates of 100 seeds each
provide reliable germination estimates. Cases in which observed
germination is <80 % may warrant follow-up testing.
29. Crosses may be performed throughout the duration of the
flowering time; however, the crosses will have a higher rate of
success during the earlier stages of flowering.
30. Using unopened flowers for the female parent is important in
order to avoid self-pollination. Shortly after this stage, stamen/pistil length ratio, as well as the timing of anther dehiscence, favors self-pollination and open flowers have most likely
been self-pollinated. All flower candidates for female crossing
should be examined for presence of released pollen prior to
their use in crossing.
31. If the pistil is damaged, it is highly unlikely that the cross will
be successful and the flower should not be used.
32. Siliques should be ready to harvest in about 2–3 weeks after
the cross. If siliques are brown, use care, as it is easy to lose all
seeds at this stage.
33. Plants can be either grown individually in 5-cm round pots or
up to the density of 10–15 plants per 11-cm square pot [27].
34. LB medium can be substituted with Yeast Extract Peptone
(YEP) medium to achieve higher Agrobacterium density [27].
35. The presence of the construct should be confirmed in the
starter culture (e.g., by PCR).
36. The rest of the starter culture can be stored at 4 °C for up to
1 month for future use [28].
37. One to two volumes of sucrose solution used to resuspend the
pellet corresponds to the original volume of the large
Agrobacterium culture. The final OD600 of the suspension for
dipping should be approximately 0.8.
38. Removing siliques will increase the transformation efficiency.
39. Some investigators prefer using another tray in place of a lid, to
avoid the exposure to light and the production of excessive
heat around the plants.
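The target density in Note 37 can be reached by dilution, as in the following minimal Python sketch. The function name is an assumption, and the calculation assumes that OD600 scales linearly with cell density, which holds only approximately and only within the linear range of the spectrophotometer.

```python
def sucrose_solution_to_add_ml(current_od600: float,
                               current_volume_ml: float,
                               target_od600: float = 0.8) -> float:
    """Volume of 5 % sucrose solution to add to reach the target OD600.

    Uses C1 * V1 = C2 * V2, treating OD600 as proportional to cell density.
    """
    if min(current_od600, current_volume_ml, target_od600) <= 0:
        raise ValueError("all inputs must be positive")
    if current_od600 < target_od600:
        raise ValueError("suspension is already below the target OD600")
    final_volume_ml = current_volume_ml * current_od600 / target_od600
    return final_volume_ml - current_volume_ml


# A pellet resuspended in 200 mL that reads OD600 = 1.6 (hypothetical reading).
print(round(sucrose_solution_to_add_ml(1.6, 200.0), 1))  # 200.0
```

In this hypothetical case the suspension ends up at 400 mL, which is within the 1–2 volumes described in step 5 for a 200-mL culture.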
References
1. Knee E, Rivero L, Crist D, Grotewold E,
Scholl R (2011) Germplasm and molecular
resources. In: Schmidt R, Bancroft I (eds)
Plant genetics and genomics: crops and models, vol 9, Genetics and genomics of the
Brassicaceae. Springer, New York, pp 437–467
2. Murashige T, Skoog F (1962) A revised medium
for rapid growth and bio assays with tobacco
tissue cultures. Physiol Plant 15:473–497
3. Axelos M, Curie C, Mazzolini L, Bardet C,
Lescure B (1992) A protocol for transient
gene-expression in Arabidopsis-thaliana protoplasts isolated from cell-suspension cultures.
Plant Physiol Biochem 30:123–128
4. Yamada H, Koizumi N, Nakamichi N, Kiba T,
Yamashino T, Mizuno T (2004) Rapid response
of Arabidopsis T87 cultured cells to cytokinin
through His-to-Asp phosphorelay signal transduction. Biosci Biotechnol Biochem 68:1966–1976
5. Nakamichi N, Matsushika A, Yamashino T,
Mizuno T (2003) Cell autonomous circadian
waves of the APRR1/TOC1 quintet in an
established cell line of Arabidopsis thaliana.
Plant Cell Physiol 44:360–365
6. Alonso AP, Piasecki RJ, Wang Y, LaClair RW,
Shachar-Hill Y (2010) Quantifying the labeling
and the levels of plant cell wall precursors using
ion chromatography tandem mass spectrometry. Plant Physiol 153:915–924
7. Ogawa Y, Dansako T, Yano K, Sakurai N,
Suzuki H, Aoki K, Noji M, Saito K, Shibata D
(2008) Efficient and high-throughput vector
construction and Agrobacterium-mediated
transformation of Arabidopsis thaliana
suspension-cultured cells for functional genomics. Plant Cell Physiol 49:242–250
8. Menges M, Murray JA (2002) Synchronous
Arabidopsis suspension cultures for analysis of
cell-cycle gene activity. Plant J 30:203–212
9. Walters C (1998) Understanding the mechanisms and kinetics of seed aging. Seed Sci Res
8:223–244
10. Walters C (1998) Ultra-dry seed storage. Seed
Sci Res 8:1–73
11. Rivero-Lepinckas L, Crist D, Scholl R (2006)
Growth of plants and preservation of seeds. In:
Salinas J, Sanchez-Serrano JJ (eds) Methods in
molecular biology, vol 323, Arabidopsis protocols. Humana, Totowa, NJ, pp 3–12
12. Rao NK, Hanson J, Dulloo ME, Ghosh K,
Nowell D, Larinde M (2006) Manual of seed
handling in genebanks. Bioversity International,
Rome
13. Hong TD, Ellis RH (1996) A protocol to
determine seed storage behavior. IPGRI, Rome
14. FAO and IPGRI (1994) Genebank standards.
FAO, IPGRI, Rome, pp 1–8
15. Ooms J, Leon-Kloosterziel KM, Bartels D,
Koornneef M, Karssen CM (1993) Acquisition
of desiccation tolerance and longevity in seeds
of Arabidopsis thaliana (a comparative study
using abscisic acid-insensitive abi3 mutants).
Plant Physiol 102:1185–1191
16. Hay FR, Mead A, Manger K, Wilson FJ (2003)
One-step analysis of seed storage data and the
longevity of Arabidopsis thaliana seeds. J Exp
Bot 54:993–1011
17. Koornneef M (1994) Arabidopsis genetics. In:
Meyerowitz E, Somerville C (eds) Arabidopsis.
Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY, pp 89–120
18. Nasrallah J (2011) Self-incompatibility in the
brassicaceae. In: Schmidt R, Bancroft I (eds)
Plant genetics and genomics: crops and
models, vol 9, Genetics and genomics of
the Brassicaceae. Springer, New York,
pp 389–411
19. Weigel D, Glazebrook J (2002) Genetic analysis of mutants. In: Arabidopsis—a laboratory
manual. Cold Spring Harbor Laboratory Press,
New York, pp 41–53
20. Feldmann KA, Marks MD (1987) Agrobacterium-mediated transformation of germinating seeds of Arabidopsis thaliana: a non-tissue culture approach. Mol Gen Genet 208:1–9
21. Bechtold N, Ellis J, Pelletier G (1993)
In-Planta Agrobacterium-mediated gene-transfer
by infiltration of adult Arabidopsis-thaliana
plants. C R Acad Sci III–VIe 316:1194–1199
22. Clough SJ, Bent AF (1998) Floral dip: a simplified method for Agrobacterium-mediated
transformation of Arabidopsis thaliana. Plant J
16:735–743
23. Valvekens D, Vanmontagu M, Vanlijsebettens
M (1988) Agrobacterium-tumefaciens-mediated
transformation of Arabidopsis-thaliana root
explants by using kanamycin selection. Proc Natl
Acad Sci U S A 85:5536–5540
24. Schenk N, Hsiao K-C, Bornman CH (1991)
Avoidance of precipitation and carbohydrate
breakdown in autoclaved plant tissue culture
media. Plant Cell Rep 10:115–119
25. Pischke MS, Huttlin EL, Hegeman AD,
Sussman MR (2006) A transcriptome-based
characterization of habituation in plant tissue
culture. Plant Physiol 140:1255–1278
26. Estelle MA, Somerville C (1987) Auxin-resistant mutants of Arabidopsis thaliana with altered morphology. Mol Gen Genet 206:200–206
27. Weigel D, Glazebrook J (2002) Arabidopsis: a
laboratory manual. Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY
28. Zhang X, Henriques R, Lin SS, Niu QW, Chua
NH (2006) Agrobacterium-mediated transformation of Arabidopsis thaliana using the floral
dip method. Nat Protoc 1:641–646
Chapter 2
Using Arabidopsis-Related Model Species (ARMS): Growth,
Genetic Transformation, and Comparative Genomics
Giorgia Batelli, Dong-Ha Oh, Matilde Paino D’Urzo,
Francesco Orsini, Maheshi Dassanayake, Jian-Kang Zhu,
Hans J. Bohnert, Ray A. Bressan, and Albino Maggio
Abstract
The Arabidopsis-related model species (ARMS) Thellungiella salsuginea and Thellungiella parvula have
generated broad interest in salt stress research. While general growth characteristics of these species are
similar to Arabidopsis, some aspects of their life cycle require particular attention in order to obtain healthy
plants, with a large production of seeds in a relatively short time. This chapter describes basic procedures
for growth, maintenance, and Agrobacterium-mediated transformation of ARMS. Where appropriate, differences in requirements between Thellungiella spp. and Arabidopsis are highlighted, along with basic
growth requirements of other less studied candidate model species. Current techniques for comparative
genomics analysis between Arabidopsis and ARMS are also described in detail.
Key words Thellungiella spp., Halophytes, Germination, Seed handling, Vernalization, Plant care
1 Introduction
Over the past few decades, a tremendous advance in our understanding of molecular and cellular responses to abiotic stresses has
taken place using the model species Arabidopsis. Forward and
reverse genetics approaches, combined with thorough functional
analysis of many isolated genes, as well as biochemical characterization of key stress tolerance proteins have allowed us to characterize
quite accurately many responses to salt and osmotic stresses
(reviewed in refs. 1–3).
Although Arabidopsis has contributed to the unraveling of
complex essential mechanisms that allow plants to cope with salt
stress [4], it has failed to reveal the key determinants that, in natural environments, render some plants (halophytes) more tolerant of salinity than others (glycophytes). Halophytes from different families have been studied over the past decades, including
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_2, © Springer Science+Business Media New York 2014
species, for example, belonging to the genera Atriplex, Suaeda,
Salicornia, and Mesembryanthemum; monocotyledonous species
such as Spartina and Puccinellia spp.; and mangroves belonging
to the genera Avicennia and Rhizophora ([5], reviewed in ref. 6).
The study of halophytic species has led to a partial understanding
of the different physiological and morphological strategies used
by plants to withstand harsh conditions. However, the paucity of
suitable molecular genetics techniques has, to a great extent, prevented the identification of the genetic bases for salt tolerance in
halophytes [6]. Genetic studies on halophytic species are very limited [7], and the potential of this resource of natural salt tolerance
has remained largely unexplored [6, 8–11]. Recently, Thellungiella
salsuginea (salt cress), previously referred to as Thellungiella
halophila, and its close relative Thellungiella parvula [12, 13] have
been proposed as model systems for the study of halophytic traits
[14–17]. Compared to other halophytes, Thellungiella spp. exhibit
lower levels of tolerance but may still be considered true halophytes [6]. The relatively short life cycle and other traits important
to efficient experimentation together with their close relatedness to
Arabidopsis (92 % of average sequence identity with Arabidopsis
thaliana for T. salsuginea) have made them preferred species as
extremophile plant model systems [11, 14, 15, 18]. Since the initial introduction of T. salsuginea as model system [14, 15, 18],
remarkable progress has been made in the elucidation of morphological, physiological, and molecular traits that differentiate this
species from the close relative Arabidopsis [11]. Such distinctive
traits include more succulent and waxy leaves [15, 19, 20], the
presence of extra layers of leaf palisade cells and root endodermis
and cortex layers compared to Arabidopsis [19], a higher content
of compatible osmolytes in both control and salt stress conditions
[11, 19, 21], and a higher capability to efficiently restrict the Na+
influx into the roots [11, 22, 23]. Additional distinctive mechanisms
of protection from excess salt in Thellungiella may include a more
efficient regulation of Na+ fluxes at both plasma membrane [24, 25]
and tonoplast levels [26]. These features indicate that Thellungiella
is preadapted and therefore “more prepared” to efficiently tailor its
response to salt stress.
The availability of these resources, coupled with the feasibility
of forward and reverse genetics studies in Thellungiella spp. which
can be compared to its close genetic relative (Arabidopsis thaliana), has certainly opened new avenues towards a better understanding of the fundamental mechanisms of plant salt tolerance.
1.1 Growth and Maintenance of Thellungiella spp.
T. salsuginea and T. parvula are very similar to Arabidopsis in terms
of growth and maintenance requirements, and they can easily be
grown in growth chambers and greenhouses. Compared to
Arabidopsis, however, the life cycle of Thellungiella spp. is longer
and, for T. salsuginea, a long vernalization period is required for
flowering. Seed maturation in these species is more asynchronous than in Arabidopsis; therefore, extra care should be taken in experimental procedures such as the recovery of transformants, which
may not be included in the initial wave of germination.
Subheading 3.1 describes procedures of growth and maintenance
of Thellungiella spp. that are critical for obtaining healthy plants
and high-quality seeds. Less studied halophytic species can also be
considered as valuable model systems [16]. For these, basic growth
requirements are briefly presented in Subheading 3.2.
1.2 Genetic Transformation Using Agrobacterium
Like A. thaliana, Thellungiella spp. can be efficiently transformed via
Agrobacterium-mediated T-DNA transfer using the simple,
straightforward method of the floral dip [27] or, similarly, by
spraying flowers with an Agrobacterium suspension ([14, 15, 19],
Paino D’Urzo and Bressan, unpublished). However, the prolonged
asynchronous flowering process in Thellungiella spp. requires
several repeated rounds of transformation in order to ensure a high
percentage of transformants. Subheading 3.3 of this chapter
describes a method for large-scale Agrobacterium-mediated transformation to generate collections of T-DNA insertional mutants.
1.3 Comparative Genomics of ARMS
Since the completion of the A. thaliana genome sequence in 2000 [28], vast amounts of
genetic data have been accumulated and analyzed for this species.
This makes the genomes and transcriptomes of Arabidopsis ecotypes and Arabidopsis-related crucifers particularly suitable
resources for comparative studies. Several comparative gene expression analyses using ESTs produced from plants exposed to various
stress conditions, quantitative real-time PCRs, and different types
of microarrays have confirmed [10, 21, 29–32] the presence of
potentially important and distinctive paralogs that may mediate
mechanisms of stress adaptation in Thellungiella [11].
Recent advances in next-generation sequencing technology have
enabled and accelerated the sequencing and assembly of the genomes
of non-model species, including crucifers. In 2011, the genomes of
Arabidopsis lyrata [33], T. parvula [34], and Brassica rapa [35] were
published. A first draft of the genome sequence of T. salsuginea,
carried out at the Joint Genome Institute (JGI—US Department
of Energy) under the coordination of Schumaker, Wing, and
Mitchell-Olds, is available online (http://www.phytozome.net/thellungiella.php#A, [17]), and the Capsella rubella genome
(http://tinyurl.com/jgi-plans) is currently being sequenced. The
analysis of the T. parvula genome has highlighted the presence of
over 3,000 predicted open reading frames (ORFs) without BLAST
hits in A. thaliana, a portion of which may represent novel stress
tolerance genes, as well as additional, i.e., duplicated, copies of
Arabidopsis genes known to be important to stress responses such
as HKT1 and SnRK2s [34]. The Gene Ontology (GO) classification of the over 28,000 predicted ORFs of T. parvula has also
shown that subcategories of the “biological process” category were
over- (“response to abiotic or biotic stimulus”) or underrepresented (“signal transduction”) in T. parvula compared to
Arabidopsis, suggesting a different strategy of response to abiotic
stresses in Thellungiella compared to Arabidopsis [34].
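Over- or underrepresentation of a GO subcategory between two genomes is typically judged with a test on a 2 × 2 table of gene counts. The chapter does not state which test was used in [34]; the Python sketch below swaps in a one-sided Fisher's exact test as one common choice. The function names and the tiny table in the example are hypothetical and are not data from the T. parvula analysis.

```python
from math import exp, lgamma


def _log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_one_sided(a: int, b: int, c: int, d: int,
                     alternative: str = "greater") -> float:
    """One-sided Fisher's exact test on the table [[a, b], [c, d]].

    a, b: genes in / not in the category for the first species
    c, d: genes in / not in the category for the second species
    "greater" tests overrepresentation in the first species,
    "less" tests underrepresentation.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    lowest = max(0, row1 + col1 - total)
    highest = min(row1, col1)
    if alternative == "greater":
        counts = range(a, highest + 1)
    elif alternative == "less":
        counts = range(lowest, a + 1)
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    log_denominator = _log_comb(total, row1)
    p = sum(exp(_log_comb(col1, x) + _log_comb(total - col1, row1 - x)
                - log_denominator) for x in counts)
    return min(1.0, p)


# Hypothetical toy table, small enough to check by hand (17/70).
print(round(fisher_one_sided(3, 1, 1, 3), 4))  # 0.2429
```

Working with log-gamma values keeps the calculation stable for genome-scale counts; when many categories are tested, a multiple-testing correction would also be needed.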
In Subheading 3.4, the tools and resources for comparative
genomics in crucifers are described with the recently published
T. parvula genome as an example [34]. T. parvula has a genome
slightly larger than that of A. thaliana, distributed in seven pairs of chromosomes [36]. With combinations of 454 and Illumina platforms, and
assemblers based on different algorithms, contigs of chromosome-arm length (N50 = 5.29 Mb) were produced. Version 2.0 of the
genome sequence and annotation is available through http://thellungiella.org/.
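The N50 value quoted for the assembly is the contig length at which contigs of that length or longer contain at least half of the assembled bases. The following Python sketch shows one way to compute it; the function name and the contig lengths are illustrative assumptions, not the T. parvula data.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length

# Hypothetical contig lengths in bp, for illustration only.
example_contigs = [8_000_000, 5_000_000, 4_000_000, 2_000_000, 1_000_000]
print(n50(example_contigs))
```

A larger N50 indicates that more of the assembly sits in long contigs, which is why chromosome-arm-length contigs yield a value in the megabase range.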
2 Materials
2.1 Plant Growth
2.1.1 Substrates
1. When sowing in soil, a loose, uniform potting soil is
required. Peat-based commercial mixes ensure good water
retention as well as good drainage.
2. Some specific commercial formulations have added fertilizers
or bio-protectants against pathogens, such as Bacillus subtilis.
3. If root measurements are to be performed, inert/light substrates may
be preferred, such as perlite/vermiculite/light gravel. When
these substrates are used, water retention may be enhanced by
mixing them with coir or other fibrous substrates.
2.1.2 Growing Containers
1. The use of standard 8–10 cm diameter plastic pots is frequently
adequate.
2. If fine substrates are used, a thin filter on the pot bottom may
avoid loss of the growth medium.
3. Plastic tubs such as 90 × 60 × 20 cm can also be used, provided
they are equipped with appropriate drainage systems in order
to avoid water stagnation (www.thellungiella.org).
2.1.3 Water and Nutrients
1. Plants must be watered frequently to maintain a moist root
environment, but flooding must be avoided to reduce the risk
of anoxia.
2. No single watering schedule is appropriate for all situations; watering must be adapted to the specific conditions present.
3. Once a week it is usually appropriate to apply a modified Hoagland
nutrient solution (13.00–18.00 mM N; 0.70–1.50 mM P2O5;
3.00–5.50 mM K2O; 1.50–6.00 mM SO3; 1.25–3.50 mM Mg;
3.25–5.00 mM Ca; 10.00–40.00 μM Fe EDTA; 0.50–1.00 μM
Cu; 4.00–7.00 μM Zn; 15.00–40.00 μM B; 10.00–15.00 μM
Mn; 0.50–1.00 μM Mo).
2.1.4 Pests
1. Bioplasts such as fungi, viruses, and especially insects are often
underestimated deterrents to successful molecular manipulations. Many biological solutions (mainly biopesticides) are
available and proven effective for greenhouse/growth room
environments.
2. Manufacturer instructions should be strictly followed when
handling pesticides, but since Arabidopsis and Thellungiella
spp. are not specifically mentioned in the labels, dose test
experiments might be required to establish optimal conditions,
considering also that specific mutants might respond differently (see Note 1).
2.2 Media for Transformation Using the Floral-Dip Method
2.2.1 Bacterial Growth Media for Agrobacterium-Mediated Transformation
1. Yeast extract peptone (YEP): Yeast extract 10 g/L, peptone
10 g/L, sodium chloride 5 g/L. Adjust pH to 7.0 with 0.1 N
potassium hydroxide (KOH). For plates, add agar 15 g/L.
Autoclave-sterilize, typically for 20 min at 121 °C (steam at
15 psi).
2. LB medium: Tryptone 10 g/L, yeast extract 5 g/L, sodium
chloride 10 g/L. Adjust pH to 7.0 with 0.1 N potassium
hydroxide (KOH). For plates, add agar 15 g/L. Autoclave-sterilize, typically for 20 min at 121 °C (steam at 15 psi).
3. Antibiotics and other heat-labile substances are added after
the medium has cooled to 55 °C in a water bath, before pouring
it into a suitable container (Petri dishes or other).
4. Antibiotic dosage: Kanamycin 50 mg/L; rifampicin 30 mg/L;
gentamicin 30 mg/L; ticarcillin 30 mg/L. Use as required.
2.2.2 Agrobacterium Infiltration Medium
MS salt (1/2×) 2.2 g/L, B5 vitamins (1×), sucrose 50 g/L, MES
0.5 g/L, N6-benzylaminopurine (BA) 0.01 mg/L, Silwet L-77
200 μL/L. Adjust to pH 5.7.
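Because the recipe is given per liter, the amounts for any batch size follow by simple scaling. The Python sketch below is only an illustration: the dictionary and function names are assumptions, and B5 vitamins are left out because the volume to add depends on the concentration of the vitamin stock in use.

```python
# Amounts per liter of infiltration medium, taken from the recipe above.
# B5 vitamins (1x) are omitted: the volume depends on the stock concentration.
PER_LITER = {
    "MS salts (1/2x)": (2.2, "g"),
    "sucrose": (50.0, "g"),
    "MES": (0.5, "g"),
    "BA": (0.01, "mg"),
    "Silwet L-77": (200.0, "uL"),
}

def scale_recipe(volume_l):
    """Return {component: (amount, unit)} for the requested volume in liters."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: (amount * volume_l, unit)
            for name, (amount, unit) in PER_LITER.items()}

# Example: amounts for 250 mL of medium.
for name, (amount, unit) in scale_recipe(0.25).items():
    print(f"{name}: {amount:g} {unit}")
```

The pH adjustment to 5.7 is still done on the final solution and is not covered by the calculation.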
2.2.3 Antibiotics Preparation
Rifampicin, kanamycin, gentamicin, and ticarcillin antibiotic
stocks: these compounds are heat labile and cannot be sterilized by
autoclaving.
1. Kanamycin (30 mg/mL)—(dissolve 300 mg in 10 mL H2O).
2. Ticarcillin (100 mg/mL)—(dissolve 1.0 g in 10 mL H2O).
3. Gentamicin (30 mg/mL)—(dissolve 300 mg in 10 mL H2O).
4. Rifampicin (30 mg/mL)—(dissolve 300 mg in 10 mL of
methanol).
5. Filter-sterilize using a syringe and a 0.22 μm membrane filter.
Aliquot into 1 mL samples and store up to 3 months at
−20 °C.
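The volume of stock needed to reach the working concentrations listed in Subheading 2.2.1 follows from the stock concentrations above (stock volume = working concentration × medium volume / stock concentration). The Python sketch below illustrates the arithmetic; the names are assumptions made for this example.

```python
# Stock concentrations (mg/mL) and working concentrations (mg/L) from the text.
STOCK_MG_PER_ML = {"kanamycin": 30.0, "ticarcillin": 100.0,
                   "gentamicin": 30.0, "rifampicin": 30.0}
WORKING_MG_PER_L = {"kanamycin": 50.0, "rifampicin": 30.0,
                    "gentamicin": 30.0, "ticarcillin": 30.0}

def stock_volume_ul(antibiotic, medium_ml):
    """Microliters of stock to add to medium_ml of cooled medium."""
    mg_needed = WORKING_MG_PER_L[antibiotic] * medium_ml / 1000.0
    ml_of_stock = mg_needed / STOCK_MG_PER_ML[antibiotic]
    return ml_of_stock * 1000.0

# Example: volumes for 500 mL of medium.
for name in WORKING_MG_PER_L:
    print(f"{name}: {stock_volume_ul(name, 500):.0f} uL")
```

For 500 mL of medium this gives 500 μL of the rifampicin stock and 150 μL of the ticarcillin stock.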
2.3 Comparative Genomics
We list programs that aid in the comparison and viewing of genome
sequences. It is, however, to be understood that the computational
tools for the representation and analysis of genome sequences
undergo rapid development and changes.
1. Nucmer, included in the MUMmer package [37], is
suitable for identifying global colinearity between two
genomes. For installation and documentation, see http://mummer.sourceforge.net/. A computer with a UNIX operating
system is required.
2. Circos is a visualization tool for comparative genomics [38]. For
installation and a tutorial, see http://circos.ca/. Also useful is
the Google discussion group: http://groups.google.com/group/circos-data-visualization.
3. The MAUVE [39] sequence alignment tool is suitable for identifying synteny as well as chromosome-scale inversions. For
installation and documentation, see http://gel.ahabs.wisc.edu/mauve/.
4. Genome-wide as well as localized comparisons of sequences
can be performed using the comparative genomics platform
available at CoGe (http://genomevolution.org/CoGe/).
3 Methods
3.1 Growth and Maintenance of Thellungiella spp.
1. Whereas the Arabidopsis cycle can be completed in 6–10 weeks,
T. salsuginea requires from 16 to 20 weeks from sowing to
harvest, in comparison to 12–16 weeks for T. parvula (Fig. 1).
2. Growing conditions that heavily influence flowering time and
life cycle include light (short/long day), temperature, watering
and nutrition, plant density, containers, presence of pests,
and the type of facilities used, e.g., growth chambers vs.
greenhouse.
Fig. 1 (a) Adult plants of Arabidopsis thaliana and Thellungiella parvula. (b) An adult flowering plant of
Thellungiella parvula
3. When optimal conditions are maintained as uniformly and
consistently as possible, shorter harvesting times and higher
seed quality result. Any prolonged stress will result in weak,
unhealthy plants, delayed and poorer harvest, or outright
plant losses.
4. It is crucial to understand that general plant health has a much
larger effect on Agrobacterium-based transformation of
Thellungiella than of Arabidopsis. The activation of the
innate immune response significantly controls the ability of
Agrobacterium to successfully mediate gene transfer in
Arabidopsis. Because Thellungiella species are perennial-like
(they continue growth after flowering), it is tempting to use old plants that continue to flower; however, because of
previous stress and pathogen episodes (root aphids are a
common example), their transformation frequency will be very low.
5. Both T. salsuginea and T. parvula show a greater degree of
seed germination variability than A. thaliana. This is related to
the higher percentage of dormant seeds generally present in
Thellungiella spp. It is therefore good practice to stratify seeds
for about 1 week at 4 °C in the dark and to work with
seeds of uniform age and good quality.
6. Cold treatment of dry seeds is not effective, whereas seeds
maintained in a constant moist environment (either water
suspension or moist soil) for 7 days will germinate promptly
and more uniformly.
3.1.1 Seed Storage and Preservation
1. Seeds should be dried to a moisture content of 5–6 %, by
air-drying for about 4 weeks or in desiccators with Drierite or
silica gel for 3 or 4 days.
2. Seeds are then commonly stored at room temperature (24–27 °C) in scintillation vials or paper envelopes kept in desiccators.
Under these conditions, seeds remain viable for at least 3 years, and
in our experience Thellungiella seeds can retain viability
for 10–12 years. For longer storage, seeds should be
sealed in moisture-proof containers and kept at 4 °C.
3.1.2 Seed Germination
1. For sowing, seeds are mixed with sand and distributed on the
soil using a salt/pepper shaker to facilitate uniform dispersion.
Alternatively, seeds kept at 4 °C in water in the dark for 7 days are
further diluted in abundant water and uniformly distributed
on soil using a squeeze bottle.
2. Final density of sowing depends on experimental purposes and
dictates the ratio of seeds and sand/water to be used.
3. After making sure the mixture of seeds and sand (or the dispersion
of seeds in water) is homogeneous, proceed with sowing on well-watered soil.
4. For low density in small pots or limited surfaces (10–20 plants
in 4–5 in. pots or in the case of celled trays), dry seeds placed
on a piece of paper can be effectively dispersed by tapping.
5. Seeds should not be covered with soil. Pots or containers
sowed with dry seeds on moist soil can then be placed at 4 °C
in the dark for 7–14 days.
6. Containers should be checked periodically and moved out of
the cold room as soon as germination is achieved. This will
avoid etiolation of the plantlets.
3.1.3 Growth Conditions
1. Temperature requirements do not differ greatly between T. salsuginea
and T. parvula; 24 °C during the day and
18 °C at night, with a 16 h photoperiod and a light intensity of 130–150 μmol/m² s, is adequate. Both species grow well
even under the wider temperature and light ranges typically encountered in a greenhouse compared to a growth room or growth
chamber, since plants in nature undergo wider day/night fluctuations than those experienced in controlled environments.
Arabidopsis, as well as Thellungiella spp., adapt well to more
uniform regimens.
2. T. parvula benefits from intense photosynthetically active radiation.
Additional lighting might be required in the greenhouse
mainly to control photoperiod, depending also on the season
and the location.
3. Plants should be watered regularly and thoroughly from
above, with a gentle shower, or by infiltration from the bottom. To reduce fungal and algal growth and infestation by
fungus gnats (Mycetophilidae and Sciaridae), containers and soil
should drain excess water well, and the soil surface should
be allowed to dry between waterings.
4. Young plantlets will require thinning at about 3 weeks from germination. This will allow the remaining plants to grow stronger.
3.1.4 Vernalization
1. T. salsuginea requires vernalization. In order to promote uniform flowering, plants of 4–5 cm in diameter (Fig. 2b) should
be well watered and then placed at 4 °C for 21–28 days, with a
16-h photoperiod. Plants should not require much care at this
stage. The use of plastic domes (generally utilized for propagation) can reduce dehydration and watering intervention.
2. Vernalization can be initiated any time after germination,
but in general older plants require longer vernalization times.
After vernalization, plants are placed at normal growing
temperature and watered regularly (see Note 2).
Fig. 2 Thellungiella salsuginea plants resistant to glufosinate. (a) Glufosinate-resistant
seedlings of T. salsuginea transformed with vector pSKI15,
15 days after treatment with 5 mg/L glufosinate. (b) An aerial view of flats of young
T. salsuginea seedlings resistant to glufosinate after transfer from the selection
tray (source: http://thellungiella.org/)
3.1.5 Post-flowering Maintenance
1. T. salsuginea bolts and grows upright, while T. parvula has a
recumbent habit (Fig. 1). It is helpful to train older plants of
both species, which can form numerous branches, by tying them
to wooden skewers. Tied bundles also facilitate final harvest.
When plants begin to dry, whole stalks can be cut for seed
collection.
2. It is important to recognize that Thellungiella seeds will mature
much more asynchronously than Arabidopsis. It is advisable to
repeat harvest as needed (see Note 3).
3. Regular and frequent watering is required during flowering
and until siliques are well formed (see Note 4).
4. Watering should be gradually decreased as plants approach
senescence.
3.1.6 Seed Harvest
1. Harvest generally occurs 2–4 weeks after termination of watering.
When appropriate, plants can be harvested in bulk by cutting
the stalks at the base and leaving them undisturbed to dry completely on large sheets of paper (brown packing type).
2. Seeds are threshed by hand rolling and cleaned through sieves
and strainers. Common tea strainers are very useful. Several
passages through different size sieves might be necessary to
clean seeds from all residues and debris.
3.1.7 Pests
1. Scouting for pests should be done regularly, since high density of
plants of the same species in a controlled environment, at optimal growing conditions, in the absence of natural antagonists,
increases the chance of pest attacks. Early detection and prompt
intervention are critical to avoid major pest explosions.
2. Good watering and fertilization regimens, as well as adequate
ventilation, are essential to ensure healthy plants. Vigorous
plants are less susceptible to pests and diseases and will also
respond better to treatments.
3. A short list of the most common problems we encountered and
effective measures to solve them follows, with notes on
specific biological antagonists (see Note 5).
(a) Powdery mildew (Erysiphe spp.)—sulfur is effective.
(b) Fungus gnats (Mycetophilidae and Sciaridae)—controlled
by the nematode Steinernema feltiae and by Bacillus
thuringiensis israelensis.
(c) Thrips (Thysanoptera)—generally not as lethal as on
Arabidopsis, but still require care and, in case of heavy infestation, spraying with insecticides
approved for thrips. The predatory mites Neoseiulus cucumeris and Hypoaspis miles have
proven effective.
(d) Aphids—a prompt intervention is key, as well as all measures
aimed at limiting insect presence in the greenhouse (screens
on all openings, reduced traffic, use of coats when entering
each greenhouse contained area). Insecticidal soaps are helpful, though repeated or too extensive treatments can damage
tender parts of the plants (inflorescences).
(e) Root aphids—since they affect roots, they are not as easily
detected as other aphids. Symptoms are pronounced leaf
yellowing and slow growth. Plants stop developing, with
consequent dramatic reductions in seed maturation and yield
(see Note 6).
3.2 Other Candidate Halophytic/Extremophile Model Species
Whereas Thellungiella spp. have so far been the most studied
Arabidopsis relatives, other related crucifers are attracting
increasing interest due to specific characteristics, such as high tolerance to heavy metals in Thlaspi spp. [40–43]. A short description
of some of the most promising candidates is provided, describing
main features of their life cycle and plant development. Table 1
summarizes light, photoperiod, temperature, and watering requirements for the described species.
3.2.1 Barbarea verna
B. verna is a biennial herb native to Eastern Europe and southwestern Asia, usually found in damp soils, roadsides, or waste
places.
1. Seeds should be planted shallowly, at about 1 cm depth.
Germination occurs in 1–2 weeks from sowing, while in
4 weeks about two to four true leaves are found.
2. Flowering starts about 6–7 weeks from sowing, when the plant
has about ten leaves.
3. At full maturity, the plant may reach size of 0.3 m width and
0.3 m height.
4. Flowers are hermaphrodite and the plant is self-fertile.
3.2.2 Capsella bursa-pastoris
C. bursa-pastoris, also known as shepherd’s purse, is an annual plant
native to Eastern Europe, usually found in arable lands, waste
areas, and road margins.
1. Seeds will germinate in 1–2 weeks from sowing and will present
2–6 leaves in 4 weeks and about 15–20 leaves in 6 weeks, when
first flowers may appear.
2. At full maturity, the plant may reach size of 0.3 m width and
0.2–0.5 m height.
3. Flowers are hermaphrodite and the plant is self-fertile.
3.2.3 Descurainia pinnata
D. pinnata is an annual or biennial plant, native to desert regions
from Nevada southward into north central and northwestern
Mexico. It is also native to deserts of North Africa and the Middle
East. It is usually found in sandy fields, gravel, white saline areas,
dunes, open desert, waste ground, disturbed sites, open woods,
prairies, glades, roadsides, and railroads. It may grow in sterile
soils, such as sandy or gravelly ones, although in fertile soil the plant will
be larger in size.
1. Germination occurs in about 2 weeks from sowing, and the
plant will develop a rosette in about 4–6 weeks, with stems on
which flowers will later appear.
2. At this stage, blooming will start and last about 2 months. At full
maturity, the plant may reach size of 0.3 m width and 0.6 m
height.
3. Flowers are hermaphrodite and the plant is self-fertile.
4. Plants exhibit extreme soil desiccation tolerance, based almost
entirely on root growth characteristics.
Species                    Common name
Barbarea verna             Yellow flower, winter cress
Capsella bursa-pastoris    Shepherd’s purse
Descurainia pinnata        Western tansy mustard
Hirschfeldia incana        Conil yellow, Mediterranean
Lepidium spp.              Common pepperweed, prairie peppergrass, Virginia pepperweed
Malcolmia triloba          Conil blue
Sisymbrium officinale      Hedge mustard
Thlaspi arvense            Pennycress
Arabidopsis lyrata         Northern rock cress
Cakile maritima            Sea rocket
250–1,000
n=9
2n = 18
2n = 16
n=7
2n = 14
n=7
2n = 14
n = 7, 14
2n = 28
n = 16
2n = 32
2n = 14
n=7
n=7
2n = 28
500–800
250–400
250–1,000
500–1,000
500–1,000
100–1,000
500–1,000
500–1,000
16/8
16/8
16/8
16/8
16/8
16/8
16/8
16/8
16/8
16/8
12–25
8–16
12–20
14–24
12–20
15–24
16–24
18–24
14–24
14–24
[16, 44, 47, 49,
50]
[16, 40–42, 44]
[16, 44, 48]
[16, 40, 44–47]
[16, 44]
References
Not frequent/scarce
Frequent
Frequent
Frequent, but avoid flooding
[65–69]
[33, 61–64]
[16, 43, 44, 47]
[16, 44, 47]
Frequent, but avoid flooding, [16, 44]
may tolerate mild drought
Can tolerate mild to severe
drought stress
Can tolerate mild drought
stress, avoid flooding
Can tolerate drought stress
Can tolerate mild drought
stress
Frequent, avoid flooding
Radiation
Photoperiod Temperature
(μmol/m2 s) (h light/dark) (°C, min–max) Watering
n = 8, 16
250–1,000
2n = 16, 32
n=8
2n = 16
Ploidy
Table 1
Other candidate halophytic/extremophile model species
3.2.4 Hirschfeldia incana
H. incana is a perennial plant, native to the Mediterranean basin,
usually found in waste places, roadsides, and canyons.
1. Germination occurs in 1–2 weeks from sowing, with development of a rosette of lobed leaves within 4–5 weeks, from which
stems will develop, covered by dense, soft, and white hairs.
2. Flowers will set between 6 and 12 weeks from sowing and
blooming may last a few months.
3. At full maturity, the plant may reach size of 0.5 m width and
1 m height.
4. Flowers are hermaphrodite and the plant is self-fertile.
3.2.5 Lepidium spp.
Lepidium spp. are perennial plants native to Eurasia that have spread to all
continents except Antarctica. They are usually found in sandy soil,
waste places, coastal regions, sea cliffs, dry creek beds, and dry plains.
1. Germination occurs in 1 week from sowing.
2. The plant reaches full size (0.10–0.50 m tall) within 8–10
weeks, in the shape of a rosette of lobed leaves from which the
flowering stems will develop.
3. Flowers are hermaphrodite and the plant is self-fertile.
4. Plants show considerable salt tolerance, close to that of Thellungiella [16].
3.2.6 Malcolmia triloba
M. triloba is an annual plant native to Asia and the
Mediterranean region, usually found in waste and disturbed
areas and gravel pits.
1. Seeds germinate in 1–2 weeks.
2. Flowers will appear in 6–8 weeks, when plants reach their
full size (0.15–0.50 m height).
3. Flowers are hermaphrodite and the plant is self-fertile.
3.2.7 Sisymbrium officinale
S. officinale is an annual or biennial plant native to the
Mediterranean region and usually found in disturbed sites.
1. Germination occurs in 1–2 weeks.
2. Plants reach full size (up to 1 m height) in 5–8 weeks, when
flowering starts. Blooming lasts 2 months.
3. Flowers are hermaphrodite and the plant is self-fertile.
3.2.8 Thlaspi arvense
T. arvense is an annual plant native to central and western Asia,
usually found in roadsides and waste places.
1. Germination will occur in 2 weeks, and the plant will develop
the basal rosette of glabrous leaves within 4–6 weeks.
2. Flowering will start 8–12 weeks after sowing, when plants have
reached their full size (about 0.75 m height).
3. Flowers are hermaphrodite and the plant is self-fertile.
4. Thlaspi species are notably tolerant of heavy metals.
3.2.9 Arabidopsis lyrata
A. lyrata (also known as northern rock cress) is the closest well-studied relative of A. thaliana. It may complete its cycle within a
single season, but is normally a perennial. Native to cool temperate
areas around the Arctic, it is usually found in disturbed habitats, with
scarce vegetative competition, such as humid rocky places, coastal
cliffs, pine forests, or sandbars.
1. Germination time is within 2–3 weeks from sowing and the
plant presents a simple rosette within 4–6 additional weeks.
2. It may reproduce vegetatively via stolons or sexually via insect
pollination, producing a high number of seeds.
3. Flowering starts 8–12 weeks after sowing.
4. Unlike A. thaliana, A. lyrata plants are outcrossing
diploids.
5. The A. lyrata genome has been sequenced.
3.2.10 Cakile maritima
C. maritima (also known as sea rocket) is an annual plant that sometimes
behaves as a perennial. Native to Europe, it is an invasive species in
North America that grows readily along the coast, often in sand
dunes.
1. Germination occurs in 2–3 weeks.
2. Flowers will set after 6–8 weeks, when plants reach their full
size of 0.3 m height.
3. The plant is easily grown on a well-drained sandy soil at high
solar radiation and can tolerate salt exposure.
3.3 Genetic Transformation Using Agrobacterium
3.3.1 Agrobacterium Transformation
1. Agrobacterium tumefaciens transformation of T. parvula and
T. salsuginea can be carried out following the floral-dip method
widely used for Arabidopsis [27].
2. Agrobacterium-mediated transformation has been successfully
achieved for both species, with similar efficiencies
(0.1–2 %), by either floral-dip or spraying techniques, using
the bacterial strain GV3101.
3. Since the aim of both techniques is to infect the maximum
number of unopened floral buds, and considering that flowering
is not synchronous, several Agrobacterium treatments are
required, generally at 3–5 day intervals.
4. For random activation tagging mutagenesis, the vector pSKI15
[51] is used.
5. Start from a frozen glycerol stock of A. tumefaciens GV3101
(pMP90RK) (C58 derivative) stored at −80 °C.
6. The Agrobacterium strain GV3101 was transformed with a
binary vector (pSKI15) for activation T-DNA insertional
mutagenesis. The pSKI15 plasmid contains four transcriptional
enhancers derived from the cauliflower mosaic virus (CaMV)
35S RNA promoter cloned in tandem near the right border
sequence and an expression cassette for herbicide resistance
(bar gene, encoding phosphinothricin acetyltransferase).
7. Agrobacterium selectable markers are:
(a) Resistance to ampicillin/ticarcillin/carbenicillin (pSKI15).
(b) Resistance to gentamicin (Ti plasmid).
(c) Resistance to kanamycin (Ti plasmid).
(d) Resistance to rifampicin (GV3101).
8. The T-DNA also contains a bacterial origin of replication
(oriC) for plasmid rescue in Escherichia coli.
9. To culture Agrobacterium for transforming plants, chip off
pieces of frozen culture with a 200 μL pipette tip from the
−80 °C Agrobacterium stock and inoculate 5 mL YEP or LB
medium plus appropriate antibiotics in culture tubes
(25 × 150 mm).
10. Incubate on a shaker in the dark at 28 °C for 24 h at 230 rpm.
The medium should look saturated (cloudy, OD600 = 1.5–2.0).
11. Add 3 mL of culture to a larger amount of medium (YEP or
LB + antibiotics) in a flask, according to the amount of plants
to transform. 500 mL of grown culture are sufficient for floral
dipping of three pots of 5 in. in diameter containing ten plants
each. The volume of medium should be no more than 1/5 of
the volume of the flask, to assure proper ventilation during
shaking.
12. Incubate on a shaker at 28 °C and 230 rpm to an OD600
of 1.5–2.0 (16–18 h).
13. Centrifuge the culture to form a pellet (4,500 × g for 20 min).
Decant the supernatant and add infiltration medium (see Subheading 2.2.2)
to the bottle, using about half of the original
culture volume.
14. Resuspend the pellet completely by vigorous shaking and
dilute the suspension with infiltration medium to a final OD600
of 0.8–1.
15. Proceed with floral dip or plant spraying.
16. To prevent the Agrobacterium infiltration solution from drying rapidly on the flowers, plants should be protected
from air movement and covered for 24–36 h (flowers should
stay wet for at least 24 h). This can be done in situ by covering
the plants with plastic sheets, or by moving plants to closed cabinets
or similar structures (see Note 7).
17. Post-transformation care is identical to the standard plant
growth conditions described above. As indicated earlier, the
health of the plants has a dramatic effect on transformation
efficiency, and poor health may even cause total failure.
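Two of the steps above involve simple arithmetic: choosing a flask large enough that the medium fills no more than 1/5 of its volume, and diluting the resuspended pellet to a final OD600 of 0.8–1. The Python sketch below illustrates both; it assumes that OD600 scales linearly with dilution, which is only approximately true at high densities, and the function names are hypothetical.

```python
def min_flask_volume_ml(medium_ml):
    """Smallest flask volume so that the medium fills at most 1/5 of it."""
    return 5 * medium_ml

def medium_to_add_ml(suspension_ml, measured_od, target_od=0.8):
    """Infiltration medium to add so the suspension reaches target_od.

    Uses C1 * V1 = C2 * V2 and assumes OD600 is proportional to cell density.
    """
    if measured_od < target_od:
        raise ValueError("suspension is already below the target OD600")
    final_ml = suspension_ml * measured_od / target_od
    return final_ml - suspension_ml

# Example: 500 mL of culture resuspended in 250 mL of infiltration medium,
# with an invented OD600 reading of 2.4 for the suspension.
print(min_flask_volume_ml(500))
print(round(medium_to_add_ml(250, 2.4, target_od=0.8), 1))
```

In practice the OD600 should be measured again after dilution, since readings above the linear range of the spectrophotometer underestimate cell density.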
3.3.2 Identification of T1 Plants Based on Herbicide Resistance Gene Expression
For a large amount of seeds, it is convenient to screen transformants
directly in the soil; hence the use of a herbicide-tolerance selective
marker is advantageous. The following protocol has been extensively utilized for T. salsuginea (and A. thaliana), for the production
of large tag-insertional mutagenesis line collections:
1. Typically, 1 g of seeds harvested from Agrobacterium-treated
plants is uniformly sowed in greenhouse flats (53 × 27 × 5 cm)
with loose soil previously well watered. Use a sand/seeds mixture
(1 part of seed to 2–4 parts of sand) and distribute the seeds
using the salt-pepper shaker method. Proceed as previously
described for stratification and germination.
2. Begin spraying the seedlings with the herbicide immediately
after plantlets form the first pair of true leaves. For herbicide
application, use a final concentration of 5 mg/L of glufosinate
ammonium (active ingredient). One liter of diluted solution is
sufficient for about ten flats (see Note 8).
3. Spray for 3–5 consecutive days each week until clear distinction
between dead vs. surviving plants is visible. Repeated herbicide
treatments may be required depending on plant density and
uneven germination.
4. Transformed plants should continue to grow undisturbed
(Fig. 2a). Mark surviving plants from the earlier germination with
toothpicks while secondary germination is starting. This
helps avoid later selecting non-transformed
plants that escaped the herbicide treatment.
5. Transplant putative transformants to 25 × 25 × 5 cm trays, 25
plants/tray (Fig. 2b).
6. Protect plants from transplant shock by immediately watering
them and covering them with propagation domes and/or by
shading for 1–2 days.
7. Once plants are established (4–5 cm diameter rosette, 4–6
weeks), vernalize at 4 °C, following the vernalization procedure
described above (see Subheading 3.1.4).
8. For large-scale mutagenesis, seeds collected from each tray are
pooled for further study. Alternatively, T1 plants can be isolated
in vitro (see Note 9).
9. The frequency of transformed plants recovered can be determined
by PCR of the insertion plasmid and/or herbicide resistance gene
sequences.
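The herbicide step scales with the number of flats: about 1 L of a 5 mg/L glufosinate ammonium solution per ten flats and per spray. The Python sketch below estimates the volume of spray solution and the amount of commercial product per application. The product concentration is a hypothetical parameter that must be read from the label of the formulation actually used.

```python
FLATS_PER_LITER = 10    # about ten flats per liter of diluted solution
TARGET_MG_PER_L = 5.0   # glufosinate ammonium (active ingredient)

def spray_plan(n_flats, product_g_per_l):
    """Return (liters of spray, mg active ingredient, mL of product) per spray.

    product_g_per_l is the active-ingredient content of the commercial
    formulation (hypothetical input; g/L is numerically equal to mg/mL).
    """
    liters = n_flats / FLATS_PER_LITER
    mg_active = TARGET_MG_PER_L * liters
    product_ml = mg_active / product_g_per_l
    return liters, mg_active, product_ml

# Example: 25 flats and an invented formulation containing 200 g/L.
liters, mg_active, product_ml = spray_plan(25, 200.0)
print(f"{liters} L spray, {mg_active} mg active, {product_ml * 1000:.1f} uL product")
```

Because the volumes of concentrate are small, preparing an intermediate dilution may make pipetting more accurate.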
Table 2
Genomic tools developed for Thellungiella spp.

Tool | Species | Tissue of origin/stress treatment | Notes | Database accession nos. | Reference
EST collection | T. salsuginea | Seedlings/salt | About 1,700 ESTs sequenced | GenBank, see paper for Acc. Nos. | [52]
EST collection | T. salsuginea Yukon | Adult plants, aboveground tissue/chilling, freezing, salt acclimation, salt shock, drought stress | 6,578 ESTs were sequenced from cDNA libraries obtained from different treatments | GenBank, Acc. Nos. DN772677–DN779205 | [31]
cDNA library | T. salsuginea Shandong | Several tissues/chilling, freezing, salt, ABA | 20,000 full-length enriched Thellungiella cDNAs (RTFL) were generated | DDBJ, Acc. Nos. BY800476–BY835646; clones available at: http://www.brc.riken.go.jp/lab/epd/Eng/ | [53]
EST collection | T. salsuginea Shandong | Whole plants/salt | 946 EST sequences generated | GenBank, Acc. Nos. EC598928–EC599965 | [54]
Microarray chip | T. salsuginea Yukon | N.A.a | ESTs spotted on the chip. Specifically developed for Thellungiella, it can analyze “novel” genes | N.A.a | [32]
BIBAC library | T. salsuginea | Partially digested T. salsuginea genomic DNA | BIBAC (binary bacterial artificial chromosome) library expected to cover 4× genome of T. salsuginea was generated | N.A.a | [55]
a Not applicable
3.4 Tools for Comparative Genomics Analyses of Arabidopsis Relatives: An Example with T. parvula
In addition to the genome sequences [17, 34], several genomic
tools have been developed for T. parvula and T. salsuginea, including EST and cDNA libraries, BIBAC libraries, and microarray chips
([11], Table 2). Below, we describe methods to compare, align,
and assemble genomic contigs and scaffolds of a newly sequenced
crucifer species.
The chromosome structures of plant species within the
Brassicaceae family have been studied with comparative chromosome painting (CCP) techniques using pools of A. thaliana BACs
as probes [56, 57]. The ancestral Brassicaceae genome was inferred
to contain 24 ancestral karyotype (AK) blocks, named A to X,
which constitute eight chromosomes [56]. Genomes of most crucifer species consist of combinations of these AK blocks in different
numbers of chromosomes [57]. For example, crucifers in the
Lineage II with 2n = 14 karyotypes, including Thellungiella,
evolved from eight ancestral Brassicaceae chromosomes to genomes
with seven chromosomes after multiple translocation and inversion
events [58]. As an example, here we describe current tools to compare the genomes of a newly sequenced ARMS (T. parvula, Tp) to
the genome of the model plant, A. thaliana (At). Comparison of
the larger Tp contigs with the AK blocks in the At genome helped
the assembly of the seven Tp chromosome pseudomolecules [34].
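The logic of placing contigs on chromosomes through their AK blocks can be sketched as a simple lookup. In the Python example below, the block composition of Tp chromosome 7 (S, T, and U) and the contig names come from this chapter; the chapter reports contigs c32, c6, c14, and c17 jointly for blocks T and U, so listing both blocks for each of them is an assumption, and the contig "cX" and all function names are hypothetical.

```python
# AK-block composition of Tp chromosome 7 as stated in this chapter [57, 58].
# Other chromosomes would be added from the karyotype model.
CHROMOSOME_BLOCKS = {"Tp_chr7": {"S", "T", "U"}}

# AK blocks each contig is colinear with. The chapter does not say which of
# c32, c6, c14, and c17 matches T and which matches U, so both are listed.
# "cX" is a made-up contig used to show an unplaced case.
CONTIG_BLOCKS = {
    "c16": {"S"}, "c19": {"S"},
    "c32": {"T", "U"}, "c6": {"T", "U"},
    "c14": {"T", "U"}, "c17": {"T", "U"},
    "cX": {"A"},
}

def place_contigs(contig_blocks, chromosome_blocks):
    """Map each contig to the single chromosome containing all its AK blocks."""
    placements = {}
    for contig, blocks in contig_blocks.items():
        hits = [name for name, chrom_blocks in chromosome_blocks.items()
                if blocks <= chrom_blocks]
        # Leave the contig unplaced when no chromosome, or more than one, fits.
        placements[contig] = hits[0] if len(hits) == 1 else None
    return placements

placements = place_contigs(CONTIG_BLOCKS, CHROMOSOME_BLOCKS)
print(placements)
```

This only groups contigs by chromosome; their order and orientation within the pseudomolecule still have to come from the alignment coordinates.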
3.4.1 Identification and Visualization of Global Synteny Using Nucmer and Circos
Alignment of Genomes Using Nucmer
1. The genome contigs and scaffolds, as FASTA files, were aligned
with the At genome sequence using Nucmer. An example command line with parameters is
“$ nucmer --maxmatch --maxgap 1000 --prefix <project_name> <input_genome_sequence_file_name.fasta> <At_genome_file_name.fasta>”
2. Run delta-filter. An example command line with parameters is
“$ delta-filter -r -q -l 500 project_name.delta > project_name.filter”
3. Run show-coords. An example command line with parameters is
“$ show-coords -c -d -l -r -T project_name.filter > project_name.txt”
4. The resulting file contains the coordinates of genomic
regions that show sequence similarity with genomic regions
from At, in tab-delimited text format.
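The tab-delimited coordinates file can be converted into link lines for a plotting tool with a short script. The Python sketch below is a minimal example under stated assumptions: it relies only on the first four columns being the start and end coordinates in the two sequences and on the last two columns being the two sequence IDs, it writes one link per line as "ID start end ID start end", and the mock input, its intermediate columns, and the function name are invented for illustration. The IDs must match the names used in the karyotype definition of the visualization tool.

```python
import csv
import io

def coords_to_links(handle, min_len=0):
    """Yield link lines from tab-delimited show-coords output."""
    for row in csv.reader(handle, delimiter="\t"):
        # Header lines are skipped: data rows begin with four integers.
        if len(row) < 6 or not all(field.strip().isdigit() for field in row[:4]):
            continue
        s1, e1, s2, e2 = (int(field) for field in row[:4])
        if abs(e1 - s1) + 1 < min_len:
            continue
        # Assumption: the last two columns hold the two sequence IDs.
        first_id, second_id = row[-2].strip(), row[-1].strip()
        yield f"{first_id} {s1} {e1} {second_id} {s2} {e2}"

# Mock input for illustration; coordinates and IDs are invented.
mock = "\n".join([
    "/path/to/Tp.fasta /path/to/At.fasta",
    "NUCMER",
    "",
    "\t".join(["[S1]", "[E1]", "[S2]", "[E2]", "[LEN 1]", "[LEN 2]", "[% IDY]", "[TAGS]"]),
    "\t".join(["100", "600", "2000", "1500", "501", "501", "95.00", "c16", "Chr4"]),
    "\t".join(["700", "750", "3000", "3050", "51", "51", "90.00", "c16", "Chr4"]),
])

links = list(coords_to_links(io.StringIO(mock), min_len=500))
print("\n".join(links))
```

Filtering on a minimum alignment length, as in the example call, keeps short repetitive matches from cluttering the plot.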
Visualization of Nucmer
Results with Circos
1. Circos is a visualization tool for comparative genomics that
runs with a configuration file. The alignment results obtained
using Nucmer can be fed into Circos as <links> in the configuration file. An example of visualization comparing Tp chromosome 7 with five At chromosomes is shown in Fig. 3a.
2. When Arabidopsis-relative crucifer genome sequences are compared with those of At, extensive colinearity is usually found. If
a karyotype model with AK blocks is available [9] for the
Arabidopsis-relative species, the contig/scaffolds can be
mapped to the model by comparing them with the At genome.
Using the Nucmer alignment result obtained as described earlier in this section (Alignment of Genomes Using Nucmer), identify the At AK blocks with which the contigs/scaffolds show colinearity.
The coordinates of AK blocks in the At genome are available in
Schranz et al. [56]. For example, Tp contigs c16 and c19 are
colinear with AK block S in the At genome (Fig. 3a). Similarly,
c32, c6, c14, and c17 showed synteny with AK blocks T and U
in At. Since chromosome 7 of Thellungiella species consists of
AK blocks S, T, and U [57, 58], the contigs c16, c19, c32, c6,
c14, and c17 can be mapped to Tp chromosome 7 (Fig. 3a).
3. If genome annotation is available, the distribution of coding
sequences (CDSs) and transposable elements (TEs) can be plotted
in the Circos diagram as a histogram. A correct chromosome
assembly will reveal a TE-rich centromeric region, as indicated
with a red arrow in Fig. 3a.
Fig. 3 Tools for identification and visualization of synteny between the genome sequences of Thellungiella
parvula (Tp) and Arabidopsis thaliana (At). (a) Synteny between Tp chromosome 7 and the At genome. Tp genome
contigs that are colinear to the ancient karyotype (AK) blocks S, T, and U are assembled to Tp chromosome 7
according to the model developed from comparative chromosome painting (CCP) results [56–58]. Syntenic
regions were identified using Nucmer [37] and visualized with Circos [38]. The outer histogram shows the
distribution of genes, retrotransposons, DNA transposons, and unidentified repetitive sequences in blue,
orange, yellow, and green, respectively. The red arrow indicates the centromeric region of Tp chromosome 7.
(b) Synteny blocks between At chromosome 4 and Tp chromosome 7 were identified by MAUVE [39]. Genomic
regions with sequence similarity are indicated with the same color in the two chromosomes. Red
arrows identify inversions. (c) Identification of genome-wide synteny between At and Tp using SynMap
(http://genomevolution.org/r/4gyq) included in the CoGe tools [59, 60]. The protein-coding sequences (CDSs)
of At and Tp were compared and dots were plotted where the coding sequences from the two species show
similarity. Colors of the dots indicate the synonymous nucleotide substitution ratio (Ks) as indexed in (d)
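The assignment of contigs to AK blocks described above can also be scripted once the alignment coordinates and the block boundaries are at hand. The Python sketch below is a minimal illustration under stated assumptions: the block coordinates are placeholders and not the published values, the contig names are hypothetical, and each contig is simply assigned to the block that covers most of its aligned At bases. The last helper writes one alignment as a six-column Circos link line (two sequence names, each followed by start and end); the names used must match those in the Circos karyotype file.

```python
from collections import defaultdict

# Placeholder block coordinates on At chromosomes, for illustration only.
# Real boundaries must be taken from the published AK block table [56].
TOY_AK_BLOCKS = {
    "S": ("Chr5", 1, 1_000_000),
    "T": ("Chr4", 1, 500_000),
    "U": ("Chr4", 500_001, 2_000_000),
}


def overlap_bp(start_a, end_a, start_b, end_b):
    """Number of bases shared by two closed intervals (either orientation)."""
    lo_a, hi_a = sorted((start_a, end_a))
    lo_b, hi_b = sorted((start_b, end_b))
    return max(0, min(hi_a, hi_b) - max(lo_a, lo_b) + 1)


def assign_blocks(alignments, blocks=TOY_AK_BLOCKS):
    """Map each contig to the AK block covering most of its aligned At bases.

    alignments: iterable of (contig, at_chrom, at_start, at_end).
    """
    totals = defaultdict(lambda: defaultdict(int))
    for contig, chrom, start, end in alignments:
        for name, (block_chrom, block_start, block_end) in blocks.items():
            if chrom == block_chrom:
                shared = overlap_bp(start, end, block_start, block_end)
                if shared:
                    totals[contig][name] += shared
    return {contig: max(per_block, key=per_block.get)
            for contig, per_block in totals.items()}


def circos_link(contig, c_start, c_end, chrom, start, end):
    """Format one alignment as a six-column Circos link line."""
    return f"{contig} {c_start} {c_end} {chrom} {start} {end}"
```

Contigs whose alignments are split between several blocks deserve manual inspection, since a simple majority rule hides possible misassemblies or block boundaries inside a contig.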
3.4.2 Alignment of Genomes Using MAUVE
MAUVE [39] is a sequence alignment tool suitable for identifying
synteny as well as chromosome scale inversions.
1. MAUVE runs with a graphical user interface and takes two or
more genome sequences in FASTA format as input. Figure 3b
displays an example of MAUVE results comparing the At
chromosome 4 and Tp chromosome 7, as assembled above
using Nucmer.
2. When mapping the genome contigs/scaffolds to the chromosome model with AK blocks, MAUVE is a particularly useful
support tool for deciding the orientation of each contig/scaffold.
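MAUVE is used interactively, so the orientation of a contig is normally judged by eye. As a rough scripted cross-check, the Nucmer coordinates obtained earlier can be used instead. The Python sketch below is not a MAUVE feature and is only an assumption-laden illustration: it assumes that reverse-strand alignments are reported with descending coordinates on one of the two sequences, and it takes the strand with the larger total aligned length as the orientation of the contig. The function name is hypothetical.

```python
def contig_orientation(alignments):
    """Estimate contig orientation from coordinate pairs.

    alignments: iterable of (ref_start, ref_end, qry_start, qry_end)
    for a single contig. Returns "+", "-", or "?" when undecided.
    """
    forward = reverse = 0
    for ref_start, ref_end, qry_start, qry_end in alignments:
        length = abs(ref_end - ref_start) + 1
        # Both intervals ascending (or both descending) means same strand.
        same_strand = (ref_end >= ref_start) == (qry_end >= qry_start)
        if same_strand:
            forward += length
        else:
            reverse += length
    if forward == reverse:
        return "?"
    return "+" if forward > reverse else "-"
```

A contig that returns "?" or shows nearly equal forward and reverse totals may contain an inversion and is best examined in MAUVE.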
3.4.3 Comparison of Tp and At Genomes Using CoGe
Genome-Wide Comparison Using SynMap
1. Version 2 of the Tp genome and its annotation are available for comparative studies in the CoGe database (http://genomevolution.org/CoGe/) [59, 60]. Using SynMap, a tool included in CoGe, the
Tp genome and coding sequences can be compared to any
genome deposited in CoGe. A comparison of the entire set of Tp CDSs
with those of At is available at http://genomevolution.org/r/4gyq.
The comparison is visualized as a dot plot (Fig. 3c).
2. When CDS annotation is available for both species being
compared, SynMap calculates the synonymous substitution rate
(Ks) for all homologous CDS pairs between the two species
and generates a histogram of Ks values (Fig. 3d).
The dot plot will be colored according to the Ks value of the
CDS pairs. In Fig. 3c, the lines with yellow dots consist of CDS
pairs with Ks values around or less than 0.3, while red dots
indicate Ks values around or larger than 0.5. CDS pairs with Ks
values higher than 0.8 are shown with blue dots (Fig. 3d).
SynMap generates links to the sequences of all homologous
CDS pairs, as well as the list of tandem duplicated CDSs.
3. Part of the SynMap dot plot can be magnified by clicking and dragging
with the mouse cursor. The magnified region is shown in a separate window. Clicking any dot in this magnified window will
open another tool, GEvo, for comparison at higher resolution.
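When Ks values are exported for further analysis, the color classes described above can be approximated with a few cut-offs. The Python sketch below is only an approximation under stated assumptions: the thresholds (0.3, 0.5, and 0.8) are taken from the description of Fig. 3c, d, values between 0.3 and 0.5 are labeled "intermediate" because no color is given for them, and SynMap itself may use a continuous color scale. The function names are hypothetical.

```python
from collections import Counter


def ks_color(ks):
    """Return an approximate color class for one Ks value."""
    if ks > 0.8:
        return "blue"
    if ks >= 0.5:
        return "red"
    if ks <= 0.3:
        return "yellow"
    return "intermediate"


def ks_summary(ks_values):
    """Count CDS pairs per color class, as a coarse stand-in for the histogram."""
    return Counter(ks_color(ks) for ks in ks_values)
```

For example, `ks_summary([0.1, 0.2, 0.55, 0.9])` counts two pairs in the yellow class and one each in the red and blue classes.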
Localized Comparison of Genomic Features Using GEvo
1. There are two ways to start GEvo. First, clicking a
dot in the magnified SynMap window opens a GEvo
analysis around the selected dot. Second, the name of the
CDS can be entered directly in the GEvo window
(http://genomevolution.org/CoGe/GEvo.pl). Entering the CDS
name in the “Name:” field will automatically bring up the
genomic sequences around the CDS. For example, entering
“AT1G18710” as the name for Sequence 1 and “Tp1g16690”
for Sequence 2 and pressing the “Run GEvo Analysis!” button will
bring up Fig. 4. More than two sequences can be compared by
clicking “Add sequence.”
2. The pink ribbons in the example presented in Fig. 4 indicate
genomic regions showing homology, or high-scoring segment
pairs (HSPs). Clicking on a ribbon toggles on the link between
the homologous regions, and the alignment
between the two regions appears in a separate window.
The link can be toggled off by clicking again on one of the ribbons connected by it. The gene models are shown as cylinders.
Clicking a cylinder brings up a separate window containing the annotation and information for the CDS, as well as a
link to the CDS sequence.
3. The example in Fig. 4 shows a local tandem duplication event
specific to T. parvula, where putative Tp homologs of
AtMYB47 were amplified to three copies. GEvo is suitable for
browsing and analyzing local tandem duplications in detail.
Fig. 4 Comparison of homologous genomic regions of At and Tp using GEvo. The T. parvula genome sequence and
annotation are available for comparative genomic studies in the CoGe database (http://genomevolution.org/CoGe/
index.pl). Shown is an example snapshot of GEvo results (http://genomevolution.org/CoGe/GEvo.pl), part of the
CoGe toolbox, comparing the genomic regions near AtMYB47 (At1g18710) and the three putative TpMYB47
homologs (Tp1g16690, Tp1g16700, and Tp1g16710). Pink ribbons indicate blocks with sequence similarity
between the two species. Gene models are shown as cylinders with exons, introns, and noncoding conserved
sequences in green, gray, and blue colors
4 Notes
1. If using pesticides, it is imperative to alternate product types to
reduce the occurrence of resistance in the pest population.
Furthermore, not all products are licensed for use in controlled environments, and regulations and product availability
differ between countries. Restricted entry intervals (REI),
i.e., the period of time after plants and/or soil are treated with a
pesticide during which restrictions on entry are in effect to protect
persons from potential exposure to hazardous levels of pesticide
residues, should be observed and protective measures should be adopted.
2. The EMS mutant162 of T. halophila does not require vernalization. This mutant flowers very early and has a smaller size
allowing it to be manipulated much like Arabidopsis (Bressan,
personal communication).
3. Waiting for dehydration of the whole plant may lead to nonuniform silique dehiscence and consequent seed loss, especially of transformed seeds, if the maturation of the flowers that
were sprayed with Agrobacterium is not followed closely.
4. At this stage, watering should be done carefully at the base of
the plants or by bottom infiltration. Above-canopy watering
will cause seed loss.
5. In general, with the possible exception of powdery mildew,
Thellungiella spp. seem to be less prone to pest and disease
infestations than Arabidopsis. To date, we have not observed
impatiens necrotic spot virus (INSV) on Thellungiella species.
6. Root aphid infestation is a good example of a condition that
can affect transformation efficiency without noticeably affecting the appearance of the plants.
7. It is very important to avoid overheating conditions and provide adequate shading.
8. The diluted herbicide solution can be kept for several days, provided
it is stored in the dark, since light promotes degradation of the herbicide.
9. In case of in vitro isolation of T1 plants, using glufosinate
ammonium (Crescent Chemical Company, Islandia, NY), we
have noticed that whereas the response of A. thaliana is optimal at 5 ppm (5 mg/L), the response of T. parvula appears
more variable, displaying some plants with higher tolerance to
the herbicide. This species seems to respond more slowly to
the action of the herbicide, resulting in a suggested optimal
concentration of 10 mg/L. The response of T. salsuginea to
in vitro herbicide screening is also slower than A. thaliana.
Resistant plants can be hardened by moving them onto regular
medium without herbicide before being transplanted into soil.
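The selection concentrations given in Note 9 (5 and 10 mg/L) translate into stock volumes through the usual dilution relation C1 × V1 = C2 × V2. The short Python sketch below illustrates the arithmetic only; the stock concentration used in the example (10 mg/mL) is a hypothetical value, not one specified in this chapter, and the function name is illustrative.

```python
def stock_volume_ml(target_mg_per_l, final_volume_l, stock_mg_per_ml):
    """Volume of stock (mL) needed to reach a target concentration.

    Derived from C1 * V1 = C2 * V2 with C1 in mg/mL and C2 in mg/L.
    """
    if stock_mg_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    return target_mg_per_l * final_volume_l / stock_mg_per_ml
```

With the assumed 10 mg/mL stock, 1 L of medium at 5 mg/L requires `stock_volume_ml(5, 1, 10)` mL of stock, and the same call with 10 as the first argument gives the volume for the 10 mg/L medium.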
Acknowledgements
Dong-Ha Oh thanks Eric Lyons for great help in setting up
T. parvula sequences in the CoGe database. Dong-Ha Oh was supported by the World Class University Program (R32–10148) at
Gyeongsang National University, Republic of Korea, and the Next-Generation BioGreen 21 Program (SSAC, PJ009495), Rural
Development Administration, Republic of Korea.
References
1. Zhu JK (2002) Salt and drought stress signal
transduction in plants. Annu Rev Plant Biol
53:247–273
2. Pardo JM, Cubero B, Leidi EO, Quintero FJ
(2006) Alkali cation exchangers: roles in cellular
homeostasis and stress tolerance. J Exp Bot
57:1181–1199
3. Fujita Y, Fujita M, Shinozaki K, Yamaguchi-Shinozaki K (2011) ABA-mediated transcriptional regulation in response to osmotic stress
in plants. J Plant Res 124:509–525
4. Sanders D (2000) Plant biology: the salty tale
of Arabidopsis. Curr Biol 10:486–488
5. Bohnert HJ, Cushman JC (2000) The ice plant
cometh: lessons in abiotic stress tolerance.
J Plant Growth Regul 19:334–346
6. Flowers TJ, Colmer TD (2008) Salinity tolerance in halophytes. New Phytol 179:945–963
7. Munns R, Tester M (2008) Mechanisms of salinity tolerance. Annu Rev Plant Biol 59:651–681
8. Cushman JC, Meyer G, Michalowski CB,
Schmitt JM, Bohnert HJ (1989) Salt stress
leads to differential expression of two isogenes
of phosphoenolpyruvate carboxylase during
Crassulacean acid metabolism induction in the
common ice plant. Plant Cell 1:715–725
9. Flowers TJ, Yeo A (1995) Breeding for salinity
resistance in crop plants: where next? Aust J
Plant Physiol 22:875–884
10. Kant S, Kant P, Raveh E, Barak S (2006)
Evidence that differential gene expression
between the halophyte, Thellungiella halophila,
and Arabidopsis thaliana is responsible for
higher levels of the compatible osmolyte proline and tight control of Na+ uptake in
T. halophila. Plant Cell Environ 29:1220–1234
11. Amtmann A (2009) Learning from evolution:
Thellungiella generates new knowledge on
essential and critical components of abiotic
stress tolerance in plants. Mol Plant 2:3–12
12. Al-Shehbaz IA, O’Kane SL (1995) Placement
of Arabidopsis parvula in Thellungiella
(Brassicaceae). Novon 5:309–310
13. Al-Shehbaz IA, O’Kane SL, Price RA (1999)
Generic placement of species excluded from
Arabidopsis (Brassicaceae). Novon 9:296–307
14. Zhu JK (2001) Plant salt tolerance. Trends
Plant Sci 6:66–71
15. Bressan RA, Zhang C, Zhang H, Hasegawa
PM, Bohnert HJ, Zhu JK (2001) Learning
from the Arabidopsis experience: the next gene
search paradigm. Plant Physiol 127:1354–1360
16. Orsini F, Paino D’Urzo M, Inan G et al (2010) A comparative study of salt tolerance parameters in 11 wild relatives of Arabidopsis thaliana. J Exp Bot 61:3787–3798
17. Wu HJ, Zhang Z, Wang J-Y, Oh DH, Dassanayake M, Liu B, Huang Q, Sun HX, Xia R, Wu Y, Wang Y, Yang Z, Liu Y, Zhang W, Zhang H, Chu J, Yan C, Fang S, Zhang J, Wang Y, Zhang F, Wang G, Lee SY, Cheeseman JM, Yang B, Li B, Min J, Yang L, Wang J, Chu C, Chen SY, Bohnert HJ, Zhu JK, Wang XJ, Xie Q (2012) Insights into salt tolerance from the genome of Thellungiella salsuginea. Proc Natl Acad Sci U S A 109:12219–12224
18. Amtmann A, Bohnert HJ, Bressan RA (2005) Abiotic stress and plant genome evolution. Search for new models. Plant Physiol 138:127–130
19. Inan G, Zhang Q, Pinghua L et al (2004) Salt cress: a halophyte and cryophyte Arabidopsis relative model system and its applicability to molecular genetic analyses of growth and development of extremophiles. Plant Physiol 135:1718–1737
20. Teusink RS, Rahman M, Bressan RA, Jenks MA (2002) Cuticular waxes on Arabidopsis thaliana close relatives Thellungiella halophila and Thellungiella parvula. Int J Plant Sci 163:309–315
21. Gong Q, Li P, Ma S, Rupassara I, Bohnert HJ (2005) Salinity stress adaptation competence in the extremophile Thellungiella halophila in comparison with its relative Arabidopsis thaliana. Plant J 44:826–839
22. Volkov V, Amtmann A (2006) Thellungiella halophila, a salt-tolerant relative of Arabidopsis thaliana, has specific root ion-channel features supporting K+/Na+ homeostasis under salinity stress. Plant J 48:342–353
23. Wang B, Davenport RJ, Volkov V, Amtmann A (2006) Low unidirectional sodium influx into root cells restricts net sodium accumulation in Thellungiella halophila, a salt-tolerant relative of Arabidopsis thaliana. J Exp Bot 57:1161–1170
24. Oh DH, Gong Q, Ulanov A, Zhang Q, Li Y, Ma W, Yun DJ, Bressan RA, Bohnert HJ (2007) Sodium stress in the halophyte Thellungiella halophila and transcriptional changes in a thsos1-RNA interference line. J Integr Plant Biol 49:1484–1496
25. Oh DH, Leidi E, Zhang Q et al (2009) Loss of halophytism by interference with SOS1 expression. Plant Physiol 151:210–222
26. Vera-Estrella R, Barkla BJ, Garcia-Ramirez L, Pantoja O (2005) Salt stress in Thellungiella halophila activates Na+ transport mechanisms required for salinity tolerance. Plant Physiol 139:1507–1517
27. Clough SJ, Bent AF (1998) Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant J 16:735–743
28. Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408:796–815
29. Volkov V, Wang B, Dominy PJ, Fricke W, Amtmann A (2004) Thellungiella halophila, a salt-tolerant relative of Arabidopsis thaliana, possesses effective mechanisms to discriminate between potassium and sodium. Plant Cell Environ 27:1–14
30. Taji T, Seki M, Satou M, Sakurai T, Kobayashi M, Ishiyama K, Narusaka Y, Narusaka M, Zhu JK, Shinozaki K (2004) Comparative genomics in salt tolerance between Arabidopsis and Arabidopsis-related halophyte salt cress using Arabidopsis microarray. Plant Physiol 135:1697–1709
31. Wong CE, Li Y, Whitty BR, Diaz-Camino C, Akhter SR, Brandle JE, Golding GB, Weretilnyk EA, Moffatt BA, Griffith M (2005) Expressed sequence tags from the Yukon ecotype of Thellungiella reveal that gene expression in response to cold, drought and salinity shows little overlap. Plant Mol Biol 58:561–574
32. Wong CE, Li Y, Labbe A, Guevara D et al (2006) Transcriptional profiling implicates novel interactions between abiotic stress and hormonal responses in Thellungiella, a close relative of Arabidopsis. Plant Physiol 140:1437–1450
33. Hu TT, Pattyn P, Bakker EG, Cao J et al (2011) The Arabidopsis lyrata genome sequence and the basis of rapid genome size change. Nat Genet 43:476–481
34. Dassanayake M, Oh DH, Haas JS et al (2011) The genome of the extremophile crucifer Thellungiella parvula. Nat Genet 43:913–918
35. Wang X, Wang H, Wang J et al (2011) The genome of the mesopolyploid crop species Brassica rapa. Nat Genet 43:1035–1040
36. Oh DH, Dassanayake M, Haas JS et al (2010) Genome structures and halophyte-specific gene expression of the extremophile Thellungiella parvula in comparison with Thellungiella salsuginea (Thellungiella halophila) and Arabidopsis. Plant Physiol 154:1040–1052
37. Kurtz S, Philippy A, Delcher AL et al (2004) Versatile and open software for comparing large genomes. Genome Biol 5:R12
38. Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, Marra MA (2009) Circos: an information aesthetic for comparative genomics. Genome Res 19:1639–1645
39. Darling AC, Mau B, Blattner FR, Perna NT
(2004) Mauve: multiple alignment of conserved genomic sequence with rearrangements.
Genome Res 14:1394–1403
40. Aksoy A, Hale WHG, Dixon JM (1999)
Capsella bursa-pastoris L. Medic. as a biomonitor of heavy metals. Sci Total Environ 226:
177–186
41. Madejon P, Murillo JM, Maranon T, Valdes B,
Rossini Oliva S (2005) Thallium accumulation
in floral structures of Hirschfeldia incana (L.)
Lagreze-Fossat (Brassicaceae). Bull Environ
Contam Toxicol 74:1058–1064
42. Gisbert C, Clemente R, Navarro-Avino J,
Baixauli C, Giner A, Serrano R, Walker DJ,
Bernal MP (2006) Tolerance and accumulation
of heavy metals by Brassicaceae species grown
in contaminated soils from Mediterranean
regions of Spain. Environ Exp Bot 56:19–27
43. Jimenez-Ambriz G, Petit C, Bourrie I, Dubois
S, Olivieri I, Ronce O (2007) Life history variation in the heavy metal tolerant plant Thlaspi
caerulescens growing in a network of contaminated and noncontaminated sites in southern
France: role of gene flow, selection and phenotypic plasticity. New Phytol 173:199–215
44. Bailey CD, Koch MA, Mayer M, Mummenhoff
K, O’Kane SL Jr, Warwick SI, Windham MD,
Al-Shehbaz IA (2006) Toward a global phylogeny of the Brassicaceae. Mol Biol Evol
23:2142–2160
45. Popay AI, Roberts EH (1978) Factors involved
in the dormancy and germination of Capsella
bursa-pastoris (L.) Medik. and Senecio vulgaris
L. J Ecol 58:103–122
46. Pedras MSC, Montaut S, Zaharia IL, Gai Y,
Ward DE (2003) Transformation of the host-selective toxin destruxin B by wild crucifers:
probing a detoxification pathway.
Phytochemistry 64:957–963
47. Johnston SJ, Pepper AE, Hall AE, Jeffrey Chen
Z, Hodnett G, Drabek J, Lopez R, James Price
H (2005) Evolution of genome size in
Brassicaceae. Ann Bot 95:229–235
48. Dittmer HJ (1949) Root hair variations in
plant species. Am J Bot 36:152–155
49. Muller K, Tintelnot S, Leubner-Metzger G
(2006) Endosperm-limited Brassicaceae seed
germination: abscisic acid inhibits embryo-induced endosperm weakening of Lepidium
sativum (cress) and endosperm rupture of cress
and Arabidopsis thaliana. Plant Cell Physiol
47:864–877
50. Santin-Montanya I, Alonso-Prados JL,
Villarroya M, García-Baudin JM (2006)
Bioassay for determining sensitivity to sulfosulfuron on seven plant species. J Environ Sci
Health B 41:781–793
51. Weigel D, Ahn JH, Blazquez MA et al (2000)
Activation tagging in Arabidopsis. Plant Physiol
122:1003–1013
52. Wang Z, Li P, Fredricksen M, Gong Z et al
(2004) Expressed sequence tags from
Thellungiella halophila, a new model to study
plant salt-tolerance. Plant Sci 166:609–616
53. Taji T, Sakurai T, Mochida K et al (2008)
Large-scale collection and annotation of full-length enriched cDNAs from a model halophyte, Thellungiella halophila. BMC Plant Biol
8:115
54. Zhang Y, Lai J, Sun S, Li Y, Liu Y, Liang L,
Chen M, Xie Q (2008) Comparison analysis of
transcripts from the halophyte Thellungiella
halophila. J Integr Plant Biol 50:1327–1335
55. Wang W, Wu Y, Li Y et al (2010) A large insert
Thellungiella halophila BIBAC library for
genomics and identification of stress tolerance
genes. Plant Mol Biol 72:91–99
56. Schranz ME, Lysak MA, Mitchell-Olds T
(2006) The ABC’s of comparative genomics in
the Brassicaceae: building blocks of crucifer
genomes. Trends Plant Sci 11:535–542
57. Lysak MA, Koch MA (2011) Phylogeny,
genome and karyotype evolution of crucifers
(Brassicaceae). In: Schmidt R, Bancroft I (eds)
Genetics and genomics of the Brassicaceae.
Springer, New York
58. Mandáková T, Lysak MA (2008) Chromosomal
phylogeny and karyotype evolution in x = 7
crucifer species (Brassicaceae). Plant Cell 20:
2559–2570
59. Lyons E, Pedersen B, Kane J et al (2008)
Finding and comparing syntenic regions
among Arabidopsis and the outgroups papaya,
poplar, and grape: CoGe with rosids. Plant
Physiol 148:1772–1781
60. Lyons E, Freeling M (2008) How to usefully
compare homologous plant genes and chromosomes as DNA sequences. Plant J 53:
661–673
61. Ansell SW, Stenøien HK, Grundmann M,
Schneider H, Hemp A, Bauer N, Russell SJ,
Vogel JC (2010) Population structure and historical biogeography of European Arabidopsis
lyrata. Heredity 105(6):543–553
62. Al-Shehbaz IA, O’Kane SL (2002) Taxonomy
and phylogeny of Arabidopsis (Brassicaceae).
In: Somerville CR, Meyerowitz EM (eds) The
Arabidopsis book. American Society of Plant
Biologist, Rockville, MD, pp 1–22
63. Mitchell-Olds T (2006) Genetic mechanisms
and evolutionary significance of natural variation in Arabidopsis. Nature 441:947–952
64. Ratcliffe DA (1994) Arabis petraea. In: Stewart
A, Pearman DA, Preston CD (eds) Scarce
plants of the British Isles. JNCC, Peterborough,
p 51
65. Sandring S, Argen J (2009) Pollinator-mediated selection on floral display and flowering time in the perennial herb Arabidopsis
lyrata. Evolution 63:1292–1300
66. Thrall PH, Young AG, Burdon JJ (2000)
An analysis of mating structure in populations
of the annual sea rocket, Cakile maritima
(Brassicaceae). Aust J Bot 48:731–738
67. Barbour MG (1972) Seedling establishment of
Cakile maritima at Bodega Head, California.
Bull Torrey Bot Club 99:11–16
68. Maun MA, Lapierre J (1986) Effects of burial by
sand on seed germination and seedling emergence
of four dune species. Am J Bot 73:450–455
69. Barbour MG (1970) Germination and early
growth of the strand plant Cakile maritima.
Bull Torrey Bot Club 97:13–22
Chapter 3
Growing Arabidopsis In Vitro: Cell Suspensions,
In Vitro Culture, and Regeneration
Bronwyn J. Barkla, Rosario Vera-Estrella, and Omar Pantoja
Abstract
An understanding of basic methods in Arabidopsis tissue culture is beneficial for any laboratory working
on this model plant. Tissue culture refers to the aseptic growth of cells, organs, or plants in a controlled
environment, in which physical, nutrient, and hormonal conditions can all be easily manipulated and
monitored. The methodology facilitates the production of a large number of plants that are genetically
identical over a relatively short growth period. Techniques, including callus production, cell suspension
cultures, and plant regeneration, are all indispensable tools for the study of cellular biochemical and molecular processes. Plant regeneration is a key technology for successful stable plant transformation, while
cell suspension cultures can be exploited for metabolite profiling and mining. In this chapter we report
methods for the successful and highly efficient in vitro regeneration of plants and production of stable cell
suspension lines from leaf explants of both Arabidopsis thaliana and Arabidopsis halleri.
Key words Callus, Cell suspensions, Plant regeneration, Tissue culture, Arabidopsis, Organ regeneration
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_3, © Springer Science+Business Media New York 2014
1 Introduction
Plant tissue culture is an indispensable tool for the study of cellular
biochemical and molecular processes and a key technology for successful stable plant transformation. In vitro culture, from the Latin
“in glass,” was named for the glass vessels in which the cultures were
grown; the term was probably introduced by embryologists at the end of
the nineteenth century. The earliest attempts at
tissue culture of plant cells were made in the first decade of the 1900s
by the Austrian botanist Haberlandt, who published his work in
German (translated into English in ref. 1). However, it was not until
30 years later, following the discovery of plant growth regulators,
that the inclusion of auxins in the technique made it possible
to cultivate plant tissue in an aseptic environment
for an indefinite length of time [2–4]. Further advancements in
nutrient and micronutrient content, plant growth regulator (PGR)
discovery, and manipulation of PGR ratios have all dramatically
improved the efficiency and versatility of the technique, so that
callus, cell suspensions, protoplasts, and organs can now be cultivated and whole plants regenerated. Culturing
techniques provide a tightly controlled closed growth system while
facilitating the manipulation of experimental conditions. Physical,
nutritional, and hormonal states can all be easily regulated in the
closed system, reducing variability and extraneous factors. Plant material generated in this manner forms a homogeneous and genetically identical pool which, through subculturing, can
yield large quantities of experimental material over very short
time frames, while the sterile growth conditions ensure that the material is free from pathogenic microorganisms.
In vitro culture to produce cell suspensions or regenerate plants
begins with the selection of explant material. The explant is a highly
differentiated piece of tissue (e.g., leaf pieces) harvested from
the plant, which is sterilized and placed on an artificial nutrient- and
vitamin-rich, PGR-supplemented growth medium. The wounding
of the tissue and the presence of specific amounts of PGR induce
somatic embryogenesis, and the cells in the media begin to revert
to their meristematic state, dividing rapidly and dedifferentiating to
form a mass of unorganized cells called callus. These dedifferentiated cells can either be maintained indefinitely as callus through
subculturing to prevent nutrient deficiency or, once the cells of the
callus become less packed and more friable, be transferred to liquid
medium where they dissociate into single cells to generate stable cell
suspension cultures.
Undifferentiated callus or suspension cells can be stimulated
by PGR to initiate organogenesis exploiting the ability of all plant
cells, due to their genetic potential, to dedifferentiate and then,
under defined conditions, to redifferentiate to form any plant
organ, a phenomenon known as totipotency [5]. Particularly
important in organogenesis is the ratio of the PGRs cytokinin and
auxin. Typically, a high ratio of cytokinin to auxin results in shoot
differentiation, whereas a high ratio of auxin to cytokinin induces
differentiation of cells to roots [6]. However, there are plants that
provide exceptions to this rule.
The first reports of Arabidopsis callus culture date back to the
1960s [7]. These were followed by articles detailing methods
for cell suspensions as well as organ and plant regeneration [8, 9].
This early methodology was applied nearly a decade later to the
regeneration of whole plants from Agrobacterium tumefaciens-transformed Arabidopsis leaf explants [10]. Early contributions to
the field such as these led to the Arabidopsis molecular revolution
that continues to this date and has established this small, unassuming weed as an unrivalled model plant system [11].
Here we report methods for the efficient and successful regeneration of plants and production of stable cell suspension lines
from dedifferentiated callus produced from leaf explants of both
Arabidopsis thaliana and Arabidopsis halleri plants (see Fig. 1).
Fig. 1 Schematic diagram of the steps and time involved in the production of callus cultures, cell suspension
cultures, and regenerated plants from Arabidopsis
Callus Induction
Step 1: Sterilize Arabidopsis tissue and section to obtain explant material.
Step 2: Place explants on sterile solid culture medium to induce callus (4 weeks in the dark at 25 °C).
Step 3: Select friable callus and replate onto fresh sterile solid medium (2 weeks in the dark at 25 °C).
Cell Suspensions
Step 4: Place friable callus into 100 mL sterile liquid medium in a 500 mL Erlenmeyer flask (2 weeks, shaking at 150 rpm, in the dark at 25 °C).
Step 5: Subculture weekly by transferring a 30 mL aliquot into 90 mL of sterile liquid medium (1 week, shaking at 150 rpm, in the dark at 25 °C).
Shoot Regeneration
Step 6: Friable callus is transferred to sterile shoot regeneration medium (2 weeks in the light at 25 °C).
Root Regeneration
Step 7: Regenerated shoots are transferred to rooting medium (2 weeks in the light at 25 °C).
Step 8: Regenerated plants are transferred to pots containing soil in a greenhouse under natural light at 25 °C.
2 Materials
1. Dry seeds of A. halleri and A. thaliana.
2. 100 mm diameter pots.
3. MetroMix 510 soilless mixture combined with perlite (3:1).
4. 90 % (v/v) ethanol.
5. 70 % (v/v) ethanol.
6. 5 % sodium hypochlorite solution (bleach).
7. Sterilized deionized water.
8. Microcentrifuge.
9. Petri plates (9 cm).
10. Magenta tissue culture boxes.
11. Pair of forceps.
12. Scalpel and sterile scalpel blades.
13. Bunsen burner.
14. Gamborg’s B5 vitamins [12]: 10 mg/L thiamine hydrochloride,
1 mg/L nicotinic acid, 1 mg/L pyridoxine hydrochloride,
100 mg/L myoinositol. Prepare as a 100× stock solution in sterile deionized water, and add to the medium before autoclaving.
The stock can be stored at 4 °C.
15. Basic Murashige and Skoog medium [13] (see Table 1 and
Notes 1–3). Stock solutions (1–6) are prepared in a total
volume of 100 mL.
16. Medium 1 (M1): basic MS medium, supplemented with
Gamborg’s B5 vitamins, 3 % sucrose, 1.5 % bacteriological
agar, 1 mg/L 2,4-D, and 0.05 mg/L benzylaminopurine (BA)
(see Note 4). Adjust to pH 5.7 with 1 N KOH.
17. Shoot regeneration medium, M2: basic MS medium supplemented with Gamborg’s B5 vitamins, 3 % sucrose, 1.5 %
bacteriological agar, 0.5 mg/L 2,4-D, and 0.1 mg/L BA.
18. Root regeneration medium, M3: half-strength MS medium
supplemented with half-strength Gamborg’s B5 vitamins
(5 mg/L thiamine hydrochloride, 0.5 mg/L nicotinic acid,
0.5 mg/L pyridoxine hydrochloride, 50 mg/L myoinositol),
1 % sucrose, 1 % bacteriological agar, and 0.3 mg/L 2,4-D.
Adjust the pH to 5.7 with 1 N KOH.
19. Hoagland and Arnon [14] hydroponic stock solutions: 1 M
(NH4)2HPO4 (add 1 mL/L), 1 M KNO3 (add 6 mL/L), 1 M
Ca(NO3)2⋅4H2O (add 4 mL/L), MgSO4⋅7H2O (2 mL/L),
1 mL/L of micronutrients stock (2.85 g/L H3BO3, 2.44 g/L
MnCl2⋅4H2O, 0.22 g/L ZnSO4⋅7H2O, 0.08 g/L
CuSO4⋅5H2O, 0.02 g/L H2MoO4⋅H2O), 10 mL/L of Fe2+
stock solution (0.4129 g/L Na2EDTA⋅2H2O and 0.278 g/L
FeSO4⋅7H2O) (see Note 5).
Table 1
Murashige and Skoog stock solutions and amounts added to prepare 1 L of basic MS medium
MS medium stock (see Note 3)    Chemical          g/100 mL stock    mL/L medium
Stock solution 1                NH4NO3            16.5              10
                                KNO3              19.0
                                MgSO4⋅7H2O        3.7
                                KH2PO4            1.7
Stock solution 2                CaCl2             4.4               10
Stock solution 3                NaEDTA⋅2H2O       0.37              10
                                FeSO4⋅7H2O        0.28
Stock solution 4                H3BO3             0.062             10
                                MnSO4⋅H2O         0.169
                                ZnSO4             0.086
Stock solution 5                KI                0.083             1.0
                                Na2MoO4⋅7H2O      0.025
Stock solution 6                CuSO4⋅5H2O        0.025             0.1
                                CoCl2⋅6H2O        0.025
20. Suspension culture medium (S1). This medium is the same as
M1 but does not contain agar.
21. Sterile 500 mL Erlenmeyer flasks with cotton plugs.
22. Pipette pump and sterile 10 mL pipettes.
23. 1 mL micropipette and sterile tips.
24. Aluminum foil.
25. Parafilm.
26. Transparent plastic cups.
27. Rotatory shaker set at 150 rpm with temperature control.
28. Growth chamber set at 25 °C.
29. Balance.
30. Microscope and microscope slides.
31. Filtration unit and sterile filters (0.22 μm).
32. Oven.
33. Autoclave (120 °C, 20 min).
34. Laminar flow hood/cabinet.
35. Neutral red stock solution (4 mg/mL in deionized water).
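The stock concentrations and volumes listed for the basic MS medium and the 100× vitamin stock can be checked with simple arithmetic before preparing a batch. The Python sketch below is a hedged illustration: the values are copied from Table 1, the 10 mL/L volume for stock solution 4 is an assumption based on the layout of the table, and all function and variable names are hypothetical. A stock expressed in g/100 mL contains ten times that number in mg/mL, so the final concentration in mg/L is the g/100 mL value × 10 × the mL of stock added per litre.

```python
# (mL of stock per litre of medium, {chemical: g per 100 mL of stock})
MS_STOCKS = {
    1: (10.0, {"NH4NO3": 16.5, "KNO3": 19.0, "MgSO4.7H2O": 3.7, "KH2PO4": 1.7}),
    2: (10.0, {"CaCl2": 4.4}),
    3: (10.0, {"NaEDTA.2H2O": 0.37, "FeSO4.7H2O": 0.28}),
    # Assumption: 10 mL/L for stock 4, inferred from the table layout.
    4: (10.0, {"H3BO3": 0.062, "MnSO4.H2O": 0.169, "ZnSO4": 0.086}),
    5: (1.0, {"KI": 0.083, "Na2MoO4.7H2O": 0.025}),
    6: (0.1, {"CuSO4.5H2O": 0.025, "CoCl2.6H2O": 0.025}),
}


def final_mg_per_l(g_per_100_ml, ml_per_l):
    """Final concentration (mg/L) of a chemical in the prepared medium."""
    stock_mg_per_ml = g_per_100_ml * 10.0  # g/100 mL equals 10 x mg/mL
    return stock_mg_per_ml * ml_per_l


def stock_volumes_ml(batch_l, stocks=MS_STOCKS):
    """Volume (mL) of each stock solution needed for a batch of medium."""
    return {number: ml_per_l * batch_l
            for number, (ml_per_l, _chemicals) in stocks.items()}


def concentrate_volume_ml(fold, batch_l):
    """Volume (mL) of an N-fold concentrate (e.g., 100x vitamins) per batch."""
    return 1000.0 * batch_l / fold
```

For a 0.5 L batch, `stock_volumes_ml(0.5)` gives 5 mL each of stock solutions 1 to 4, and `concentrate_volume_ml(100, 0.5)` gives 5 mL of the 100× vitamin stock.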
3 Methods
3.1 Working in a Sterile Transfer Hood
A sterile working environment is critical to prevent contamination
of the plant tissue and medium by microorganisms (bacteria and
fungi). If contamination does occur these microorganisms will
rapidly colonize the media due to the high sugar and nutrient content and destroy the plant material.
1. Clean all surfaces inside the laminar flow hood/cabinet with
70 % ethanol and allow to air dry. Make sure that no
Bunsen burners are lit during this step.
2. Place only the necessary material and tools inside the hood/
cabinet and remove material when it is no longer needed. All
material should be sterile, and tools should be clean and preferably wiped with 70 % ethanol.
3. Work with your arms extended into the hood/cabinet and
your head and body outside. Try to use only the back third of
the hood, as this is the most sterile area; do not obstruct the
HEPA air filter with material or body parts, as this will affect
the laminar air flow and may result in contamination.
3.2 Method for
Callus Production
1. Plants are grown from seed in 100 mm diameter pots in MetroMix
510 combined with perlite (3:1) for 4 weeks (A. thaliana)
or 10 weeks (A. halleri), at which time leaves are harvested to
establish axenic cell cultures as follows.
2. Leaves are washed by immersion in 90 % (v/v) ethanol in a
sterile Petri plate for 1 min followed by three rinses with sterile
deionized water.
3. Leaves are then surface sterilized by immersion in a 5 % sodium
hypochlorite solution and incubated for 10 min with gentle
mixing.
4. Remove the sodium hypochlorite using a sterile 1 mL pipette
tip and rinse the leaves five times with sterile deionized water.
5. After elimination of the water, the sterile leaves are sectioned
into 2–4 small pieces using a sterile scalpel blade in a sterile
Petri plate.
6. The abaxial sides of the sterile sectioned leaves are placed onto
medium M1, and the Petri plates are sealed with parafilm, covered
with aluminum foil, and incubated at 25 °C.
7. Following a period of 4 weeks friable calli are obtained
( see Fig. 2a, b and Note 6).
3.3 Method for
Regeneration of Plants
1. Friable callus (0.5 g) obtained as indicated in Subheading 3.2
is transferred aseptically to shoot regeneration medium M2, in
9 cm Petri plates, and incubated under a 16-h day length with
a photon flux density of 350 μmol/m²/s at 25 °C for 4 weeks.
Fig. 2 In vitro culture of Arabidopsis. Callus tissue is generated from leaf explants of Arabidopsis halleri (a) and
Arabidopsis thaliana (b). Once friable callus is produced it is transferred aseptically to shoot regeneration
media (c), followed by root regeneration media (d). The final step is the removal of the fully regenerated plants
to the greenhouse (e). Friable callus can also be used to generate stable cell suspension cultures (f)
2. Callus is monitored biweekly for shoot generation. Once
shoots containing 2–3 leaf pairs have developed on the calli
(see Fig. 2c), these are selected and transferred with the callus
to a root regeneration medium M3, in deep transparent sterile
containers such as Magenta boxes (see Note 7), and placed in
a growth room under a 16-h day length with a photon flux
density of 350 μmol/m²/s at 25 °C.
3. Root organogenesis is monitored until an abundant root system
is formed and the elongated shoots are approximately 2 cm tall
(see Fig. 2d). This takes an additional 4 weeks.
4. Plants can be transferred to either dark hydroponic containers
(to avoid algae growth) containing 0.5× Hoagland and Arnon
solution or into soil in 100 mm pots, under natural light and
humidity conditions in a greenhouse maintained at 25 °C (see
Fig. 2e and Notes 8 and 9).
3.4 Method for
Establishment of Cell
Suspension Cultures
1. Transfer aseptically the friable calli (0.5 g) obtained as indicated in Subheading 3.2 into 100 mL of sterile S1 medium in
a 500 mL Erlenmeyer flask and swirl the flask to break up the
callus tissue into small pieces.
2. Place the flask on a shaker with continuous shaking (150 rpm),
in the dark at 25 °C (see Note 10).
3. After a 2-week period, transfer aseptically 10 mL of cells into
a sterile 500 mL Erlenmeyer flask containing 90 mL fresh S1
medium by using a 10 mL sterilized glass pipette connected
to a pipette pump. Place the flask back onto the shaker (see
Note 11).
4. Subculture the cell suspensions every 7 days to maintain the
cells in the log phase of growth (see Fig. 2f and Note 12).
4
Notes
1. It is more economical and allows for easier manipulation of
nutrients if MS medium is made from scratch as described in
Table 1. However, it can also be purchased from several sources
in pre-weighed packets.
2. CaCl2 will precipitate if added to stock 1. Therefore, make it as
an individual stock as indicated in Table 1.
3. All MS media solutions are stored at 4 °C with the exception
of solution 1 which is maintained at room temperature to
prevent solidification at the colder temperature.
4. Stocks of growth regulators are prepared at a 1 mg/mL
concentration in sterile deionized water and added before
autoclaving. These stocks are stored at 4 °C.
5. Hoagland’s solutions are sterilized for 20 min at 120 °C with
the exception of the Ca(NO3)2 and Fe2+ solutions that must
be sterilized by filtration. In addition, the Fe2+ solution must be
heated before filter sterilization to oxidize the ferrous iron.
6. Calli are considered friable when the cells separate easily from
the mass and are no longer dense and compacted.
7. Magenta tissue culture boxes are commonly used, but
economic replacements are glass baby food jars with lids that
can be sterilized.
8. To avoid rapid dehydration and plant stress, newly transferred
plantlets need to acclimatize to lower humidity levels and
should therefore be covered with small transparent plastic cups
to maintain adequate humidity. Small holes can be punched
into the covers to gradually decrease the humidity over a period
of 1 week to that of the atmosphere. This ensures a 100 %
survival rate of the regenerated plants.
9. Arabidopsis plants grown in hydroponics do not require aeration
of the roots.
10. It is important to remove a small aliquot of cells from the culture
to visualize under a microscope and check for cell viability every
2 days. Cell viability can be observed using a drop of neutral red
dye. Cells which are viable will accumulate the dye.
11. The top of the Erlenmeyer flask containing the fresh media
should be flame sterilized after removing the plug to create an
upward hot air draft which directs particles away from the
opening. This is repeated after adding the 10 mL of cells before
replacing the plug.
12. Culturing the cell suspensions for more than 8 days results in
rapid browning and cell death as the availability of nutrients
diminishes with culturing time. It is recommended to perform
a growth curve to determine the cell culture doubling time. It
is important to consider that cell suspension growth varies
among species and cultivars.
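The growth curve recommended in Note 12 can be turned into a doubling time with a simple fit. The following is a minimal sketch, not part of the published protocol: it assumes the culture is in exponential growth over the sampled interval, fits a straight line to the natural log of the measurements by least squares, and reports ln(2) divided by the slope. The measurement values are hypothetical, and any growth proxy (for example packed cell volume or fresh weight per mL) could be used in their place.

```python
import math


def doubling_time(times_days, densities):
    """Estimate doubling time from log-phase measurements.

    Fits ln(density) = a + k * t by ordinary least squares and
    returns ln(2) / k, in the same time unit as times_days.
    """
    if len(times_days) != len(densities) or len(times_days) < 2:
        raise ValueError("need at least two paired measurements")
    logs = [math.log(d) for d in densities]
    n = len(times_days)
    mean_t = sum(times_days) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_days)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_days, logs))
    slope = sxy / sxx  # specific growth rate k
    if slope <= 0:
        raise ValueError("culture is not growing over this interval")
    return math.log(2) / slope


# Hypothetical measurements taken every 2 days (arbitrary units)
example_days = [0, 2, 4, 6]
example_density = [1.0, 2.0, 4.0, 8.0]

td = doubling_time(example_days, example_density)
print(f"Estimated doubling time: {td:.2f} days")
```

Only points from the log phase should be included; lag-phase or stationary-phase points will bias the slope and overestimate the doubling time.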
Acknowledgments
Work in the authors’ lab is funded by DGAPA IN203913 to B.J.B.,
DGAPA 203711 to R.V.-E., and DGAPA 203112 and CONACyT
IN79191 to O.P.
References
1. Krikorian AD, Berquam DL (1969) Plant cell
and tissue cultures: the role of Haberlandt. Bot
Rev 35:59–67
2. Gautheret RJ (1937) Nouvelles recherches sur
la culture du tissu cambial Cr hebd. Seanc Acad
Sci 205:572–574
3. Nobécourt P (1937) Culture en serie de tissus
vegetaux sur milieu artificiel. Cr hebd. Seanc
Acad Sci 20:521–523
4. White PR (1939) Potentially unlimited growth
of excised plant callus in an artificial medium.
Am J Bot 26:59–64
5. Steward FC (1958) Growth and organized
development of cultured cells. II. Organization
in cultures grown from freely suspended cells.
Am J Bot 45:705–708
6. Brown JT, Charlwood BV (1990) Organogenesis
in callus culture. Methods Mol Biol 6:65–70
7. Loewenberg JR (1965) Callus cultures of
Arabidopsis. Arabidopsis Inf Serv 2:34
8. Negrutiu I, Beeftink F, Jacobs M (1975)
Arabidopsis thaliana as a model system in
somatic cell genetics I. Cell and tissue culture.
Plant Sci Lett 5:293–304
9. Negrutiu I, Jacobs M (1975) Arabidopsis
thaliana as a model system in somatic cell
genetics II. Cell suspension culture. Plant Sci
Lett 8:7–15
10. Lloyd AM, Barnason AR, Rogers SG, Byrne MC,
Fraley RT, Horsch RB (1986) Transformation
of Arabidopsis thaliana with Agrobacterium
tumefaciens. Science 234:464–466
11. Leonelli S (2007) Arabidopsis, the botanical
Drosophila: from mouse cress to model organism. Endeavour 31:34–38
12. Gamborg O, Miller R, Ojima K (1968)
Nutrient requirements of suspension cultures
of soybean root cells. Exp Cell Res 50:
151–158
13. Murashige T, Skoog F (1962) A revised
medium for rapid growth and bioassay with
tobacco tissue cultures. Physiol Plant 15:
473–479
14. Hoagland DR, Arnon DI (1938) The water
culture method for growing plants without
soil. Calif Agric Exp Station Circ 347:1–39
Part II
Arabidopsis Resources
Chapter 4
Arabidopsis Database and Stock Resources
Donghui Li, Kate Dreher, Emma Knee, Jelena Brkljacic, Erich Grotewold,
Tanya Z. Berardini, Philippe Lamesch, Margarita Garcia-Hernandez,
Leonore Reiser, and Eva Huala
Abstract
The volume of Arabidopsis information has increased enormously in recent years as a result of the sequencing
of the reference genome and other large-scale functional genomics projects. Much of the data is stored in
public databases, where data are organized, analyzed, and made freely accessible to the research community. These databases are resources that researchers can utilize for making predictions and developing
testable hypotheses. The methods in this chapter describe ways to access and utilize Arabidopsis data and
genomic resources found in databases and stock centers.
Key words Data mining, Database, Genomics, Gene expression, Bioinformatics, Computational biology,
Stocks, Arabidopsis thaliana
1
Introduction
Arabidopsis thaliana serves as the primary model system for many
aspects of plant biology. It was the first plant to have its entire
nuclear genome sequenced [1]. Following the completion of the
Arabidopsis genome sequencing in 2000, the international
Arabidopsis community set an ambitious goal to determine the
function of every Arabidopsis gene by the year 2010 [2]. Numerous
laboratories internationally have taken part in this project
(Multinational Coordinated Arabidopsis thaliana Functional
Genomics Project). Large amounts of data about gene function,
expression, metabolism, and protein and gene interactions have
been generated by these labs. To accomplish the task of organizing
and managing the data, lab consortia and individual labs have
created databases to store the information generated and make it
available to the research community. Community resources such
as genome-wide DNA clones and knockout mutant libraries
(e.g., SALK T-DNA insertion lines) were also created [3]. There
are now extensive tools and resources for storage, curation, and
retrieval of Arabidopsis data and DNA and seed stocks. Scientists
doing research in this “postgenomic” era need to know how
to make use of these resources to find the relevant information and
stocks needed to further their research.
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_4, © Springer Science+Business Media New York 2014
In this chapter, we describe how to use databases to find what
is known about Arabidopsis and to make inferences and predictions
that can later be tested experimentally. We include a summary of
the rationale, a brief description of the database/tool(s), and the
specific steps for querying, retrieving, and interpreting the data.
Methods on how to search and order DNA or seed stocks are also
provided. The methods, along with the corresponding databases
and tools, are outlined in Table 1. This table of contents can be
used to find specific methods of interest within the chapter.
Databases described here represent a small portion of the vast
collection of databases and bioinformatics resources available for
Arabidopsis researchers. In this chapter, we focus on well-developed
resources that provide comprehensive Arabidopsis data (including
stocks) such as TAIR (The Arabidopsis Information Resource) [4–8]
and ABRC (Arabidopsis Biological Resource Center) [9]. There
are many more databases that focus on specific types of Arabidopsis
information such as subcellular localization (SUBA: SUBcellular
location database for Arabidopsis proteins, http://suba.plantenergy.uwa.edu.au/) [10], whereas others focus on specific classes of
genes or disseminate data from a functional genomics project, e.g.,
the Chloroplast 2010 database (http://www.plastid.msu.edu/)
[11]. Many links to these external resources and US National
Science Foundation 2010 Arabidopsis functional genomics project
pages (http://www.arabidopsis.org/portals/masc/projects.jsp)
are provided on the TAIR Portal pages (http://www.arabidopsis.org/portals/).
The proposed International Arabidopsis Informatics Consortium is also working to
integrate all Arabidopsis database resources [12]. In addition to databases that are entirely devoted to Arabidopsis
(“Arabidopsis specific”), there are also numerous multi-species
databases containing Arabidopsis data along with information about
other organisms, such as the National Center for Biotechnology
Information’s (NCBI) GenBank (http://www.ncbi.nlm.nih.gov/genbank/),
the European Bioinformatics Institute’s (EBI) InterPro
(http://www.ebi.ac.uk/interpro/), UniProt (http://www.uniprot.org/), and PlantGDB (http://www.plantgdb.org/), to name
a few. Some of these databases are listed in Table 1. This chapter
does not intend to cover all these databases in depth; instead we
hope it will serve as a good starting point for anyone who wishes to
explore these valuable resources.
Table 1
Selected Arabidopsis databases and stock resources
Database: tool | URL | Protocol/description
Finding comprehensive information about Arabidopsis genes
TAIR: Gene Search | http://www.arabidopsis.org/servlets/Search?action=new_search&type=gene | 3.1.1. Finding genes in TAIR by name
NCBI: Gene Search | http://www.ncbi.nlm.nih.gov/gene/ | Finding genes in NCBI’s Reference Genome Collection. Search by locus identifiers, symbol, etc.
Ensembl Plants Genome Browser | http://plants.ensembl.org/Arabidopsis_thaliana/Info/Index | Multi-species plant genome database providing access to Arabidopsis and other plant genomic data
SUBA (SUBcellular location database for Arabidopsis proteins) | http://suba.plantenergy.uwa.edu.au/ | Finding protein subcellular location information
TAIR: NBrowse | http://www.arabidopsis.org/tools/nbrowse.jsp | Finding protein–protein interaction information
BioGRID (Biological General Repository for Interaction Datasets) | http://thebiogrid.org/ | Finding protein–protein interaction information
MIND (Membrane protein interaction database) | http://www.associomics.org/Associomics/Home.html | Finding Arabidopsis membrane interactome data
Finding gene sequence and structure data
TAIR: Sequence Bulk Download and Analysis | http://www.arabidopsis.org/tools/bulk/sequences/index.jsp | 3.2.1. Retrieving DNA and protein sequence data
TAIR: GBrowse | http://gbrowse.tacc.utexas.edu/cgi-bin/gb2/gbrowse/arabidopsis/ | 3.2.2. GBrowse
TAIR: WU-BLAST | http://www.arabidopsis.org/wublast/index2.jsp | 3.2.3. Finding related DNA or protein sequences
TAIR: Bulk Protein Download | http://www.arabidopsis.org/tools/bulk/protein/index.jsp | 3.2.4. Finding protein structure and domain information
Phytozome | http://www.phytozome.net/ | Comparative genomic database providing access to 25 green plant genomes which have been clustered into gene families
InterPro | http://www.ebi.ac.uk/interpro/ | Finding predicted protein signature (domain) information
1001 Genomes | http://1001genomes.org/ | Arabidopsis thaliana genetic variation database
Finding Gene Ontology (GO) annotations
TAIR: Gene Ontology Annotations Search | http://www.arabidopsis.org/tools/bulk/go/index.jsp | 3.3.1. Finding GO annotations
TAIR: Keyword Search | http://www.arabidopsis.org/servlets/Search?action=new_search&type=keyword | 3.3.2. Finding genes annotated to related functions or processes
AmiGO | http://amigo.geneontology.org/cgi-bin/amigo/go.cgi | Searching the Gene Ontology database
Finding information about gene expression
TAIR: Plant Ontology Search | http://www.arabidopsis.org/tools/bulk/po/index.jsp | 3.4.1. Finding Plant Ontology annotations
TAIR: Microarray Expression Search | http://www.arabidopsis.org/servlets/Search?action=new_search&type=expression | 3.4.2. Finding DNA microarray data
Plant Ontology Consortium Database | http://www.plantontology.org/ | Searching or browsing PO and PO annotations
NASCArrays | http://affy.arabidopsis.info/narrays/experimentbrowse.pl | Finding microarray data from the European Arabidopsis Stock Center’s microarray database
ArrayExpress | http://www.ebi.ac.uk/arrayexpress/ | European Bioinformatics Institute’s microarray database
NCBI GEO (Gene Expression Omnibus) | http://www.ncbi.nlm.nih.gov/geo/ | NCBI’s gene expression data repository
Genevestigator | https://www.genevestigator.com | Analysis tool for mining microarray datasets
eFP Browser | http://bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi | Analysis tool for mining microarray datasets
Obtaining information about metabolism in Arabidopsis
AraCyc/PlantCyc/PMN | http://www.plantcyc.org | 3.5. Obtaining information about metabolism in Arabidopsis
KEGG (Kyoto Encyclopedia of Genes and Genomes) | http://www.genome.jp/kegg/ | 3.5. Obtaining information about metabolism in Arabidopsis
Kazusa Plant Pathway Viewer (KaPPA-View4) | http://kpv.kazusa.or.jp | 3.5. Obtaining information about metabolism in Arabidopsis
KNApSAck | http://kanaya.naist.jp/KNApSAcK/ | 3.5. Obtaining information about metabolism in Arabidopsis
MetNet | http://metnetonline.org/ | 3.5. Obtaining information about metabolism in Arabidopsis
Arabidopsis Reactome | http://www.arabidopsisreactome.org/ | 3.5. Obtaining information about metabolism in Arabidopsis
MetaCrop | http://metacrop.ipk-gatersleben.de. | 3.5. Obtaining information about metabolism in Arabidopsis
Finding and ordering seed and other resources
ABRC: Stock Catalog | http://www.arabidopsis.org/servlets/Order?state=catalog | 3.6. Finding and ordering seed resources from ABRC
TAIR/ABRC: Seed Germplasm Search | http://www.arabidopsis.org/servlets/Search?action=new_search&type=germplasm | 3.6. Finding and ordering seed resources from ABRC
TAIR/ABRC: DNA/Clones Search | http://www.arabidopsis.org/servlets/Search?action=new_search&type=dna | 3.7. Finding and ordering other (non-seed) resources from ABRC
NASC (European Arabidopsis Stock Centre) | http://arabidopsis.info/ | Finding and ordering seed and clone stocks from the European Arabidopsis Stock center
RIKEN Biological Resource Center Experimental Plant Division (Japan) | http://www.brc.riken.jp/lab/epd/Eng/catalog/seed.shtml | Providing Arabidopsis transposon-tagged lines and activation tagging lines
French National Institute for Agricultural research (INRA) Arabidopsis Resource Center for Genomics | http://www-ijpb.versailles.inra.fr/en/cra/cra_accueil.htm | Providing Arabidopsis T-DNA lines (FLAG lines)
Bielefeld University | http://www.gabi-kat.de/ | Providing Arabidopsis T-DNA lines (GABI-Kat lines)
SIGnAL (Salk Institute Genomic Analysis Laboratory): T-DNA Express | http://signal.salk.edu/cgi-bin/tdnaexpress | Finding T-DNA insertion sites
Searching literature databases
NCBI PubMed Database | http://www.ncbi.nlm.nih.gov/pubmed/ | 3.8.1. Finding articles in PubMed
TAIR: Publication Search | http://arabidopsis.org/servlets/Search?action=new_search&type=publication | 3.8.2. Finding publications in TAIR
TAIR: Textpresso Full-Text Search | http://www.textpresso.org/arabidopsis/ | 3.8.3. Searching full-text literature
Submitting your data or DNA/seed stocks
TAIR: Submit Data | http://www.arabidopsis.org/submit/index.jsp and http://www.arabidopsis.org/doc/submit/functional_annotation/123 | 3.9.1. Submitting data to TAIR
PMN: Submit Data | http://www.plantcyc.org/feedback/data_submission.faces | 3.9.2. Submitting data to PMN
ABRC: Stock Donation | https://abrc.osu.edu/donate-stocks | 3.9.3. Donating seed and DNA stocks to ABRC
Arabidopsis seed and DNA stocks and other biological materials can be obtained from a number of different institutions around
the world. These stock centers provide different kinds of materials
and different levels of service. The Arabidopsis Biological Resource
Center (ABRC), located in North America, and the European
(Nottingham) Arabidopsis Stock Centre (NASC) represent the
two largest stock centers and essentially mirror each other’s seed
collections. The collections of both centers will be discussed in
more detail in the next section (Subheadings 2.2.2 and 2.2.3).
The RIKEN BioResource Center (BRC) Experimental Plant
Division in Japan has some unique resources, e.g., lines overexpressing Arabidopsis full-length cDNAs (FOX), and operates under
the restriction of Material Transfer Agreements (MTAs) (Table 1).
The French National Institute for Agricultural Research (INRA) in
France and the Bielefeld University in Germany distribute locally
developed collections of T-DNA lines (FLAG and GABI-Kat,
respectively) [13, 14]. Although historically both institutions
restricted the distribution by requiring an MTA, these restrictions
have been lifted either completely or for the greater part of their
collections.
As with any web-based informatics resource, database content and
tools change over time. The protocols described here use tools and
data available in databases and stock centers as of December 2011.
2
Materials
Programming experience is an asset to a scientist who wishes to
analyze and manipulate complex, large datasets, but it is not essential to effectively mine databases. Anyone with access to the internet
and a reasonably up-to-date computer should be able to perform
all the steps in the protocols. A basic familiarity with computers,
Internet browsers, and commonly used bioinformatics tools such
as BLAST is assumed. There are a wide variety of textbooks, manuals, and web-based tutorials available for learning the basics of
bioinformatics.
2.1 Computer
Hardware and
Software for Database
Mining
The minimum requirements for database mining are a personal
computer (PC), an internet connection, and web browsing software.
A high-speed network connection is desirable to ensure faster data
access. Up-to-date web browser software, such as Internet Explorer,
Firefox, or Safari, is also required. Database interfaces should
behave the same regardless of what operating system or browser is
used. However, some functions may not work properly on older
browsers. If possible, you should upgrade your browser to the most
recent version available that can run on your operating system.
The browser must have cookies enabled if users want to log in and
place stock orders through TAIR. JavaScript must also be enabled
to use TAIR since TAIR makes extensive use of this feature.
See http://www.arabidopsis.org/help/index.jsp for information
on properly configuring your browser. Note that for other databases mentioned in this chapter, there may be specific browser
preferences.
2.2 Databases
and Stock Centers
Databases are information storage and retrieval software systems.
Typically, databases have three components: the database software
for storing data, software that translates and executes requests
(queries), and software applications that allow users to view data.
This section describes three commonly used Arabidopsis resources.
Additional databases can be found in Table 1.
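The three components named above can be illustrated with a toy example. The sketch below uses Python's built-in sqlite3 module and is only an analogy: the table layout, function names, and the single record are hypothetical and do not reflect how TAIR or any other resource in this chapter is actually implemented.

```python
import sqlite3

# Component 1: database software that stores the data (in-memory SQLite here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gene (locus TEXT PRIMARY KEY, symbol TEXT, full_name TEXT)"
)
conn.executemany(
    "INSERT INTO gene VALUES (?, ?, ?)",
    [("AT3G24650", "ABI3", "ABA INSENSITIVE 3")],
)


# Component 2: software that translates a user request into a query and runs it.
def find_by_symbol(connection, symbol):
    cursor = connection.execute(
        "SELECT locus, symbol, full_name FROM gene WHERE symbol = ?", (symbol,)
    )
    return cursor.fetchall()


# Component 3: an application layer that presents results to the user.
def render(rows):
    return "\n".join(f"{locus}\t{symbol}\t{name}" for locus, symbol, name in rows)


rows = find_by_symbol(conn, "ABI3")
print(render(rows))
```

In a web resource the third component is the set of search forms and result pages, but the division of labor is the same.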
2.2.1 The Arabidopsis
Information Resource
TAIR (http://www.arabidopsis.org) is a comprehensive web
resource for the biology of A. thaliana [4–8]. It provides a centralized, curated gateway to Arabidopsis biology, research materials,
and community members. Data available from TAIR includes the
complete Arabidopsis genome sequence along with gene structure,
gene product information, metabolism, expression data, genome
maps, genetic and physical markers, publications, and information
about the Arabidopsis research community. In addition, seed and
DNA stock information and ordering from the Arabidopsis
Biological Resource Center (ABRC) are fully integrated into TAIR.
TAIR is a curated database; data are processed by biocurators with
Ph.D.s in biology who ensure their accuracy. TAIR data come from a
variety of sources including in-house manual curation of published
literature and sequence data, locally run computational pipelines
for annotating gene structure and function, integration of data
from other biological databases and resources (GenBank, ABRC,
Gene Ontology Consortium, etc.), and submissions from the
research community. TAIR also provides researchers with an extensive set of data visualization and analysis tools. A comprehensive
guide on how to use TAIR is available [15].
2.2.2 The Arabidopsis
Biological Resource Center
The ABRC collects, preserves, reproduces, and distributes diverse
seed and other resources for A. thaliana and related species. The
center is located at The Ohio State University in Columbus, Ohio,
USA. The ABRC serves a dynamic community of plant researchers
with a common goal to understand the basic processes of flowering
plants, as well as to apply this understanding to further crop
improvement. Seed stocks at the ABRC include classical mutants
(see Note 1), natural accessions, T-DNA and transposon insertion
collections, mapping populations, the TILLING collection, and
seeds from related species (e.g., Arabidopsis arenosa, Brassica rapa,
Capsella rubella). Other resources include cell suspension cultures,
protein chips, full-length cDNA and ORF clones in recombination-ready and expression vectors, expressed sequence tag (EST)
and bacterial artificial chromosome (BAC) clones of Arabidopsis
and related species, phage and plasmid libraries, and diverse vectors
for cloning and expression. In addition, the ABRC has recently
started distributing educational resources. Due to a large demand,
this type of resource will be expanded further. This example
illustrates how the resources provided by the ABRC closely track
the emerging needs of the community. Seed resources are
exchanged with the European Arabidopsis Stock Centre (NASC)
in Nottingham, UK. Researchers in the Americas are required to
order seed stocks from ABRC, while researchers in Europe are
required to order seeds from the NASC, but both can order DNA
and other stocks from either center. Researchers outside of the
Americas and Europe may order seed and other resources from
either the ABRC or the NASC. The ABRC stock information and
ordering system are hosted by TAIR (http://www.arabidopsis.org),
and all functions can be accessed through the ABRC Stocks
drop-down menu on the right side of the menu bar at the top of
most TAIR pages.
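The ordering rules described above can be summarized as a small lookup. This is a sketch of the rule as stated in the text, with hypothetical function and argument names; the stock centers' own websites remain the authority on current policy.

```python
from typing import List


def stock_centers(region: str, stock_type: str) -> List[str]:
    """Return the stock centers a researcher may order from.

    region: "americas", "europe", or anything else for the rest of the world.
    stock_type: "seed" for seed stocks, "other" for DNA and other stocks.
    """
    region = region.strip().lower()
    stock_type = stock_type.strip().lower()
    if stock_type not in {"seed", "other"}:
        raise ValueError("stock_type must be 'seed' or 'other'")
    if stock_type == "seed":
        if region == "americas":
            return ["ABRC"]  # seed orders from the Americas go to the ABRC
        if region == "europe":
            return ["NASC"]  # seed orders from Europe go to the NASC
    # DNA/other stocks, or any order from outside the Americas and Europe
    return ["ABRC", "NASC"]


print(stock_centers("Europe", "seed"))
print(stock_centers("Europe", "other"))
print(stock_centers("Asia", "seed"))
```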
2.2.3 The European
Arabidopsis Stock Centre
(NASC)
The European (Nottingham) Arabidopsis Stock Centre (NASC)
provides Arabidopsis seed and information resources to the plant
research community in coordination with the ABRC as described
in the previous section. The NASC’s stock collection includes
seeds of A. thaliana and related species, tomato seed resources,
DNA clones, and diverse cloning vectors. In addition, the NASC
provides an International Affymetrix GeneChip hybridization service for a wide range of species including Arabidopsis and many
other plants [16]. The data they collect through their hybridizations as well as other user-supplied Arabidopsis data are made
publicly available through their NASCArrays database. The NASC
stock information, ordering, and NASCArrays database are available at http://www.arabidopsis.info.
3
Methods
A primary objective of database mining for most researchers is to
find out everything that is known about a specific gene or set of
genes. Some of the basic questions are the following: What’s the
sequence and structure of my gene? What type of protein does my
gene encode? In what biological processes is it involved? With what
other genes/proteins does it interact? In what tissues is it located
and how is it regulated? In order to generate a testable hypothesis
and design meaningful experiments, the currently available knowledge must be obtained and analyzed.
3.1 Finding
Comprehensive
Information About
Arabidopsis Genes
After over 12 years of development, TAIR now serves as a central
access point for Arabidopsis data. The TAIR home page (http://www.arabidopsis.org)
is the main entry point to the database.
The navigation toolbar provides easy access to the eight major
functionalities: Search, Browse, Tools, Portals, Download, Submit,
News, and ABRC Stocks. Mousing over each item in the
toolbar opens a drop-down menu with clickable submenus that
lead to a variety of datasets, tools, and external links. Log-in is not
required for searching and viewing data but is required for ordering
DNA or seed stocks from the ABRC and for submitting gene
functional data. Here, we describe how to use the TAIR Gene
Search tool and locus detail page to find information about
Arabidopsis genes.
3.1.1 Finding Genes
in TAIR by Name
TAIR’s locus detail page represents the most comprehensive starting
point for a researcher to find out what is known about a gene.
There are two commonly used ways to find genes and to get to the
locus detail page: using the quick search and advanced Gene
Search form.
1. To perform a quick search, go to the header on any TAIR page
that has a quick search tool in the upper right corner. Enter the
gene name (e.g., ABI3 or AT3G24650) in the text box and
use the default “Gene” option on the drop-down menu. Click
Search. A list of all matching records is displayed on a page
titled TAIR Gene Search Results (see Note 2).
2. To perform a gene search using the advanced Gene Search form,
on any TAIR page with a top navigation bar, select “Genes”
from the Search drop-down menu
(http://www.arabidopsis.org/servlets/Search?action=new_search&type=gene).
3. Define the name search criteria. To search by name, choose
“Gene name” as the option from the Search Name drop-down
menu. This option is used to search by symbolic names (e.g.,
ABI3), full names (e.g., ABA INSENSITIVE 3), or AGI locus
identifiers (e.g., AT3G24650). AGI (Arabidopsis Gene
Identifier) locus identifiers are systematic names assigned based
on chromosomal location.
4. Choose an exact or inexact search mode. When searching with
a gene symbol choosing the “starts with” option is a way to
find similarly named genes, such as members of a gene family
(e.g., ARF for Auxin Response Factor family). When searching
with a GenBank accession, it is better to use an exact match in
order to avoid retrieving spurious results. To search for a word
or phrase within a gene description, switch from a “Gene name”
search to a “description” search and choose the “contains”
option.
5. Select the output format. The default values are 25 records,
sorted by name. The position option can be used when finding
genes by location.
6. Click “Submit Query” to start your search. All of the loci that
match the query term will be displayed in a list of results (on a
page titled TAIR Gene Search Results). Click on the locus
name to view the locus detail page.
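The difference between the exact, "starts with," and "contains" modes in step 4 can be sketched on a small in-memory list. This is an illustration of the matching logic only, not of TAIR's search code: case-insensitive comparison is an assumption, and "XARF" is a made-up symbol included to show what "contains" picks up.

```python
def matches(value, term, mode):
    """Return True if value satisfies the search term under the given mode."""
    v, t = value.upper(), term.upper()  # assumption: matching ignores case
    if mode == "exact":
        return v == t
    if mode == "starts with":
        return v.startswith(t)
    if mode == "contains":
        return t in v
    raise ValueError(f"unknown search mode: {mode!r}")


def search(values, term, mode):
    """Filter a list of names, keeping the original order."""
    return [v for v in values if matches(v, term, mode)]


# Gene symbols used only as test strings; "XARF" is hypothetical.
SYMBOLS = ["ABI3", "ARF1", "ARF2", "ARF19", "XARF"]

family = search(SYMBOLS, "ARF", "starts with")  # members of a symbol family
single = search(SYMBOLS, "ARF1", "exact")       # one specific symbol
loose = search(SYMBOLS, "ARF", "contains")      # also returns unrelated hits

print(family)
print(single)
print(loose)
```

The extra hit returned by "contains" shows why an exact match is the safer choice for accession numbers.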
3.1.2 Using TAIR’s Locus
Detail Page to Find
Information About a Gene
The locus detail page contains a wealth of information about a
gene, including its sequence, function, associated polymorphisms, mutant phenotypes, and publications. This page also includes
links to a large number of external databases and tools. To see an
example locus detail page, go to http://www.arabidopsis.org/servlets/TairObject?type=locus&name=AT3G24650. This section
describes the typical data types displayed on the locus page.
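The example address above follows a simple pattern, so locus page links for a list of genes can be generated programmatically. The sketch below is a convenience under stated assumptions: the base URL and query parameters are copied from the example in the text and may have changed since publication, and the regular expression reflects the usual AGI layout (AT, a chromosome code of 1 to 5, C, or M, the letter G, and five digits), which is assumed here rather than taken from a formal specification.

```python
import re
from urllib.parse import urlencode

# Assumed AGI locus layout: AT + chromosome (1-5, C, M) + G + five digits
AGI_PATTERN = re.compile(r"^AT([1-5CM])G(\d{5})$", re.IGNORECASE)

# Base address taken from the example URL in the text
LOCUS_PAGE = "http://www.arabidopsis.org/servlets/TairObject"


def locus_url(agi):
    """Build a locus detail page URL from an AGI locus identifier."""
    agi = agi.strip().upper()
    if not AGI_PATTERN.match(agi):
        raise ValueError(f"not a valid AGI locus identifier: {agi!r}")
    return LOCUS_PAGE + "?" + urlencode({"type": "locus", "name": agi})


url = locus_url("at3g24650")
print(url)
```

Validating the identifier first catches transposed or truncated digits before a request is made.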
1. Gene summary information: TAIR uses the AGI locus identifier (e.g., AT3G24650) as the primary gene name. Other
names including both abbreviated gene symbols and the corresponding full names are displayed in the Other Names section. The Description field provides a short summary of the
gene’s function either manually composed by a curator or
computationally generated (see Note 3).
2. Gene model information: A locus in TAIR refers to the physical
location of an annotated gene on the chromosome. One locus
can have several gene models or splice variants associated with it
based on alternatively spliced mRNAs (e.g., At5g01810.1,
At5g01810.2, At5g01810.3). The representative gene model
for a protein-coding gene is the model with the longest coding
sequence (CDS); for other gene types, the representative model
defaults to the .1 model. The Gene model page contains
model-specific information such as exon–intron positions, protein domains, and gene model-specific function information.
The Map Detail Image section displays the exon–intron structures of all gene models of a locus. Clicking on the image directs
the user to GBrowse (see Subheading 3.2.2).
3. Gene function annotations: The Annotations section displays all
of the Gene Ontology (GO) [17] and Plant Ontology (PO)
[18] controlled vocabulary terms that describe the function and
expression of the gene product. GO terms describe the molecular function, biological process, and subcellular localization of
the gene product, while the PO consists of growth and development stages and plant structure terms capturing the temporal
and spatial expression of the gene product. Detailed information including references and supporting evidence can be
obtained by clicking on the “Annotation Detail” link located at
the bottom of this section. How to find GO and PO annotations is described later (Subheadings 3.3.1 and 3.4.1).
4. Gene expression: Information about gene expression can be
found in the Plant Ontology annotations section and in the
RNA Data section. In the RNA Data section, array elements
from pre-2005 one-channel and/or two-channel microarray
experiments that map to the locus are listed. For elements
whose expression has been analyzed across all experiments,
the average log ratio of expression values, along with standard
error, is provided. For these elements, links to the Expression
Viewer (for finding similarly expressed genes) and Spot
History (only available for microarray elements from arrays in
the Stanford Microarray Database) are also available [15].
Please note that no new microarray expression datasets have
been entered into TAIR since June 2005; instead TAIR provides
links to high-quality gene expression resources in its External
Links section on every locus page. The Associated Transcripts
subsection within the RNA Data section lists full-length cDNAs
and expressed sequence tags (ESTs) associated with the locus.
5. Nucleotide sequence: Links to the full-length CDS and full-length cDNA of the representative gene model plus the full-length genomic sequence are provided in this section.
6. Protein Data: This section displays the structural and physical
characteristics of the protein encoded by the representative
gene model, including length (amino acid), predicted molecular
weight, isoelectric point, and domains. Click on the AGI name
in the protein section to open a Protein detail page. The Protein
detail page provides more detailed information and the amino
acid sequence for the representative gene model. To view
nucleotide and protein data for other gene models, go to the
specific gene model page.
7. Map Locations: This section displays the chromosome and
coordinates of the locus for the maps on which it is found.
TAIR provides three tools to view a gene in a whole-genome
context: Map Viewer, Sequence Viewer, and GBrowse.
8. Polymorphism: This section contains all of the polymorphisms
mapped to the locus. Both natural variations found in different
ecotypes and induced mutations (e.g., T-DNA insertions) are
shown. Note that by default this section only displays 15
entries, but a complete list can be obtained by clicking on the
“See All” link right under this section’s name.
9. Germplasm: This section provides information on all germplasms currently in the database associated with a locus and
includes phenotype descriptions, mutant images, stock numbers, and ordering options when available.
10. External Link: TAIR links extensively to external sites that offer
either alternative views of or different information about a
locus, e.g., other Arabidopsis genome annotation databases,
gene expression databases, functional genomics sites, and data
analysis tools. Links to external sites and tools can also be found
on the Portal pages (http://www.arabidopsis.org/portals/index.jsp).
11. Comments: This section contains statements contributed by
registered TAIR users. Comments can be added to nearly all of
the TAIR detail pages. This function can be used to report new
data, errors, or omissions related to the displayed object.
12. Publications: Publications include published literature
imported from PubMed, Agricola, and BIOSIS, along with
abstracts from the International Conference on Arabidopsis
Research (ICAR). Only 15 entries are initially displayed on the
locus page. At the bottom of the Publications section, click on
“View Complete List” to see all records. Click on the title of
the publication to view a detailed publication record that
provides a link to the corresponding PubMed abstract or publication text when available.
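The rule for choosing a representative gene model (item 2 above) can be written down compactly. The sketch below is illustrative only: the function name and the tie-breaking behavior (lowest model number wins when CDS lengths are equal) are assumptions, since the text does not say how ties are resolved, and any CDS lengths passed in are supplied by the user.

```python
def representative_model(cds_lengths, protein_coding=True):
    """Pick a representative gene model name.

    cds_lengths maps gene model names (e.g. "At5g01810.1") to CDS length in bp.
    """
    if not cds_lengths:
        raise ValueError("no gene models supplied")

    def model_number(name):
        # "At5g01810.2" -> 2
        return int(name.rsplit(".", 1)[1])

    if not protein_coding:
        # Other gene types default to the .1 model.
        for name in cds_lengths:
            if model_number(name) == 1:
                return name
        raise ValueError("no .1 model present")

    # Longest CDS wins; ties go to the lowest model number (an assumption).
    return min(cds_lengths, key=lambda n: (-cds_lengths[n], model_number(n)))
```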
3.2 Finding Gene Sequence and Structure Data
The primary source of Arabidopsis gene sequence and structure
data at many biological databases is TAIR. In an ongoing effort to
improve the annotation of the Arabidopsis genome, TAIR has
released updated versions of the Arabidopsis gene set on a yearly
basis since 2005 [7, 8]. TAIR’s genome annotation is widely distributed to other major databases such as GenBank and UniProt.
Therefore, these databases often have overlapping datasets. This
section describes how to find sequence and structure data at TAIR.
TAIR provides gene sequence and structural data (i.e., the
exon–intron architecture of a gene) in a variety of formats. DNA
and protein data for an individual gene can be found on the locus
and gene model pages (see Subheading 3.1.2). For those users
interested in downloading complete sequence datasets, the TAIR
ftp site provides sets of sequence files in FASTA format organized by
TAIR release (e.g., TAIR10, TAIR9) and data types (e.g., coding
sequence or CDS, cDNA, genomic DNA, and promoter regions) at
ftp://ftp.arabidopsis.org/home/tair/Genes/. For users looking
for a subset of gene sequences, the Sequence Bulk Download and
Analysis tool generates sequence files based on a list of AGI locus
identifiers (see Subheading 3.2.1).
In addition to sequence-based information, TAIR also provides
structural information about each gene. The complete set of
genome coordinates of each feature (such as exon, CDS, and 5′
untranslated region or 5′UTR) for every gene in the TAIR genome
release is available in GFF3 format
(ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR10_genome_release/TAIR10_gff3/).
For a visual snapshot of a gene’s exon–intron structure, users can
refer to the TAIR locus page where a graphic displays the structural
architecture of each splice variant annotated at that locus. TAIR also
offers two different genome browsers: GBrowse and SeqViewer.
While both browsers allow users to explore their genomic region of
interest, the browsers are quite distinct and are used for different
purposes. GBrowse is especially useful for analyzing a wide variety of
data types that overlap with a chromosomal/gene region of interest. The tool contains a menu of datasets divided into sections such
as expression data and sequence similarity data, which users can
select to visualize in the main browser window. SeqViewer lends
itself especially well to nucleotide-based analysis. Users can search
the genome using either a name or a sequence, and the
“SeqViewer Nucleotide View” gives a close
look at the corresponding genome-based nucleotide sequence
decorated with annotated genes, T-DNA insertions from the SALK
T-DNA insertion lines and other mutant collections, polymorphisms, and more. Detailed instructions on how to use SeqViewer
are described elsewhere [15].
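For users who download the GFF3 files mentioned above, the exon–intron structure of each gene model can be recovered with a few lines of code. The following Python sketch assumes only the generic GFF3 layout (nine tab-separated columns, with a Parent attribute in column 9); the function name is hypothetical, and the details of TAIR's own files (feature type names, attribute keys) should be checked against the downloaded file.

```python
from collections import defaultdict


def exons_by_model(gff3_lines):
    """Group exon coordinates by parent gene model from GFF3 text lines."""
    exons = defaultdict(list)
    for line in gff3_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GFF3 feature line
        seqid, _source, ftype, start, end, _score, strand, _phase, attrs = fields
        if ftype != "exon":
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attributes = dict(
            pair.split("=", 1) for pair in attrs.split(";") if "=" in pair
        )
        # An exon shared by several models lists them comma-separated.
        for parent in attributes.get("Parent", "").split(","):
            if parent:
                exons[parent].append((seqid, int(start), int(end), strand))
    return {m: sorted(c, key=lambda x: x[1]) for m, c in exons.items()}
```

In practice the function would be called on an open file handle, for example `exons_by_model(open(path))`, where `path` points to a locally saved GFF3 file.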
3.2.1 Retrieving DNA and Protein Sequence Data
TAIR’s Sequence Bulk Download and Analysis tool allows the user
to retrieve DNA and protein sequence data in bulk for a list of
genes (or a single gene).
1. On any TAIR page with a top navigation bar, select “Bulk Data
Retrieval” from the Tools drop-down menu. Then select
“Sequences.” Alternatively, go directly to
http://www.arabidopsis.org/tools/bulk/sequences/index.jsp.
2. Enter one or more AGI locus or gene model identifiers
(e.g., At5g01810, AT1G01040.2) into the text box or upload
a text file containing AGI locus or gene model identifiers.
Select the desired data type from the Dataset drop-down menu
(e.g., transcripts, coding sequence, and genomic locus
sequences). This tool allows the user to retrieve sequences for
the representative gene model, all gene models, or only the
gene model matching the user query.
3. Select the FASTA or tab-delimited text output options. Click
on “Get Sequences” to perform the search. More information
on how to use the tool can be found by following the link to
the Help document.
4. This tool can also be accessed whenever a user generates a
“Gene Search Results” page by clicking on the “get all
sequences” or “get checked sequences” button at the top of
the page.
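Once sequences have been saved in FASTA format, they can be read into a script for further analysis. The sketch below is a generic FASTA reader rather than anything specific to TAIR; it assumes that the first whitespace-delimited token of each header line is the identifier, which should be verified against the actual download.

```python
def read_fasta(lines):
    """Yield (identifier, sequence) pairs from FASTA-formatted lines."""
    identifier, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if identifier is not None:
                yield identifier, "".join(chunks)
            # Assumption: the first whitespace-delimited token is the identifier.
            header = line[1:].split()
            identifier = header[0] if header else ""
            chunks = []
        elif identifier is not None:
            chunks.append(line)
    # Emit the final record, which has no following header line.
    if identifier is not None:
        yield identifier, "".join(chunks)
```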
3.2.2 Searching for a Gene, Its Overlapping Transcripts, and Orthologous Genes in Other Organisms Using GBrowse
1. On any TAIR page with a top navigation bar, select GBrowse
from the Tools drop-down menu
(http://gbrowse.tacc.utexas.edu/cgi-bin/gb2/gbrowse/arabidopsis/).
The GBrowse display is divided into five main sections: (1) Instructions, which
provides directions and examples of GBrowse search queries;
(2) Search, which allows the user to enter a query and select a
data source; (3) Overview, which shows a graphical representation of the chromosome and region currently displayed; (4)
Details, which provides a graphical representation of the
genomic features in the selected region; and (5) Tracks, which
allows the user to customize the display settings and select
which features are displayed in the detail section [15].
2. To select a region of the genome to view, enter its name in
the “Landmark or Region” search box (e.g., At1g01040).
Select the desired dataset from the “Data Source” drop-down
menu. The most recent TAIR genome release is the default
data source. Clicking on “Search” will update the overview
and the detail display. Use the “Scroll/Zoom” feature to move
along the chromosome or display a larger-/smaller-scale view
of the genome.
3. Customize the GBrowse display. TAIR GBrowse has 11 track
categories: Assembly, Community/Alternative Annotation,
DNA, Expression, Gene, Genomic Features, Methylation and
Phosphorylation, Orthologs and Gene Families, Sequence
Similarity, Variation, and Analysis. New tracks may be added in
the future. Each track category has multiple check boxes for
different types of data. Mouse over a track name to display
further information about the track. You can add or remove
tracks from the detail display by checking or unchecking the
required tracks and clicking the “Update Image” button.
You can also upload your own annotation data in a special
format to GBrowse using the “Add your own tracks” feature. For instructions on file format and uploading click on
the “Help” link in this section.
4. To download the sequence in a particular region, go to the
“Reports and Analysis” feature box and select “Download
Decorated FASTA File” from the menu options. This file
format allows you to highlight specific features of interest
(e.g., coding regions in red) on the FASTA sequence file. Use
the “Configure” option to select which features to highlight
and the desired markup options such as font styles and background colors and then click “GO.” The new web page will
display the FASTA sequence for the region displayed in the
detail view with the selected features highlighted.
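The idea behind a decorated FASTA file, marking selected features directly on the sequence, can be imitated in plain text by changing letter case instead of color. The following sketch is only an analogy to the GBrowse feature, not a reimplementation of it; the function name and the convention of 1-based inclusive coordinates relative to the start of the sequence are assumptions.

```python
def decorate(sequence, features):
    """Lowercase the sequence, then uppercase positions covered by features.

    features is a list of (start, end) pairs, 1-based and inclusive,
    relative to the start of the sequence.
    """
    chars = list(sequence.lower())
    for start, end in features:
        # Clip each feature to the sequence bounds before marking it.
        for i in range(max(start, 1) - 1, min(end, len(chars))):
            chars[i] = chars[i].upper()
    return "".join(chars)
```

For example, passing the exon coordinates of a gene model leaves introns in lowercase and exons in uppercase.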
3.2.3 Finding Related DNA or Protein Sequences in Arabidopsis
For sequenced genes with limited experimental data, one of the
first steps toward understanding a gene’s function is to search for
evolutionarily related genes. The function of an unknown gene
may be inferred from its similarity to a well-characterized homolog. Searching for similar DNA or protein sequences in Arabidopsis
using local sequence alignment methods can be performed at
TAIR and NCBI. The two sites share some overlapping Arabidopsis
datasets, but TAIR has some Arabidopsis-specific datasets not
found at NCBI
(http://www.arabidopsis.org/help/helppages/BLAST_help.jsp#datasets).
These datasets are used by all of TAIR’s
sequence similarity programs (WU-BLAST, NCBI BLAST,
FASTA, PatMatch) [15]. This section illustrates how to use TAIR’s
WU-BLAST tool to identify similar genes in Arabidopsis.
1. On any TAIR page with a top navigation bar, select WU-BLAST
from the Tools drop-down menu
(http://www.arabidopsis.org/wublast/index2.jsp).
2. Select the appropriate BLAST program. Five different algorithms are available to match amino acid or nucleotide sequences.
The choice of the program depends on the type of sequence
to be queried and the query database. For example, when
comparing a protein sequence to other protein sequences,
choose the BLASTP program.
3. Input your query. The tool accepts sequences or locus identifiers
as inputs. To use a sequence as input, paste in the sequence as
raw text or in FASTA format, or upload it from a file. Sequences
pasted directly from GenBank records can also be used. To use
a locus identifier as input, choose the locus name option under
the input header, and type in the name of the locus, or upload
it from a file. When using locus identifiers as input, the program retrieves the coding sequence (CDS) for the representative
gene model; because this is a nucleotide query, it cannot be used with the BLASTP or
TBLASTN options, which expect a protein query. To perform a search using more than one
query sequence, submit multiple sequences as a list of locus
identifiers or as a set of FASTA formatted sequences, each
sequence having its own FASTA header.
4. Define the dataset to search against. For example, to find homologous proteins in Arabidopsis choose the AGI protein dataset.
This dataset is a non-redundant set of all known Arabidopsis
proteins and includes all proteins generated through alternative splicing.
5. Customize the BLAST search parameters. By default, filtering is on and the expect threshold (cutoff) is 10.
The default S value is calculated based on the E value and
represents the single high-scoring pair (HSP) score that satisfies the expect threshold.
6. Submit the query. Click on the “Run BLAST” button. If you
have chosen an inappropriate combination of query sequence
and database, an error will be returned to your browser. Results
from the WU-BLAST search are presented in a graphical format that can be used to rapidly assess the significance of the
results. The graph displays the query sequence in red and the
HSP matches below. The length of the bar corresponds to the
length of the HSP, and the color of the bar indicates the range
of expect (E) values (a measure of how likely the sequence
match is to occur by random chance). The direction of the bar indicates
whether the match is on the forward or reverse strand. Pointing
the mouse over the HSP markers will display the description
line of the matched sequence. Clicking on the HSP will display the selected sequence alignment. For AGI genes and loci,
the name in the alignment is hyperlinked to the TAIR locus
detail page.
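The choice of program in step 2 depends only on whether the query and the database are nucleotide or protein sequences. The lookup below summarizes the standard BLAST program names for each combination; it is a sketch for planning a search, the function name is hypothetical, and the exact labels shown in TAIR's WU-BLAST form may differ.

```python
# (query type, database type) -> program name
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "BLASTN",
    ("protein", "protein"): "BLASTP",
    ("nucleotide", "protein"): "BLASTX",   # query translated in six frames
    ("protein", "nucleotide"): "TBLASTN",  # database translated in six frames
}


def choose_blast_program(query_type, database_type, translate_both=False):
    """Return the BLAST program matching the query and database types."""
    if translate_both:
        # TBLASTX translates both a nucleotide query and a nucleotide database.
        if (query_type, database_type) != ("nucleotide", "nucleotide"):
            raise ValueError("TBLASTX needs a nucleotide query and database")
        return "TBLASTX"
    try:
        return BLAST_PROGRAMS[(query_type, database_type)]
    except KeyError:
        raise ValueError(
            f"unsupported combination: {query_type}/{database_type}"
        ) from None
```

This also makes the restriction in step 3 explicit: a locus identifier yields a nucleotide (CDS) query, so only the programs with a nucleotide query type apply.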
3.2.4 Finding Protein Structure and Domain Information
The function of an unknown gene may also be inferred from the
presence of conserved domains. For example, proteins with an
F-box domain (IPR001810,
http://www.ebi.ac.uk/interpro/entry/IPR001810) typically form part of an SCF E3 ubiquitin
ligase, whereas proteins with a kinesin motor domain (IPR001752,
http://www.ebi.ac.uk/interpro/entry/IPR001752) may be
involved in intracellular transport in association with the cytoskeleton. Additional sequence-based features, such as transmembrane
domains or a KDEL endoplasmic reticulum retention signal, can
be used to infer protein localization. Protein structural data, including predicted domains, can be found at various databases such as
TAIR, NCBI, and InterPro. Here, we describe how to use TAIR’s
Bulk Protein Download tool to obtain a list of structural, physical,
and chemical properties for a set of proteins.
1. On any TAIR page with a top navigation bar, select “Bulk Data
Retrieval” from the Tools drop-down menu. Then select
“Proteins” (http://www.arabidopsis.org/tools/bulk/protein/index.jsp).
2. Choose the output display. The output options include molecular weight, isoelectric point, intracellular locations, domains,
number of transmembrane domains, UniProt ID, and SCOP’s
structural class. Selecting the HTML format option will display links to TAIR locus detail pages, protein sequences,
SeqViewer graphical displays, and protein records in UniProt/
Swiss-Prot, and InterPro. The last two links are shown only if
domains and Swiss-Prot IDs are included in the output.
Choose “text” output if you wish to download the data to
your computer. Queries that return more than 1,000 results
will be returned in text-only format.
3. Limit the search by protein properties. For example, to obtain
a list of proteins with a given range of molecular weights, check
the box next to “Predicted Molecular Weight” and enter the
lower and upper limits of the desired weight range in the
adjacent text boxes.
4. Submit the query by clicking on the “Get Protein Data”
button.
Protein domain annotations may not be consistent from database to database because different analysis methods or sequences
are used. Domain databases are also updated frequently as new
domain structures are identified. Frequent checks of genome databases should be done to determine whether new domains have
been identified.
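The same kind of molecular weight filter described in step 3 can also be applied after download, for instance to try several weight ranges on one text export. The sketch below assumes a tab-delimited file with a header row; the column name "molecular_weight" is a placeholder, not the actual header used by the Bulk Protein Download tool, and should be replaced with the name found in the downloaded file.

```python
import csv
import io


def filter_by_weight(tsv_text, lower, upper, weight_column="molecular_weight"):
    """Return rows whose weight lies in [lower, upper] from tab-delimited text."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        try:
            weight = float(row[weight_column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric weight
        if lower <= weight <= upper:
            kept.append(row)
    return kept
```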
3.3 Finding Gene Ontology (GO) Annotations
To make data about a gene’s function more amenable to computational methods of querying and analysis, many databases use
structured controlled vocabularies for annotating gene products.
The Gene Ontology vocabularies developed by the Gene Ontology
Consortium (http://www.geneontology.org) have been widely
adopted by many biological databases and are considered to be the
standard for gene function annotation. GO describes three aspects
of a gene product: molecular function, biological process, and
cellular component (subcellular localization) [17]. TAIR is the primary source of GO annotations for Arabidopsis genes. Additional
sources of Arabidopsis GO annotations include TIGR (The
Institute for Genomic Research) (see Note 4), UniProtKB-GOA
(UniProt Knowledge Base Gene Ontology Annotation group) and
the GO Consortium [19]. Members of the research community
also contribute GO annotations through TAIR’s journal collaboration program and through voluntary user submissions [20].
Annotations from all the above sources are displayed in TAIR.
Users can also access these annotations from the central GO database using the AmiGO query tool for making cross species queries
(http://amigo.geneontology.org/). This section describes how to
find GO annotations at TAIR (see Note 5 for information about
how to correctly interpret them).
3.3.1 Finding GO Annotations
Users can view GO annotations for a single gene from its locus
detail page and can also download TAIR’s whole-genome GO
annotation file from its ftp site
(ftp://ftp.arabidopsis.org/home/tair/Ontologies/Gene_Ontology/). This file is updated on a
weekly basis. To retrieve GO annotations for a specific gene or set
of genes, use TAIR’s Gene Ontology Annotations Search tool.
1. On any TAIR page with a top navigation bar, select “Gene
Ontology Annotations” from the Search drop-down menu
(http://www.arabidopsis.org/tools/bulk/go/index.jsp).
2. Input the locus identifier(s) in the query box. Type, paste, or
upload a file containing your list of locus identifiers.
3. Define output options. Select HTML to view hyperlinked
results. Choose text for saving the results as a text file.
4. To obtain a list of annotations, click on the “Get all GO
Annotations” button at the bottom of the page.
5. Alternatively, instead of getting a list of all annotations, the
genes can be grouped into broader categories based on their
annotations. After inputting the locus identifiers (step 2 above),
choose “HTML” output and click the Functional Categorization
button. The functional categorization table data can be transformed into a pie chart by clicking on the “Draw Annotation
Pie Chart” button at the top of the results page. Further details
on functional categorization based on GO annotations are
described in Chapter 5.
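The grouping performed in step 5 amounts to counting how many genes fall into each broad category. A minimal sketch of that tally is given below; the input is assumed to be a list of (locus, category) pairs prepared by the user, and the way percentages are computed here (share of unique locus–category pairs) is an assumption that may differ from how TAIR builds its table and pie chart.

```python
from collections import Counter


def categorize(annotations):
    """Count loci per category from (locus, category) pairs.

    Each locus is counted once per category, regardless of how many
    annotations place it there. Returns {category: (count, percent)}.
    """
    unique = {(locus.upper(), category) for locus, category in annotations}
    counts = Counter(category for _locus, category in unique)
    total = sum(counts.values())
    return {c: (n, round(100.0 * n / total, 1)) for c, n in counts.items()}
```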
3.3.2 Finding Genes Annotated to Related Functions or Processes
By using structured controlled vocabularies, GO annotations
allow researchers to quickly find which genes may act in a pathway
(genes annotated to the same biological process term) or have
a similar function (genes annotated to the same molecular function term). For example, ERA1 (AT5G40280) encodes a protein
farnesyltransferase; mutants have low prenylation levels and defects
in meristem organization and abscisic acid-mediated responses [8,
21–23]. A researcher may want to know the following: What other
genes might be involved in prenylation, and do they act in the
same or another pathway?
1. On any TAIR page with a top navigation bar, select “Keywords”
from the Search drop-down menu (http://www.arabidopsis.org/servlets/Search?action=new_search&type=keyword).
2. Enter the term (keyword) “farnesyltransferase” in the text box
and choose “contains” (an inexact search) from the drop-down
menu to the left of the text box (see Note 6). Restrict the keyword category to “GO Molecular Function” and click the
“Submit Query” button.
3. The Keyword Search Results page displays terms retrieved
along with a count of data objects (loci, publications, annotations) annotated to that term and to its child terms. Click
“loci” to display the genes annotated to “farnesyltranstransferase activity” and its child terms (e.g., farnesyl-diphosphate
farnesyltransferase activity). Click on the “Download All” button
on the result page to save the list of all loci.
4. On the Keyword Search Results page, click on “treeview” to
view the term in a hierarchical tree view. Click on the plus sign
next to a term to expand the node and display all of the child
terms. To display genes annotated to each of the parent and
child terms, select the “loci” radio button at the top of the tree
view page and then click on the Display button. The display
will show a count of the number of loci annotated to each term
and the number of loci annotated to the children of each term.
Click on the numbers to view more details.
The above example used the GO molecular function term “farnesyltranstransferase activity” to show how to find genes sharing
a similar function by searching for genes annotated to the same function term. Similarly, searching for genes annotated to a process
term (e.g., protein prenylation) will retrieve a list of genes involved
in the related process.
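The counts shown in the keyword results and tree view include loci annotated to a term directly and loci annotated to any of its child terms. That logic can be sketched as a simple traversal of the ontology hierarchy. The data structures below (a parent-to-children mapping and a term-to-loci mapping) and the function names are hypothetical and would have to be built from the ontology and annotation files.

```python
def descendants(term, children):
    """Return the set of all terms below `term` in a parent -> children mapping."""
    found, stack = set(), list(children.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children.get(current, ()))
    return found


def loci_for_term(term, children, direct_annotations):
    """Loci annotated to `term` directly or to any of its child terms."""
    loci = set(direct_annotations.get(term, ()))
    for child in descendants(term, children):
        loci |= set(direct_annotations.get(child, ()))
    return loci
```

Because a term can have more than one parent, the traversal records visited terms so that shared child terms are counted only once.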
3.4 Finding Information About Gene Expression
An important method of finding functional information comes
from the analysis of gene expression data (see Note 7). There are
many reasons to analyze these data, such as finding the expression
pattern of a gene in an organism, determining the effect of the
environment on the expression of particular genes, or understanding how the expression of one gene affects the expression of other
genes. A number of methods have been applied to study gene
expression in Arabidopsis including low-throughput methods such
as Northern blot, reverse transcription-polymerase chain reaction
(RT-PCR), in situ hybridization, and various reporter assays
(e.g., GFP or green fluorescent protein, GUS or β-glucuronidase
reporters) and high-throughput methods such as DNA microarray
analysis or RNA-Seq. Expression data obtained by the use of
low-throughput methods can be found mainly in the literature.
Some of these data in published literature have been captured in
the form of Plant Ontology (PO) annotations through TAIR’s literature curation effort. High-throughput DNA microarray data
are for the most part stored in databases allowing for download
and further analysis. Some of the DNA microarray expression data
have also been converted into PO annotations and can be found in
TAIR. For example, TAIR contains close to half a million PO annotations based on the AtGenExpress microarray data
(http://www.weigelworld.org/resources/microarray/AtGenExpress/) [24].
3.4.1 Finding an Expression Pattern by Searching for Plant Ontology Annotations
Following the model of Gene Ontology, the Plant Ontology
Consortium (POC; http://www.plantontology.org/) has developed an ontology of controlled vocabulary terms for plant structure as well as growth and developmental stages [18]. Examples of
plant structure terms are leaf, leaf stomatal complex, phyllome vascular system, etc. Examples of growth and developmental stages
terms are seedling shoot emergence stage, late rosette growth, etc.
In TAIR, PO terms are used to annotate gene expression data from low-throughput experiments such as Northern blot and reporter assays
as well as high-throughput data from DNA microarray experiments
and proteomics studies. PO annotations are displayed on the locus
detail page along with GO annotations in the Annotations section.
To retrieve PO annotations for a set of genes, use the Plant
Ontology Annotations Search tool accessible from the Search
drop-down menu (http://www.arabidopsis.org/tools/bulk/po/index.jsp).
To find genes co-expressed in the same tissue or developmental stage, use the Keyword Search tool described previously
(Subheading 3.3.2) by simply replacing a GO term with a PO
term. TAIR’s whole-genome PO annotation file is available for
download from its ftp site
(ftp://ftp.arabidopsis.org/home/tair/Ontologies/Plant_Ontology/).
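With the PO annotations in hand, finding genes expressed in a given structure and at a given stage is a matter of intersecting annotation sets. The sketch below assumes the user has already extracted (locus, PO term) pairs from an annotation file; the function name is hypothetical, and the sketch matches term names exactly without following parent–child relationships between PO terms.

```python
def loci_sharing_terms(po_annotations, required_terms):
    """Return sorted loci annotated to every term in required_terms.

    po_annotations is an iterable of (locus, PO term) pairs.
    """
    terms_by_locus = {}
    for locus, term in po_annotations:
        terms_by_locus.setdefault(locus, set()).add(term)
    required = set(required_terms)
    # Keep a locus only if its term set contains all required terms.
    return sorted(
        locus for locus, terms in terms_by_locus.items() if required <= terms
    )
```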
3.4.2 Finding DNA Microarray Data
DNA microarrays are one of the most powerful tools for
investigating the expression pattern of thousands of genes in parallel, and microarray experiments are now commonly performed in
many Arabidopsis laboratories. A vast amount of DNA microarray
data has been generated, either through coordinated community
efforts such as the AtGenExpress project
(http://www.arabidopsis.org/portals/expression/microarray/ATGenExpress.jsp) or as a
result of individual research projects carried out by numerous laboratories. Arabidopsis microarray data can be found in several public
repositories. TAIR provides access to experimental results from
both cDNA- and Affymetrix-based platforms of microarray data
that TAIR received before June 2005. Newer and more comprehensive data can be found in NASCArrays
(http://affy.arabidopsis.info/narrays/experimentbrowse.pl) [16],
ArrayExpress (http://www.ebi.ac.uk/arrayexpress/) [25], and GEO
(http://www.ncbi.nlm.nih.gov/geo/) [26]. The emphasis of these public
databases with microarray data is to provide long-term storage and
access to publicly available data. There are many other academic
and commercial groups that have focused on developing advanced
analysis tools for mining microarray datasets. Notable examples
include Genevestigator (https://www.genevestigator.com) [27]
and eFP Browser (http://bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi) [28].
These tools will be covered in Chapter 5. This section
shows how to use the TAIR microarray database to find expression
profiles of a gene or genes in specific experiments.
1. Start at the TAIR Microarray Expression Search
(http://www.arabidopsis.org/servlets/Search?action=new_search&type=expression).
This search can be used to find expression data for
up to 100 genes using gene names, locus identifiers, microarray element names, or GenBank accession numbers.
2. Choose the default “Locus” from the Search by Name or
GenBank Accession drop-down menu and enter At5g01810 in
the query text box.
3. Choose Array Type/Design. This feature allows the search to
be restricted to a specific type of arrays. Choose the default
option (Affymetrix GeneChips, any design).
4. Limit Search by Expression Values (optional). This option allows
one to adjust expression value parameters for either Affymetrix or
cDNA arrays. Since this example involves Affymetrix data, use
the parameter selections for this type of array. In the Detection
section, choose Present, which will only include data from
hybridizations where the transcript was detected.
5. Limit search by Experiment Parameters (optional). This is an
advanced option to restrict a search to only certain experiments. If no limits are imposed, all the experiments in the database are searched.
6. Select the output options (optional).
7. Submit the query. The summarized results include array name
(locus identifier), information about the experiment design
(Experiment Name, Sample Variables), and specific data for
each experiment. Click on the links to go to respective detail
pages. The results can be downloaded in text format by clicking
the check boxes for the records of interest and then clicking
Download Checked.
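Because the search accepts at most 100 genes per query (step 1), longer gene lists have to be submitted in several rounds. The helper below is a small sketch for preparing such batches; the function name is hypothetical, and it simply normalizes case, drops duplicates, and splits the list, leaving the actual submission to the user.

```python
def batches(identifiers, size=100):
    """Split identifiers into de-duplicated batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    seen, unique = set(), []
    for identifier in identifiers:
        key = identifier.strip().upper()
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    # Slice the de-duplicated list into consecutive chunks.
    return [unique[i:i + size] for i in range(0, len(unique), size)]
```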
3.5 Obtaining Information About Metabolism in Arabidopsis
There are a number of different databases that focus on providing
information related to metabolism and metabolites in Arabidopsis
including AraCyc and PlantCyc from the Plant Metabolic Network
(PMN) [29], Arabidopsis Reactome [30], KEGG [31], KaPPA-View4 [32], MetNet [33], MetaCrop [34], and KNApSAcK [35].
Although these resources may each offer specific benefits and their
combined use might be ideal for optimal data analysis, based on
the historical and ongoing connection between TAIR and Pathway
Tools/PMN databases, this section will only describe how to access
information from PMN databases with a focus on PlantCyc and
AraCyc. PlantCyc can house biochemical data for all plant species,
whereas AraCyc serves as a metabolic encyclopedia of Arabidopsis
[29, 36–38]. Both databases provide information about genes,
enzymes, compounds, reactions, and pathways that can have
experimental and/or computational support. Semiannual releases,
including the latest in July 2013, continue to improve upon the
depth, breadth, accuracy, and coverage of these resources. In many
cases, links are provided to connect these items found in the PMN
to outside metabolism resources such as KEGG, BRENDA,
ChEBI, and PubChem, as well as to more general databases such
as TAIR, Phytozome, and UniProt.
3.5.1 Finding Information About Metabolic Pathways by Name
Although plant metabolism can only be completely described
through an extremely dense and highly interconnected metabolic
web, many scientists want to search for “pathways” that describe a
comprehensible subset of connected reactions. These can be found
in AraCyc from TAIR or in AraCyc or PlantCyc directly through
the PMN.
1. From any TAIR page, enter the common name of a pathway
(e.g., chlorophyll biosynthesis) or a prominent compound
expected to be in the pathway (e.g., ascorbate) in the Quick
Search tool in the header. Select “Metabolic Pathways” from
the drop-down menu of search types and click the “Search”
button. The search by default is a “contains” search, so, on the
results page, all pathways, enzymes, reactions, and compounds
associated with the input keyword will be retrieved. In the case
of ascorbate, four different pathways are retrieved.
2. The same search can be performed from within the Plant
Metabolic Network (www.plantcyc.org). From any PMN page,
enter the search term (e.g., “ascorbate”) in the Quick Search bar
in the header, select the database to query, and click on “Quick
Search” or “Search” (see Note 8). Again the default search will
return all entries in the database that “contain” the term including
pathways, enzymes, compounds, and/or reactions.
3. To learn more about a specific pathway, click on its name in the
search results. This opens a page that provides a diagrammatic
representation of the pathway, evidence code(s), taxonomic
information, a curator-written summary, literature references,
and more. When a pathway page is initially opened, an overview diagram that lacks detailed information about enzyme
identities, chemical structure, etc., may be shown. Click on the
“More Detail” button one or more times to display the pathway with increasing amounts of information. When enzyme
names appear (in gold), they are shown in bold type if they are supported by experimental evidence or in regular type if they
are supported by computational predictions. Each item on the
pathway can be clicked on to open another page with more
information, such as an “enzyme detail page.”
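The difference between the default “contains” search used in steps 1 and 2 and an exact name match (see Note 2) can be sketched in a few lines of Python. This is only an illustration of the matching logic, not TAIR or PMN code; the function name and the short name list are assumptions made for the example.

```python
def name_search(names, query, exact=False):
    """Return the names that match the query, ignoring case.

    exact=False mimics a "contains" search; exact=True mimics an exact name search.
    """
    needle = query.lower()
    if exact:
        return [name for name in names if name.lower() == needle]
    return [name for name in names if needle in name.lower()]


# Illustrative list: ABA1 and ATRABA1A are the names used in Note 2;
# ABI1 is added here only as a name that should not match.
gene_names = ["ABA1", "ATRABA1A", "ABI1"]

print(name_search(gene_names, "aba1"))              # "contains" search
print(name_search(gene_names, "aba1", exact=True))  # exact name search
```

A “contains” search is the safer default when the exact name is uncertain, at the cost of returning more results to sift through.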
3.5.2 Finding Information About Metabolic Pathways Based on Pathway Properties
To find a specific pathway or group of pathways that cannot be
identified solely by name, at least four additional search strategies
are available.
1. The Pathway Search page enables a user to select one or more
pathways based on a variety of criteria. To access it from any
page, expand the “Search” drop-down menu and choose
“Pathways” (http://pmn.plantcyc.org/pwy-search.shtml).
On the resulting page, nine gray headers describe the
types of filtering criteria available. To apply a filter, click on
the small box to the left of its header (e.g., “Search/Filter by
number of reactions”) and enter the desired
restrictions. Multiple criteria can be combined before clicking
on the “Submit Query” button.
2. The Advanced Search page gives users even more power to
generate detailed requests. To access it from any page, expand
the “Search” drop-down menu and choose “Advanced Query”
(http://pmn.plantcyc.org/query.shtml). Several steps must
be taken to construct a query in Section 1 of the page, beginning with choosing the appropriate database to search. Multiple
“conditions” may be included in the search using the “add a
condition” button and may be connected through Boolean
operators. Once the request has been formulated, select the
columns of data to output and choose a column to sort by in
Section 2 of the page. Specify the output format (HTML or
tab-delimited) in Section 3 and then click on “Submit Query” to
initiate the search. It should be noted that a familiarity with the
underlying structure of the Pathway Tools database facilitates
the use of this search tool.
3. Pathways can also be identified based on their membership
in a particular class, such as “Amino Acids Biosynthesis,” by
using the Pathway Ontology Browser. To access it from any
page, expand the “Search” drop-down menu and choose
“Browse Ontologies” and then “Pathway Ontology.” In the
resulting page, navigate through the ontology by clicking on any
plus sign to expand a category and any minus sign to contract it.
4. Experimental data can also be used to highlight specific pathways that may be of interest to a user. Briefly, quantitative or
qualitative results from transcriptomic, proteomic, and metabolomic experiments can be projected onto the entire Arabidopsis
metabolic map using the “Metabolic Map/Omics Viewer”
present under the “Tools” menu. A tutorial for this procedure
is available at the PMN and has been described in previous
publications [15].
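Tab-delimited output from the Advanced Search (step 2) lends itself to further filtering in a script. The following Python sketch assumes a hypothetical export with the column headers pathway_name and reaction_count; the actual columns depend on what was selected in Section 2 of the query page, and the rows shown are placeholders rather than real PMN data.

```python
import csv
import io

# Placeholder export: the column names and all rows are hypothetical.
EXAMPLE_EXPORT = (
    "pathway_name\treaction_count\n"
    "example pathway I\t4\n"
    "example pathway II\t9\n"
    "another example route\t12\n"
)


def filter_pathways(tsv_text, min_reactions=0, name_contains=""):
    """Return pathway names with at least min_reactions reactions
    whose name contains name_contains (case-insensitive)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    needle = name_contains.lower()
    selected = []
    for row in reader:
        enough = int(row["reaction_count"]) >= min_reactions
        if enough and needle in row["pathway_name"].lower():
            selected.append(row["pathway_name"])
    return selected


print(filter_pathways(EXAMPLE_EXPORT, min_reactions=5))
print(filter_pathways(EXAMPLE_EXPORT, min_reactions=5, name_contains="pathway"))
```

Combining both conditions in one pass corresponds to joining two criteria with a Boolean AND on the query page.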
3.6 Finding and Ordering Seed Resources from the Arabidopsis Biological Resource Center
The ABRC provides access to thousands of seed stocks which can
be identified through a number of different search strategies at
TAIR. Queries can be entered into the quick search bar in the
header, or using the advanced Seed/Germplasm search, located on
the ABRC Stocks drop-down menu on the TAIR navigation bar.
The quick search allows searching by germplasm or polymorphism
name or seed stock number. In addition to this, the advanced
Seed/Germplasm search allows searching by germplasm/seed
stock-associated information such as donor name, gene name,
allele name, and phenotype. Searches can be limited by species, by
germplasm type, and by a range of other attributes including
genetic background, mutagen, and genotype. A specific search for
ecotypes allows searching for natural variants of A. thaliana and
related species by donor or germplasm attributes. The search can be
limited by location and habitat. Search results pages for both germplasm and ecotype searches include check boxes for ordering and
links to detail pages. Detail pages also contain links to other relevant
information, for example, to clone detail pages for transgenic
germplasm and to community detail pages for donors.
Stock-browsing functions are also supported by ABRC’s catalog
that can be accessed from the ABRC Stocks drop-down menu in
the navigation bar available on most TAIR pages. Seed stocks in
the catalog are divided into eight categories and include a range of
different types of mutants, mapping lines, transgenic and RNAi
lines, natural accessions, and seeds from other closely related species. Some sections link to detail pages with check boxes for ordering. Other sections link to summary pages describing available
resources in that category with tips for finding them through
advanced searches.
Arabidopsis seed stocks with associated sequence information,
such as flank-sequenced insertion lines, are fully integrated into the
TAIR database and can be found by searching with AGI locus
identifiers in TAIR’s GBrowse genome viewer. GBrowse is
accessible from the navigation bar under “Tools.” Locus-associated
polymorphisms are displayed on the T-DNAs/Transposons and
Polymorphisms tracks. Clicking on a polymorphism on the viewer
links out to the polymorphism detail page where the corresponding
germplasm/stock can be found and ordered. Germplasm names/
stock numbers are also displayed on locus detail pages with check
boxes for ordering. Stock numbers and germplasm names link to
germplasm detail pages where specific information and an ordering
button are displayed.
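When working with lists of loci outside the browser, it can help to check that each identifier looks like an AGI locus code before using it in a search. The Python sketch below assumes the widely used AGI convention of “AT”, a chromosome code (1–5, C for chloroplast, or M for mitochondrion), “G”, and five digits. The function name is an assumption, and the pattern deliberately does not cover gene model suffixes (such as “.1”) or other identifier types.

```python
import re

# Assumed AGI locus convention: AT + chromosome (1-5, C, M) + G + five digits.
AGI_LOCUS = re.compile(r"AT([1-5CM])G(\d{5})", re.IGNORECASE)


def parse_agi(identifier):
    """Return the chromosome code and serial number of an AGI locus
    identifier, or None if the string does not fit the assumed pattern."""
    match = AGI_LOCUS.fullmatch(identifier.strip())
    if match is None:
        return None
    return {"chromosome": match.group(1).upper(), "serial": match.group(2)}


print(parse_agi("AT3G24560"))    # locus mentioned in Note 5
print(parse_agi("SALK_099519"))  # a stock name, not a locus identifier
```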
Individuals can access their own order history and invoices
from their personal home page when logged in to the TAIR web
site. Other TAIR users cannot access an individual’s complete order
history, but the order history for a specific stock can be accessed
through a link on the germplasm detail page for that stock.
In addition to TAIR’s Seed/Germplasm Search, the T-DNA
Express (http://signal.salk.edu/cgi-bin/tdnaexpress) developed by
the Salk Institute Genomic Analysis Laboratory (SIGnAL) is another
popular tool that helps users to find mutant resources associated
with specific loci or chromosome locations [3]. T-DNA Express
provides links that take users directly to the ABRC, NASC, or
another appropriate stock center to order these resources. In a reciprocal manner,
TAIR provides direct links to this tool from the External Links
section of its Locus Detail page (see Subheading 3.1.2).
3.7 Finding and Ordering Other (Non-seed) Resources from the Arabidopsis Biological Resource Center
Arabidopsis clone information is fully integrated into the TAIR
database. For sequenced clones, links to clone detail pages can be
accessed from TAIR’s GBrowse genome viewer and from Locus
detail pages. Clone detail pages contain a link to a stock detail page
where information such as price, special handling, and other
stock-specific data can be found.
Clones and all other non-seed stocks can also be found through
the TAIR quick search, but it is necessary to provide some name
information, such as stock number or clone name. Advanced search
options for these resources are provided by the TAIR DNA search
(http://www.arabidopsis.org/servlets/Search?action=new_search&type=dna).
Drop-down menus allow selection of the type
of resource sought (e.g., vector, clone, or host strain), the species,
and the type of information supplied (e.g., name, AGI, or stock
number). A wide range of features to restrict the search are also
available. Results pages from the search provide check boxes for
ordering stocks, links to clone, vector and/or stock detail pages, as
well as links out to NCBI for sequence information if available.
Detail pages provide check boxes for ordering and links out to publications, images, and external web pages with information relevant
to the stocks. The order history for a specific stock can be accessed
through a link on the stock detail page.
DNA stocks can be found by browsing the ABRC catalog.
They are divided into five categories, including libraries, clones,
vectors, and host strains. The catalog provides links to detail pages
with check boxes for ordering or to summary pages describing
available resources in that category with tips for finding them
through advanced searches. Access to other non-seed resources,
including protein chips, cell cultures, and educational resources, is
also provided by the catalog. More details about educational
resources developed by the ABRC can be obtained from the ABRC
outreach portal at http://abrcoutreach.osu.edu.
3.8 Searching Literature Databases
Researchers have published a wealth of data about all aspects of
Arabidopsis physiology, biochemistry, and development. Databases
such as PubMed, Agricola, and BIOSIS index articles from a wide
variety of journals and can be used to find citations and articles in
electronic or print format.
The National Center for Biotechnology Information’s (NCBI)
PubMed (http://www.ncbi.nlm.nih.gov/pubmed/) is the primary
database for life-science literature. At the end of 2011 the number
of Arabidopsis publications in PubMed totaled over 36,000.
PubMed has a powerful search interface and links to the rest of the
databases within the NCBI system, such as sequence and expression
databases. PubMed records are linked to publishers’ sites for access
to the full text of the article. For help using the resource refer to
the PubMed tutorial (http://www.nlm.nih.gov/bsd/disted/pubmed.html).
TAIR compiles bibliographic records about Arabidopsis from
PubMed, BIOSIS, and Agricola. In addition, TAIR includes publications not found in these databases, such as abstracts from the
International Conference on Arabidopsis Research, defunct
Arabidopsis electronic journals (The Arabidopsis Information
Service and Weeds World), books, and dissertations. The following
sections describe how to find Arabidopsis publications in PubMed
and TAIR.
3.8.1 Finding Articles in the NCBI PubMed Database
1. Start at the PubMed search page (http://www.ncbi.nlm.nih.gov/pubmed/).
2. Enter the desired term(s) in the text input box. Searches can be
restricted using the Boolean operators AND, OR, and NOT to
combine terms. To search for a phrase, it must be enclosed in
quotes (e.g., “transcription regulation”) or with a special flag
“[tw]” (e.g., “transcription factor [tw]”). Use wild-card characters (*) for inexact matching. For example, to find all the
articles about all the Agamous-like genes, type in “AGL*.” For
more refined searching, use the advanced search page
(http://www.ncbi.nlm.nih.gov/pubmed/advanced). The Search
Builder allows users to build complex queries.
3. Finding the article text and saving relevant citations: The default
display format is a summary of the citation. The complete citation, including available abstracts, can be viewed by clicking on
the titles. Articles that are available online are linked to the
publisher’s web sites, which may be freely accessible or require
a subscription. To modify the display of results, select the
appropriate option from the display menu. For example, to
import a citation into reference management software, choose
MEDLINE format. References can be saved into a file for
downloading or sent to an e-mail address. After selecting the
articles by clicking on the checkboxes alongside the citations,
choose the desired option under the “Send to” menu and click
on the “Send to” button.
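The query syntax in step 2 can also be assembled programmatically. The Python sketch below combines terms with AND, OR, and NOT, quotes multiword phrases, and builds a URL for NCBI’s E-utilities ESearch service. E-utilities is not covered in this chapter and is used here only as an assumption about how one might script the search; the helper names are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (an addition, not described in the chapter).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def quote_phrase(term):
    """Wrap multiword terms in double quotes so they are searched as a phrase."""
    term = term.strip()
    if " " in term and not term.startswith('"'):
        return f'"{term}"'
    return term


def build_query(all_of=(), any_of=(), none_of=()):
    """Combine terms with the Boolean operators AND, OR, and NOT."""
    parts = []
    if all_of:
        parts.append(" AND ".join(quote_phrase(t) for t in all_of))
    if any_of:
        parts.append("(" + " OR ".join(quote_phrase(t) for t in any_of) + ")")
    query = " AND ".join(parts)
    for term in none_of:
        query += " NOT " + quote_phrase(term)
    return query.strip()


def esearch_url(query, retmax=20):
    """Build (but do not send) an ESearch request URL for PubMed."""
    return ESEARCH_URL + "?" + urlencode({"db": "pubmed", "term": query, "retmax": retmax})


query = build_query(
    all_of=["Arabidopsis", "transcription regulation"],
    any_of=["AGL*", "agamous"],  # wild-card term from step 2 plus a plain keyword
    none_of=["review"],
)
print(query)
print(esearch_url(query))
```

The same query string can be pasted directly into the PubMed text input box, so the sketch is also a convenient way to keep complex searches reproducible.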
3.8.2 Finding Arabidopsis Publications Using TAIR’s Publication Search
1. On any TAIR page with a top navigation bar, select “Publication”
from the drop-down menu under Search
(http://arabidopsis.org/servlets/Search?action=new_search&type=publication).
2. To search with a specific author’s name or phrase, enter the
desired terms in the text query boxes and choose the field to
search from the drop-down menu (abstract, author, journal/
book title, title, title/abstract, URL for electronic publications, journal, or PubMed ID). For example, to search for all
publications about oxidative stress, type the phrase into the
text box and select “Title/Abstract” in the drop-down menu.
Unlike the PubMed search, quotes are not required; all text in
a single box is treated as a phrase. To restrict the search by
publication dates or publication type, fill in the corresponding
boxes.
3. Click on the “submit” button to start the search. The results
are displayed in a summary format including the title, journal,
authors, and year. The title is hyperlinked to a page containing
the complete citation, links to authors’ TAIR profiles, the
abstract, if available, and a list of associated keywords and
genes. For articles with a PubMed ID, a link to the PubMed
database is also provided.
3.8.3 Searching Full-Text Arabidopsis Literature Using Textpresso
Textpresso is an information extracting and processing package for
biological literature [39]. Textpresso for Arabidopsis allows users
to search over 40,000 abstracts and 27,000 full-text publications in
TAIR as of August 2011.
1. To use this tool, go to http://www.arabidopsis.org/ and
select “Textpresso Full Text” from the Tools drop-down list.
Alternatively go to http://www.textpresso.org/arabidopsis/.
2. Enter the search term in the Keywords query box. Textpresso
is extremely useful for tracking down specific information like
the mutation sites in certain alleles. For example, enter
SALK_099519, and click “Search.” Sentences that contain the
matching keyword are displayed together with bibliographic
information so that users can quickly confirm the usefulness
of a particular paper and link directly to the full text, if they
have an appropriate subscription to the journal in question.
At the Textpresso site, searches can be narrowed by searching
in specific keyword categories (mouse over “List >”) including
Arabidopsis gene names, Gene Ontology and Plant Ontology
(terms), or a combination of keywords. Advanced search
options are described in the User Guide accessible from the
top navigation bar.
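The sentence-level retrieval described in step 2 can be pictured with a short Python sketch. It is not Textpresso’s implementation, which is an ontology-based retrieval system [39]; it simply splits a text into sentences and keeps those containing a keyword. The function name is an assumption, and the sample text is invented for illustration.

```python
import re


def sentences_with_keyword(text, keyword):
    """Split text into sentences and return those containing the keyword
    (case-insensitive). Sentence boundaries are approximated by ., !, or ?
    followed by whitespace."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    needle = keyword.lower()
    return [sentence for sentence in sentences if needle in sentence.lower()]


# Invented sample text; it does not come from any publication.
SAMPLE_TEXT = (
    "Seeds of the SALK_099519 line were obtained from a stock center. "
    "Plants were grown under long days. "
    "The insertion site in SALK_099519 was confirmed by sequencing."
)

for hit in sentences_with_keyword(SAMPLE_TEXT, "SALK_099519"):
    print(hit)
```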
3.9 Submitting Your Data or DNA/Seed Stocks
Funding agencies such as the National Science Foundation (NSF)
have invested heavily in the development of community resources
such as biological databases and stock centers. These resources play
a crucial role in driving research forward by providing access to
data and research materials. The long-term sustainability of such
resources depends upon contributions by the research community.
In an age when data influx has outstripped the organizational ability of the staff of any one database, it is essential to involve the
research community in the data collection and curation process.
It is important that researchers share their findings not only
through publication but also by contributing their data directly to
scientific databases. This section describes how to submit your data
and/or DNA/seed stocks to various databases.
3.9.1 Submitting Data to TAIR
TAIR accepts a wide range of data types including gene function,
structure, interaction partners, expression patterns, markers, phenotypes, and several others. Instructions for data submission are
available on the Submit Overview page
(http://www.arabidopsis.org/submit/index.jsp), accessible from the Submit drop-down
menu in the top navigation bar.
TAIR provides several ways for researchers to submit their
data. For gene function data submission, the use of the online
submission tool (http://www.arabidopsis.org/doc/submit/functional_annotation/123) is encouraged. This tool requires the submitting user to log into the TAIR system with a registered user ID,
which provides an automatic provenance for the submitted annotations. Reference information (PubMed ID or DOI identifier) is
also required. The use of DOIs allows a user to submit annotations
before public release of the manuscript; however, the annotations
are only released from TAIR upon publication of the corresponding article.
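Because the online tool requires a PubMed ID or DOI for each annotation, a submitter preparing many annotations may want to check the reference identifiers beforehand. The following Python sketch is an assumption about how such a check could look and is not part of the TAIR tool: it treats an all-digit string as a PubMed ID and a string of the form “10.”, a numeric registrant code, “/”, and a suffix as a DOI. The PubMed ID in the example is a made-up placeholder.

```python
import re

# Loose DOI shape: "10." + 4-9 digit registrant code + "/" + non-empty suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def classify_reference(identifier):
    """Classify a reference identifier as 'pubmed_id', 'doi', or 'unknown'."""
    value = identifier.strip()
    if value.lower().startswith("doi:"):
        value = value[4:].strip()
    if value.isascii() and value.isdigit():
        return "pubmed_id"
    if DOI_PATTERN.fullmatch(value):
        return "doi"
    return "unknown"


print(classify_reference("doi:10.1093/nar/gkr1090"))  # DOI cited in reference 8
print(classify_reference("12345678"))                 # placeholder PubMed ID
print(classify_reference("AT3G24560"))                # a locus, not a reference
```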
Users can also prepare various types of data for submission
formatted according to the guidelines listed on the Submission
Overview page or download and use the preformatted Excel
spreadsheets available there [15]. Data can then be submitted to
TAIR by e-mail to curator@arabidopsis.org. In addition, each data
detail page contains a Comments section; registered TAIR users
can submit comments by clicking on the “Add My Comment”
button. Comments submitted are immediately displayed in the
Comments section of the detail page.
For corrections to existing data, users may contact TAIR by
e-mail to curator@arabidopsis.org.
3.9.2 Submitting Data to the PMN
The Plant Metabolic Network is eager to receive data submissions
of published findings related to pathways, enzymes, reactions, or
compounds found in plants. To help researchers submit these data
types, three Excel forms and simple instructions are provided on
the Data Submission page (http://www.plantcyc.org/feedback/
data_submission.faces). This can be accessed from the “Submit
Data” heading on the menu bar. Submitters are encouraged to
enter the data on the forms, save them locally, and then send them
to the PMN. The forms may be e-mailed or may be uploaded and
submitted via the Feedback Form (http://www.plantcyc.org/feedback/feedback_form.faces) that can also be found on the “Submit
Data” menu. Although thoroughness on the forms is appreciated,
incomplete forms are always accepted. In addition, supporting materials, such as .gif files that depict pathway layouts or .mol files that
provide compound structures, can also be submitted. The PMN also
welcomes experts to volunteer to help review particular domains of
metabolism to check for completeness and accuracy.
Feedback and corrections concerning data found in the PMN
can be submitted using the Feedback Form or through a direct
e-mail to curator@plantcyc.org.
3.9.3 Donating Seed and DNA Stocks to the ABRC
The ABRC accepts all Arabidopsis seed resources and is particularly
interested in receiving confirmed insertion mutants, characterized
mutants, transgenic lines, and cDNA/ORF clones. For other types
of resources, it is necessary to contact the stock center in advance
to ensure that the resource can be accommodated. All seed
resources are shared with NASC after propagation at the ABRC or
immediately if enough seed is supplied. Other resources may also
be shared with NASC if requested by NASC customers. The ABRC
has developed stock donation forms to collect data associated with
stock donations. This data is curated by ABRC staff and uploaded
to TAIR within a month of receiving the material. Donated stocks
are made available for distribution either when the related
data is uploaded or upon amplification. Although it is preferable
that donors fill out ABRC donation forms, a simple donation form
is available for published resources, and data in other formats is also
accepted, particularly for large collections of stocks. Links for
downloading ABRC donation forms are available from the ABRC
Stocks drop-down menu. A donation form for a contribution of
educational materials for high school and undergraduate-level
classes has recently been developed and is available upon request.
4 Notes
1. Classical mutants are mostly characterized and published
mutants derived from forward genetic screens utilizing
populations generated with various mutagens (X-rays, fast
neutrons, ethyl methanesulfonate or EMS, Agrobacterium
transformation, etc.).
2. The quick search performs a name search for most of the
objects in the TAIR database (e.g., Genes, Clones, ESTs or
BAC ends, People/Labs, Polymorphisms/Alleles, Germplasms,
Ecotypes, Keywords, Genetic Markers, Proteins, Seed and
DNA Stocks, and Vectors). By default, this is a “contains”
search (a search for aba1 retrieves both ABA1 and ATRABA1A).
This search is not limited to the name field. For example, when
performing a quick search for “Gene,” the gene description
and keywords fields will be searched as well as the name.
This is to avoid missing any potentially relevant results, but
sometimes too many results are returned. To perform an exact
name search, choose the “exact name search” option from the
drop-down menu to the right of the search box. This option
will only search the name field for all the data types listed in the
drop-down menu [15].
3. The computational description contains the gene’s full name,
Gene Ontology and Plant Ontology terms, best BLAST-identified A. thaliana protein match, and the number of
protein BLAST hits in other species (NCBI BLink) [15]. A
computational description is only shown if the locus has not
yet been curated manually. Users are especially welcome to
submit suggested gene descriptions for loci that only have a
computational description.
4. TIGR, now the J. Craig Venter Institute (http://www.jcvi.org/),
no longer actively produces GO annotations for
Arabidopsis genes, but past TIGR annotations are still stored
in TAIR.
5. GO annotations can be divided into two broad categories: (1)
annotations based on experimental data including results from
low- and high-throughput experiments (e.g., DNA microarray
and proteomics studies) and (2) computationally predicted
annotations. Computational annotations are based on an in
silico analysis of the gene product sequence and/or other data
as described in the cited reference and may or may not be individually reviewed by a curator. For example, TAIR uses a combination of InterProScan and the InterPro2GO mapping file to
create GO annotations for proteins based on the presence of
domains with mapped GO terms [8]. Such annotations are not
reviewed on an individual basis by a curator. Alternatively,
annotations can be made by a curator on an individual basis by
examining relevant computational analyses (e.g., sequence
alignment, protein family information). Computational
annotations provide a basis for forming testable hypotheses,
particularly for genes with little experimental data.
For example, AT3G24560 (RASPBERRY 3) is annotated to
the GO term “ligase activity, forming carbon–nitrogen bonds”
based on an InterPro domain scan. A researcher can then
design an experiment to test whether indeed this protein has
ligase activity. The GO Consortium has developed a set of evidence codes to indicate how an annotation to a particular term
is supported. In order to correctly interpret a GO annotation,
it is essential to review the evidence code together with the GO
term. For a complete list of evidence codes currently in use, go to
http://www.geneontology.org/GO.evidence.shtml. In TAIR,
annotations also include an evidence description. For example,
an annotation with the evidence code “inferred from mutant
phenotype” (IMP) may be further specified by including an
evidence description “RNAi experiments.” Since more than
one gene may be affected by RNA interference, the GO
annotation should be viewed with the understanding that the
phenotype may be due to the loss of function of more than one
homologous locus. An in-depth discussion on how to avoid
the common misuse of GO is available [40].
6. Many of the GO terms exist as complex phrases. TAIR searches
treat the entire entered term or phrase as a complete phrase
rather than a set of words. Consequently, an “exact match”
search will often not retrieve any entries. Therefore, using the
“contains” option for keyword searches is recommended [15].
7. Gene expression data historically and most properly refers to
the expression of gene transcripts; however, the expression of
protein constructs and/or the analysis of proteomic experiments is also often grouped into this category.
8. The PMN offers a collection of PMN-generated pages
(www.plantcyc.org/…) and Pathway Tools-generated pages
(pmn.plantcyc.org/…) which have some differences, particularly in
the header. Most notably, a simple drop-down menu is used to
select a database to query via the Quick Search bar on PMN-generated pages, whereas the “change organism database” link
can be used to select a new database to query on all Pathway
Tools-generated pages.
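The two broad categories of GO annotations described in Note 5 can be expressed as a simple lookup on the evidence code. The Python sketch below uses a subset of the GO Consortium’s evidence codes, grouped to mirror the distinction drawn in Note 5. The grouping, the function names, and the placeholder annotation records are assumptions for illustration, and the current list of codes should be checked on the GO web site.

```python
# Subset of GO evidence codes, grouped as in Note 5 (an illustrative grouping).
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}
COMPUTATIONAL_CODES = {"ISS", "ISO", "ISA", "ISM", "IGC", "RCA", "IEA"}


def evidence_category(code):
    """Return 'experimental', 'computational', or 'other' for an evidence code."""
    code = code.strip().upper()
    if code in EXPERIMENTAL_CODES:
        return "experimental"
    if code in COMPUTATIONAL_CODES:
        return "computational"
    return "other"  # e.g., author statements (TAS, NAS) or codes not listed here


def experimental_only(annotations):
    """Keep (locus, GO term, evidence code) records backed by experimental evidence."""
    return [a for a in annotations if evidence_category(a[2]) == "experimental"]


# Placeholder records; loci and terms are hypothetical.
annotations = [
    ("LOCUS_A", "example GO term 1", "IMP"),
    ("LOCUS_B", "example GO term 2", "IEA"),
    ("LOCUS_C", "example GO term 3", "TAS"),
]

print(experimental_only(annotations))
```

Filtering on the evidence code in this way does not replace reading the evidence description; as Note 5 points out, an IMP annotation based on RNAi may still reflect the loss of more than one locus.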
Acknowledgements
This project was supported by the National Science Foundation
(grant numbers DBI-0850219, DBI-0640769, and IOS-1026003),
the National Institutes of Health National Human Genome
Research Institute (NIH-NHGRI) (grant number
5P41HG002273-09), and the TAIR sponsorship program
(http://www.arabidopsis.org/doc/about/tair_sponsors/413).
References
1. Arabidopsis Genome Initiative (2000) Analysis
of the genome sequence of the flowering plant
Arabidopsis thaliana. Nature 408:796–815
2. The Multinational Arabidopsis Steering
Committee (2011) The multinational coordinated Arabidopsis thaliana functional genomics project annual report 2011.
http://www.arabidopsis.org/portals/masc/2011_MASC_Report.pdf
3. Alonso JM, Stepanova AN, Leisse TJ et al
(2003) Genome-wide insertional mutagenesis
of Arabidopsis thaliana. Science 301:653–657
4. Garcia-Hernandez M, Berardini TZ, Chen G
et al (2002) TAIR: a resource for integrated
Arabidopsis data. Funct Integr Genomics
2:239–253
5. Huala E, Dickerman AW, Garcia-Hernandez
M et al (2001) The Arabidopsis Information
Resource (TAIR): a comprehensive database
and web-based information retrieval, analysis,
and visualization system for a model plant.
Nucleic Acids Res 29:102–105
6. Rhee SY, Beavis W, Berardini TZ et al (2003)
The Arabidopsis Information Resource
(TAIR): a model organism database providing
a centralized, curated gateway to Arabidopsis
biology, research materials and community.
Nucleic Acids Res 31:224–228
7. Swarbreck D, Wilks C, Lamesch P et al (2008)
The Arabidopsis Information Resource
(TAIR): gene structure and function annotation. Nucleic Acids Res 36:D1009–D1014
8. Lamesch P, Berardini TZ, Li D et al (2011) The
Arabidopsis Information Resource (TAIR):
improved gene annotation and new tools.
Nucleic Acids Res. doi:10.1093/nar/gkr1090
9. Meinke D, Scholl R (2003) The preservation
of plant genetic resources: experiences with
Arabidopsis. Plant Physiol 133:1046–1050
10. Heazlewood JL, Verboom RE, Tonti-Filippini J
et al (2007) SUBA: the Arabidopsis subcellular
database. Nucleic Acids Res 35:D213–D218
11. Lu Y, Savage LJ, Larson M et al (2011)
Chloroplast 2010: a database for large-scale
phenotypic screening of Arabidopsis mutants.
Plant Physiol 155:1589–1900
12. International Arabidopsis Informatics
Consortium (2010) An international bioinformatics infrastructure to underpin the Arabidopsis
community. Plant Cell 22:2530–2536
13. Samson F, Brunaud V, Balzergue S et al (2002)
FLAGdb/FST: a database of mapped flanking
insertion sites (FSTs) of Arabidopsis thaliana
T-DNA transformants. Nucleic Acids Res
30:94–97
14. Kleinboelting N, Huep G, Kloetgen A et al
(2011) GABI-Kat Simple Search: new features
of the Arabidopsis thaliana T-DNA mutant
database. Nucleic Acids Res. doi:10.1093/nar/gkr1047
15. Lamesch P, Dreher K, Swarbreck D et al
(2010) Using the Arabidopsis Information
Resource (TAIR) to find information about
Arabidopsis genes. Curr Protoc Bioinformatics.
Chapter 1:Unit1.11
16. Craigon DJ, James N, Okyere J, Higgins J et al
(2004) A repository for microarray data generated by NASC’s transcriptomics service.
Nucleic Acids Res 32:D575–D577
17. Ashburner M, Ball CA, Blake JA et al (2000)
Gene ontology: tool for the unification of
biology. The Gene Ontology Consortium.
Nat Genet 25:25–29
18. Jaiswal P, Avraham S, Ilic K et al (2005) Plant
ontology (PO): a controlled vocabulary of
plant structures and growth stages. Comp
Funct Genomics 6:388–397
19. Reference Genome Group of the Gene
Ontology Consortium (2009) The Gene
Ontology’s Reference Genome project: a unified framework for functional annotation across
species. PLoS Comput Biol 5:e1000431
20. Ort DR, Grennan AK (2008) Plant physiology
and TAIR partnership. Plant Physiol 146:
1022–1023
21. Cutler S, Ghassemian M, Bonetta D et al
(1996) A protein farnesyl transferase involved
in abscisic acid signal transduction in
Arabidopsis. Science 273:1239–1241
22. Yalovsky S, Kulukian A, Rodriguez-Concepcion
M et al (2000) Functional requirement of plant
farnesyltransferase during development in
Arabidopsis. Plant Cell 12:1267–1278
23. Ziegelhoffer EC, Medrano LJ, Meyerowitz
EM (2000) Cloning of the Arabidopsis
WIGGUM gene identifies a role for farnesylation in meristem development. Proc Natl
Acad Sci USA 97:7633–7638
24. Schmid M, Davison TS, Henz SR et al (2005)
A gene expression map of Arabidopsis thaliana
development. Nat Genet 37:501–506
25. Parkinson H, Sarkans U, Kolesnikov N et al
(2011) ArrayExpress update—an archive of
microarray and high-throughput sequencing-based functional genomics experiments.
Nucleic Acids Res 39:D1002–1004
26. Barrett T, Troup DB, Wilhite SE et al (2011)
NCBI GEO: archive for functional genomics
data sets—10 years on. Nucleic Acids Res
39:D1005–1010
27. Hruz T, Laule O, Szabo G et al (2008)
Genevestigator v3: a reference expression database for the meta-analysis of transcriptomes.
Adv Bioinformatics 2008:420747
28. Winter D, Vinegar B, Nahal H et al (2007)
An “electronic fluorescent pictograph” browser
for exploring and analyzing large-scale biological data sets. PLoS One 2:e718. doi:10.1371/journal.pone.0000718
29. Zhang P, Dreher K, Karthikeyan A et al (2010)
Creation of a genome-wide metabolic pathway
database for Populus trichocarpa using a new
approach for reconstruction and curation of
metabolic pathways for plants. Plant Physiol
153:1479–1491
30. Tsesmetzis N, Couchman M, Higgins J et al
(2008) Arabidopsis reactome: a foundation
knowledgebase for plant systems biology. Plant
Cell 20:1426–1436
31. Masoudi-Nejad A, Goto S, Endo TR et al
(2007) KEGG bioinformatics resource for
plant genomics research. Methods Mol Biol
406:437–458
32. Sakurai N, Ara T, Ogata Y et al (2011) KaPPA-View4: a metabolic pathway database for representation and analysis of correlation networks
of gene co-expression and metabolite co-accumulation and omics data. Nucleic Acids
Res 39:D677–684
33. Wurtele ES, Li L, Berleant D et al (2007)
MetNet: systems biology software for
Arabidopsis. In: Nikolau BJ, Wurtele ES (eds)
Concepts in plant metabolomics. Springer,
Berlin, pp 145–158
34. Grafahrend-Belau E, Weise S, Koschützki D
et al (2008) MetaCrop: a detailed database of
crop plant metabolism. Nucleic Acids Res
36:D954–958
35. Shinbo Y, Nakamura Y, Altaf-Ul-Amin M et al
(2006) KNApSAcK: a comprehensive species-metabolite relationship database. In: Saito K,
Dixon RA, Willmitzer L (eds) Plant metabolomics. Springer, Berlin, pp 165–181.
doi:10.1007/3-540-29782-0_13
36. Karp P, Paley S, Romero P (2002) The pathway
tools software. Bioinformatics 18:S225–S232
37. Mueller LA, Zhang P, Rhee SY (2003) AraCyc:
a biochemical pathway database for
Arabidopsis. Plant Physiol 132:453–460
38. Zhang P, Foerster H, Tissier C et al (2005)
MetaCyc and AraCyc: metabolic pathway
databases for plant research. Plant Physiol
138:27–37
39. Müller HM, Kenny EE, Sternberg PW (2004)
Textpresso: an ontology-based information
retrieval and extraction system for biological
literature. PLoS Biol 2:e309
40. Rhee SY, Wood V, Dolinski K et al (2008) Use
and misuse of the gene ontology annotations.
Nat Rev Genet 9:509–515
Chapter 5
Bioinformatic Tools in Arabidopsis Research
Miguel de Lucas, Nicholas J. Provart, and Siobhan M. Brady
Abstract
Bioinformatic tools are an increasingly important resource for Arabidopsis researchers. With them, it is
possible to rapidly query the large data sets covering genomes, transcriptomes, proteomes, epigenomes,
and other “omes” that have been generated in the past decade. Often these tools can be used to generate
quality hypotheses at the click of a mouse. In this chapter, we cover the use of bioinformatic tools for
examining gene expression and coexpression patterns, performing promoter analyses, looking for
functional classification enrichment for sets of genes, and investigating protein–protein interactions.
We also introduce bioinformatic tools that allow integration of data from several sources for improved
hypothesis generation.
Key words Transcriptomics, Bioinformatics, Proteomics, Protein–protein interactions, Coexpression,
Functional classification, Functional genomics, Promoter analysis, Subcellular localization
1 Introduction
Plant biology, like other areas of biology, has undergone a large
transformation in the past decade, driven by high-throughput
methods for data generation, especially in the areas of genome and
epigenome analysis, transcriptome and proteome profiling, determining protein–protein interactions, and metabolome determination. Many data sets have been generated, and while each individual
set has been of tremendous use to the plant biologist who created
it, in aggregate these publicly available data sets are also of great
value to plant biologists around the world for querying in the context of their biological questions. Obviously, such large data sets
cannot provide a complete understanding of a given biological
question, but they can be leveraged to help plan experiments or to
generate hypotheses in silico, which can be rapidly tested in the lab
with the wide range of molecular techniques and genetic resources
that have been developed over a similar time frame. This chapter
provides an overview of web-based tools for querying data sets
generated by researchers, often funded by the National Science
Foundation Arabidopsis 2010 project in the USA, whose objective
was to identify the functions of 25,000 genes in Arabidopsis by
2010 [1], and by the AtGenExpress Consortium, an international
effort to measure the Arabidopsis transcriptome under many
conditions and in different tissues.
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_5, © Springer Science+Business Media New York 2014
Here, we emphasize web-based tools that are well cited and
which tend to integrate data from several sources, for while many
researchers have set up project-based databases on websites, resources
that draw from many sources are often more useful to the typical
Arabidopsis researcher. We do not describe well-developed sequence
databases, as these are covered in a chapter by Eva Huala and
colleagues elsewhere in this volume of Arabidopsis Protocols. The SIGnAL
website at http://signal.salk.edu/ [2] and the TAIR website at
http://www.arabidopsis.org [3] are two very useful websites for exploring
sequences and identifying insertions, among their other uses.
Instead, we focus on tools for querying transcriptome
data sets, which are the most comprehensive of all of the large data
types, and highlight tools for querying these both in a directed way
and correlatively. Such tools can be very useful for narrowing down
the phenotypic search space or for providing leads on “novel” genes
associated with a given biological process, respectively. We also look
at several tools for exploring protein–protein interactions in
Arabidopsis and for performing promoter analyses. Tools for integrating different data types to improve function prediction are key
to extracting even more knowledge from these data sets, and two
such tools will also be covered. We use as an example the gene
ABSCISIC ACID INSENSITIVE 3, At3g24650 [4], as our “gene
of interest.” Although this gene is well known to be involved in seed
biology, we will hypothesize some more functions for it using the
tools described here, often at the cost of only a click of the mouse.
The programs and websites discussed in this chapter are listed in
Table 1 in Subheading 2. Two additional useful review articles in
the context of bioinformatic tools for hypothesis generation are by
Brady and Provart [5] and by Usadel and colleagues [6].
2 Materials
Materials used in this protocol are indicated in Tables 1 and 2
[7–21].
Additionally, we use a list of genes differentially regulated in a
LEC1 overexpressor line as outlined in [22].
3 Methods
3.1 Expression Analysis
Online expression analysis can be useful in place of performing
RNA blot analyses or constructing promoter–reporter fusions to
determine patterns of expression. For instance, imagine we had
identified the abi3 mutation by positional cloning and wanted to
know more about the gene’s biological function, and perhaps to get
guidance on where else to look for a phenotype. One of the first steps
would be to examine its expression pattern. Online tools such as the
eFP Browser or Genevestigator make this very easy, provided
the platform used for measuring the transcriptome is able to detect
the transcript for one’s gene of interest (see Note 1).
Table 1
Tools, URLs, and references
Methods                      Tool                             Web                                                                   Reference
Expression analysis          eFP Browser                      bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi                                [7]
                             Genevestigator                   www.genevestigator.com/gv/                                            [8]
Promoter analysis            Cistome                          www.bar.utoronto.ca/cistome/cgi-bin/BAR_Cistome.cgi
                             Athena                           www.bioinformatics2.wsu.edu/cgi-bin/Athena/cgi/home.pl                [9]
Coexpression tools           ATTED II                         atted.jp/                                                             [10]
                             Expression Angler                bar.utoronto.ca/ntools/cgi-bin/ntools_expression_angler.cgi           [11]
Functional classification    AgriGO                           bioinfo.cau.edu.cn/agriGO/                                            [12]
                             AmiGO                            amigo.geneontology.org/cgi-bin/amigo/term_enrichment                  [13]
                             Classification SuperViewer       bar.utoronto.ca/ntools/cgi-bin/ntools_classification_superviewer.cgi  [14]
Pathway visualization        AraCyc                           www.plantcyc.org/                                                     [15]
                             MapMan                           mapman.gabipd.org/web/guest/mapman-download                           [16]
Protein information          SUBA III                         suba.plantenergy.uwa.edu.au/                                          [17]
                             Cell eFP Browser                 bar.utoronto.ca/cell_efp/cgi-bin/cell_efp.cgi                         [7]
Protein–protein interaction  Arabidopsis Interactions Viewer  bar.utoronto.ca/interactions/                                         [18]
                             NBrowse                          www.arabidopsis.org/tools/nbrowse.jsp                                 [19]
Integrated tools             VirtualPlant                     virtualplant.bio.nyu.edu/cgi-bin/vpweb/                               [20]
                             GeneMania                        www.genemania.org/                                                    [21]
                             ePlant                           http://bar.utoronto.ca/eplant/
Table 2
ABI3 developmentally coexpressed genes
AT4G27160 AT4G27460 AT4G27150 AT1G80090 AT1G03890 AT3G62730 AT1G32560
AT2G33520 AT5G55240 AT3G44830 AT3G22640 AT5G50600 AT4G10020 AT2G38905
AT1G14950 AT5G54740 AT1G05510 AT3G54940 AT5G10140 AT5G24130 AT1G29680
AT4G27140 AT1G17810 AT5G01300 AT1G54860 AT2G41070 AT1G04560
AT2G23640 AT1G48130 AT5G01670 AT2G34315 AT5G57390 AT2G21490
AT2G02120 AT5G50360 AT3G18570 AT1G52690 AT1G27461 AT1G62710
AT4G26740 AT1G65090 AT2G02580 AT3G14360 AT5G60460 AT2G28490
AT5G24950 AT2G27380 AT1G73190 AT3G24650 AT4G16160 AT4G31830
3.1.1 eFP Browser
The eFP (“electronic fluorescent pictograph”) Browser at the
Bio-Analytic Resource for Plant Biology at http://bar.utoronto.ca
[7] provides easy access to 80.2 million expression measurements
from Arabidopsis thaliana, soybean (Glycine max), barrel medic
(Medicago truncatula), poplar (Populus trichocarpa), maize (Zea
mays), barley (Hordeum vulgare), and rice (Oryza sativa). Four-fifths
of the measurements were made using Arabidopsis samples.
Small pictographs are used to represent the experimental samples
and contexts from which the expression data were generated,
while differing expression levels within these samples are denoted
by a color scale.
1. Go to http://bar.utoronto.ca and select “Arabidopsis eFP
Browser” from the BAR’s homepage.
2. Enter your gene of interest’s AGI ID (see Note 2).
In our case, we enter “At3g24650” for the ABI3 gene into
the Primary Gene ID box. Click Go.
3. Figure 1 shows the output when querying the eFP Browser
using ABI3 in the default settings. The tissues that were sampled
by Schmid et al. [23] for their “gene expression map during
Arabidopsis development” and by Nakabayashi et al. [24] for
the dry and imbibed seed samples are depicted in a pictographic
manner. The higher the expression (expression meaning
steady-state mRNA levels) of ABI3 in a tissue, the redder
that tissue is colored. If there is little expression in a tissue, then it
is colored yellow.
4. By changing the data source, it is possible to explore other data
sets that have been annotated in this pictographic manner.
The eFP Browser also outputs where the expression of the
gene of interest is strongest (in this case, in the Seed Data
Source, not surprisingly, given ABI3’s known role there), but it
is also worthwhile to examine other data sources (see Note 3).
For instance, ABI3 also seems to be expressed in the vascular
tissue between the elongation and maturation zones of the
root. If it had not already been known [25] that ABI3 is
involved in root development, such an observation of expression in the root could guide us to look for phenotypes in the
roots of abi3 mutants more closely.
5. The Relative Mode option allows you to view expression of a
given gene in each sample relative to its expression in a control
sample and to ascertain whether the gene’s expression is above
or below this level. If it is above, a red color is used, and if it is
below, a blue color is used to color the tissue in question. For
the Developmental Map, this level has been computed as the
median level across all of the tissues displayed. The Relative
Mode is more useful in the case of “challenge” experiments,
where a hormone or chemical has been applied as part of the
experimental design. The control sample in this case would be
the mock treated or untreated control.
6. If a given gene does not map to an ATH1 probe set, then try
using the “Developmental Map At-TAX” Data Source. These
data were generated using a different platform, so it should be
possible to get an idea of where any gene is expressed using
this or the “Abiotic Stress At-TAX” Data Sources [26, 27].
Fig. 1 “Default” view of expression pattern of ABI3 (At3g24650) in Arabidopsis. Stronger expression is denoted
by a darker coloration. The interface provides many options for exploring the expression data, as shown in the
callout boxes (see Note 3). Callouts: Mode (allows viewing in Absolute, Relative, and Compare modes); Signal
Threshold (sets a maximum for the color scale); Linkouts to other tools; Data Source (to choose different
AtGenExpress projects and other projects, e.g., the Birnbaum et al. set); Expression Level distribution (shows
how the maximum level compares with all other genes, and the maximum level in any set); clickable tissues
(hyperlink to the NASCArrays, GEO, or literature record for the sample); Expression Level scale (red = higher
expression level); buttons to view a table or chart of expression values
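The Relative Mode described in step 5 can be sketched in a few lines of Python. This is only an illustration and not the eFP Browser’s own code: the tissue names and expression values are invented, and expressing “relative to the control” as a log2 ratio is an assumption made here for the example.

```python
import math
from statistics import median


def relative_expression(sample_levels, control_level):
    """Return log2(sample / control) for each sample.

    Positive values mean expression above the control level (red in a
    relative view); negative values mean expression below it (blue).
    """
    return {name: math.log2(level / control_level)
            for name, level in sample_levels.items()}


# Hypothetical absolute expression values for one gene (not real data).
levels = {"dry seed": 800.0, "root": 100.0, "rosette leaf": 25.0}

# For a developmental series, the control level is taken as the median
# across all displayed tissues, as described for the Developmental Map.
control = median(levels.values())
ratios = relative_expression(levels, control)

for tissue, ratio in ratios.items():
    color = "red" if ratio > 0 else "blue" if ratio < 0 else "neutral"
    print(f"{tissue}: log2 ratio = {ratio:+.2f} ({color})")
```

For a “challenge” experiment, the same function would be called with the mock-treated or untreated sample as `control_level` instead of the median.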
3.1.2 Genevestigator
Data from more than 8,000 ATH1 arrays are available for
Arabidopsis in the Genevestigator analysis tool (https://www.
genevestigator.com/gv/) [8]. As with the eFP Browser, the
different tools of this resource let us determine when and where
our gene of interest is expressed and in response to which conditions.
The main difference between the eFP Browser and
Genevestigator is that data are displayed in heat-map format as
opposed to a pictograph. One of the major advantages of this
tool is the simultaneous analysis of hundreds or thousands of
genes in a biological context, whereas the eFP Browser
permits a user to examine only one gene at a time (or two
genes in the Compare mode; see Note 4).
1. Go to https://www.genevestigator.com/gv/ and select “Plant
Biology.” Click Analysis Tool Start. Click Start under the Open
access version.
2. Click on Sample Selection, and click “new” to choose Arabidopsis as
the Organism and “ATH1: 22K array” as the platform. Alternatively,
one can select the AGRONOMICS whole-genome tiling array
or the AG: 8K array (note that no results will appear for
ABI3 on the whole-genome tiling array). Name your selection,
e.g., ABI3 study (see Note 5). Click OK.
3. In the Gene Selection tool we will introduce the AGI ID by
clicking on “new” (see Note 6). In our case, we enter the ABI3
AGI ID, “At3g24650.” Click OK.
4. The Condition Search tools give us gene expression data from
the different array sets (see Note 7). Filled dots indicate
p-values under 0.06 and unfilled dots indicate p-values over 0.06 (see
Note 8). Choose, for example, “Samples” to explore gene
expression on all the arrays available. To get the experimental
design and gene expression information, just move the mouse
over the sample name or the dot.
5. Click on the different tabs to explore the ontologies of anatomy,
genotypes, condition, and development. The expression of
ABI3 is high in the seed arrays, principally in the embryo and
endosperm, rather than in the seed coat. By genotype, ABI3 is
highly expressed in the pER8:LEC1 overexpression line and
repressed in lec1.1 plants; by contrast ABI3 has lower expression in the pif1/pif3/pif4/pif5 quadruple mutant plants. ABA
treatments promote its expression, as does the treatment with
paclobutrazol (GA inhibitor).
6. We can generate hypotheses from these data: phytochrome-mediated
light signalling and downstream factors regulate
ABI3 expression and LEC1 likely regulates ABI3 expression
either directly or indirectly.
3.2 Coexpression Tools
Coexpression analysis can leverage the large number of gene
expression data sets that have been generated in the past decade to
answer the question “which genes show patterns of expression similar
to that of my gene of interest, across all samples in a given database?”
Those that show similar patterns of expression may be involved in
the same biological process as the query gene, following the
“guilt-by-association” paradigm. The use of such analyses is well covered in
a recent review by Usadel and colleagues [6].
3.2.1 Expression Angler
Expression Angler [11] is a powerful yet easy-to-use tool for
identifying coexpressed genes, as measured by the Pearson correlation
coefficient (r), in both a condition-dependent and
condition-independent manner (see Note 9). With it, it is possible to answer
the question of which genes show similar patterns of expression in
nine different data sets; genes with an r-value greater than
around 0.75 can be considered coexpressed. It is also possible to
use just a subset of the samples within a given data set to perform
the analysis, which we will do below for ABI3. Coexpressed genes
annotated as “unknown function” or those with only vague descriptions
may be involved in the same process as the query gene.
1. Go to the Bio-Analytic Resource for Plant Biology’s homepage
at http://bar.utoronto.ca and select the Expression Angler link.
2. In normal use, select a data set and enter the AGI ID of interest. If we had used the AtGenExpress Tissue Set, which corresponds to the data set shown in Fig. 1, we would identify
many other seed maturation genes and ABA-responsive genes
being coexpressed with ABI3—the top 50 of these are listed in
Table 2. Another way to use Expression Angler, however, is to
define a subset of samples in which to search. Use the “Subselect
and Custom Bait Page” link, and then choose a data set. In this
case we will use the Root Compendium. On the input page,
we will enter At3g24650 and then select “Return just the top
50 hits” in only the “Spatiotemporal expression” experiment
[28] (see Note 10).
3. Click “Submit Query” at the bottom of the page.
4. On the output page, examine the “View formatted data set
after median centering and normalization,” as shown in Fig. 2.
This view is closest to the way that Expression Angler “sees”
expression pattern similarity with the Pearson correlation coefficient, which standardizes gene expression values by the average
value (not median) when comparing two expression vectors.
Another useful view is the “View formatted data set,” which
shows the untransformed expression levels.
5. By mousing over the heat map, it is possible to find out the
annotation of the genes, which samples they are expressed most
strongly in, and other information. Interestingly, YABBY3,
likely a patterning gene, shows up as being coexpressed with
ABI3, as are several other transcription factors.
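The coexpression measure used above can be illustrated with a short Python sketch. It is not Expression Angler’s implementation: the gene names and expression vectors are invented, and only the 0.75 cutoff comes from the text. The sketch computes the Pearson correlation coefficient between a bait vector and each candidate, keeps candidates with r above the cutoff, and shows that subtracting a constant such as the median from a vector leaves r unchanged.

```python
import math
from statistics import mean, median


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    mx, my = mean(x), mean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


# Hypothetical expression vectors across five samples (not real data).
bait = [10.0, 20.0, 40.0, 80.0, 160.0]
candidates = {
    "geneA": [12.0, 22.0, 38.0, 85.0, 150.0],  # follows the bait closely
    "geneB": [90.0, 70.0, 50.0, 30.0, 10.0],   # opposite trend
    "geneC": [50.0, 49.0, 52.0, 51.0, 50.0],   # essentially flat
}

R_CUTOFF = 0.75  # approximate threshold suggested in the text

scores = {gene: pearson_r(bait, values) for gene, values in candidates.items()}
coexpressed = [gene
               for gene, r in sorted(scores.items(), key=lambda kv: -kv[1])
               if r > R_CUTOFF]
print(coexpressed)

# Shifting a vector by a constant (for example, median centering) does not
# change r, because Pearson correlation subtracts the mean internally.
centered = [value - median(bait) for value in bait]
print(abs(pearson_r(centered, candidates["geneA"]) - scores["geneA"]) < 1e-12)
```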
Fig. 2 Heat-map output of Expression Angler after searching in just the Root Spatiotemporal data set of
Brady et al. [28] with ABI3. Callouts: Info Box (shows information for a given cell in the heat map, including
the r-value, 0.862 for YAB3); Signal Threshold (sets a maximum for the color scale); Functional Classification
Code (shows into which GO categories a given gene has been classified: grey = process, white = function,
yellow = location); crosshairs (a guide for pinpointing a particular cell in the heat map); Expression Level
scale (red = higher expression level); Functional Classification Legend (shows enriched GO terms for the list)
3.2.2 ATTED II
ATTED II [10] is a gene coexpression database for finding functional
relationships between genes. This tool uses the mutual rank (MR)
of the Pearson’s correlation coefficient [29] to investigate gene
coexpression in Arabidopsis in a condition-independent way or
across five sets of experimental conditions: tissue, abiotic stress,
biotic stress, hormones, and light conditions. ATTED II also offers
analysis of rice coexpression data to provide a comparative view
of both species using putative gene orthologs.
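As a rough illustration of the mutual rank idea, the sketch below assumes the commonly used definition in which the MR of genes A and B is the geometric mean of the rank of B in A’s correlation-ordered list and the rank of A in B’s list. The four genes and their correlation values are invented, and the code is not taken from ATTED II.

```python
import math

# Hypothetical Pearson correlations between four genes (not real data).
corr = {
    ("g1", "g2"): 0.90, ("g1", "g3"): 0.80, ("g1", "g4"): 0.20,
    ("g2", "g3"): 0.95, ("g2", "g4"): 0.30, ("g3", "g4"): 0.10,
}
genes = ["g1", "g2", "g3", "g4"]


def r(a, b):
    """Look up the symmetric correlation between genes a and b."""
    return corr[(a, b)] if (a, b) in corr else corr[(b, a)]


def rank_of(target, query):
    """Rank of `target` among all genes ordered by correlation to `query`
    (1 = most strongly correlated)."""
    ordered = sorted((g for g in genes if g != query),
                     key=lambda g: r(query, g), reverse=True)
    return ordered.index(target) + 1


def mutual_rank(a, b):
    """Geometric mean of the two directional ranks (assumed MR definition)."""
    return math.sqrt(rank_of(b, a) * rank_of(a, b))


for pair in [("g2", "g3"), ("g1", "g2"), ("g1", "g3"), ("g1", "g4")]:
    print(pair, round(mutual_rank(*pair), 3))
```

A lower MR indicates a tighter coexpression relationship; unlike the raw correlation, it accounts for how many other genes are more strongly correlated with each partner.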
1. Go to http://atted.jp/.
2. On the search menu, click on the arrow(s) on the right-hand
side of the pull-down menu and select the option that best fits
your search (“All words,” “Keyword,” “Gene alias,” “Gene
ID,” or “GO ID”). We will search by “Gene ID,” entering At3g24650
for ABI3. Click Search.
3. The output window shows a brief description of the gene of
interest, like the alias and the function. By selecting “locus,”
ATTED II sends us to a new window with much information
about the gene: functional annotation, a gene coexpression
network, gene expression levels, and predicted cis-elements.
4. For a more extensive analysis of coexpressed genes, go back to
the locus search window and click on “list” of coexpressed
genes. The program will give a list of the top 300 coexpressed
genes (see Note 11).
5. Check “coex in specific conditions” to study coexpression under
different conditions: tissue, abiotic stress, biotic stress, hormone,
and light. We can rank coexpression in each condition by clicking
on “sort.” This approach helps us to infer the gene’s function
in each category. For instance, the genes that are most
closely correlated with ABI3 differ extensively depending on the
biological context in which we are interested. This suggests
that ABI3 has multiple functions, both developmentally and
in response to the environment. For example, if we sort by “tissue,”
ABI3 is coexpressed with several seed-associated genes,
whereas different genes show up at the top of the ABI3-coexpressed
lists under hormone treatments or abiotic stress.
6. Tick “Osa homolog” to see the homologous genes in rice.
The output window will show you the 300 top coexpressed
genes in rice.
7. Click on the small “L”-shaped icon in the Link column for each
coexpressed gene to get the same information described in step 3.
One of the most powerful features of ATTED II is the network
visualization of coexpressed genes. This network describes in a
clear manner genes connected directly and indirectly to our query
gene by coexpression. We can explore coexpression network
neighborhoods by clicking on the gene names (see Fig. 3).
8. ATTED II shows that ABI3 is coexpressed with EPR1 (an
extensin-like gene), which is involved in seed germination but
expressed only in the endosperm [30]. AIL5 (AINTEGUMENTA
LIKE-5) appears to be coexpressed with ABI3 as well. AIL5
encodes a member of the AP2 family of transcriptional regulators
that are involved in cell proliferation activities in many
organs [31]. AIL5 mutants are tolerant to ABA. We can therefore
hypothesize that ABI3 and AIL5 act together to control
cell proliferation and/or ABA response.
Fig. 3 Output of an ATTED II query for the ABI3 gene, showing ranked list of coexpressed genes in ATTED II’s
condition-independent data set (top panel) and a visualization of the coexpression list in network form (inset).
Callouts: AGI IDs of the coexpressed genes; Network View of the coexpressed genes (click to generate view);
sorted coexpression data (use “sort” to get coexpressed genes in different biological contexts); link to
ATTED II gene information for rice homologs
3.3 Promoter Analysis
Gene expression is dependent on the cis-regulatory elements present
in the promoter regions of genes, which act as binding sites for
one or more transcription factors. Many tools have been developed to
better understand how these transcription factor binding sites
might regulate gene expression. In this section we will introduce
tools that will help us to analyze and visualize promoter regions of
Arabidopsis genes.
3.3.1 Cistome
Imagine a set of genes that are coexpressed in response to a certain
stimulus. It will be of interest to determine common upstream
regulatory motifs between these genes that could explain this particular behavior and identify putative upstream regulators. Cistome
is a tool that searches for enriched motifs in the promoter regions
of these genes.
1. Go to http://www.bar.utoronto.ca/cistome/cgi-bin/BAR_
Cistome.cgi.
2. We will need to specify the analysis that Cistome should perform
with our gene list. If we are interested in studying whether a
particular motif is overrepresented in the promoter regions of
our gene set, click on “Enter PSSMs” (position-specific scoring
matrices; these are a more flexible way to represent transcription
factor binding sites and describe the probability that
a given nucleotide is present at each position of the
motif), and enter the search sequence in the format required.
For instance, we may be interested in the G-box motif
(CACGTG), which is a binding site for the PIF transcription
factor family [32]. This will help us explore whether
the genetic association of ABI3 with the PIF transcription
factors outlined in the Genevestigator section could arise through
binding of this light-regulated family of transcription factors to
the promoters of the ABI3 developmentally coexpressed genes in Table 2.
We will add the motif
sequence in FASTA format (> GBOX and then on a new line,
CACGTG; see the Format example link for help), select the
Consensus sequence option, tick the Significance testing
option, and finally enter the AGI ID list in the List of AGI
identifiers to search box and click Map. We find that this motif
is in fact overrepresented in the promoters (of length
1,000 bp) of ABI3 developmentally coexpressed genes, with a
Z-score of 7.32, which is highly significant. In general, one
should use a Z-score threshold greater than 3, meaning that the
motif occurs at least 3 standard deviations more often than the
mean number of times it occurs in a random sampling of all of the
promoters in Arabidopsis. This provides a prediction that a PIF
transcription factor binds the promoters of genes that are coexpressed
with ABI3 to regulate their expression in the light.
3. Alternatively, by clicking on the Use Prediction tab at the top of
the Cistome page, we can screen against all possible motifs identified using one of two parts of a previously characterized motif
database, PLACE [33]. The first part uses All PLACE Elements,
which contains motifs identified in all plants. The second part
uses a subset of these which have just been identified in
Arabidopsis. We recommend using the entire PLACE database
(see Note 12) to get a larger breadth of possible elements.
Alternatively one can identify overrepresented uncharacterized
elements by clicking Cis Scan to activate cis-motif prediction
programs available on Cistome. To map known PLACE elements onto our promoter list of ABI3 developmentally coexpressed genes from Table 2, tick “Search for enriched PLACE
database elements within your gene set,” and search for enriched
motifs using “ALL PLACE elements.” You can also specify the
significance parameters. In our example analysis we will use the
default parameters, which include a Z-score cutoff of greater
than 3, a functional depth cutoff of 0.35, and a requirement that a motif
be found in at least half of the genes in the gene set.
4. We will need to specify the gene data set that Cistome will use as
a background. We choose the most recent version of TAIR
available on Cistome (TAIR9 at the time of writing this chapter).
Indicate the length of sequence that will
be used for the analysis (e.g., 1,000 bp) using the “transcriptional
start site (TSS)” as a start position. The majority of
binding sites have been identified in the first 500–1,000 bp
upstream of a gene’s transcriptional start site [34–36]. One
can also specify a custom background set by uploading a file
with sequences in it.
5. Click on “Map” and Cistome will display a diagram with the
overrepresented regulatory elements mapped on the promoters
of the genes included in the analysis. Overrepresentation is
determined by comparing the frequency of occurrence of each
motif against the frequency of occurrence of the same motif in
randomly selected sets of promoters from the background set.
Click on “Cluster” and Cistome displays a clustering of the
overrepresented sequences based on sequence similarity.
Click on “Logo” to get the frequency of the distinct
nucleotides that are found in the overrepresented binding
elements. Once you have a given sequence motif you can identify
other genes in the genome that may contain this element. You
can then query coexpression databases to see if these genes are
coexpressed with your gene of interest or, in our case, if they
are coexpressed with ABI3 under any other conditions.
6. Visualization of multiple overrepresented elements can further
reveal whether there are any co-localized elements within
the promoters of these coexpressed genes. This could
indicate potential combinatorial gene regulation. For instance,
the pink element (ERD1 motif) and the green element (RYE
motif) are located beside each other in many of the coexpressed
genes (see Fig. 4).
Fig. 4 Output of a Cistome query showing the overrepresented regulatory elements mapped onto the
promoters of ABI3 developmentally coexpressed genes. Pink represents the ERD1 motif, while green represents
the RYE motif. Callouts: AGI IDs (click for locus information); Promoter Maps (show the position of regulatory
elements on the promoter regions of query genes); motifs (click for sequence information, including a sequence
logo representation); color key for overrepresented regulatory elements (darker shading for a better match)
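The Z-score used in steps 2 and 5 can be sketched as follows. This is a simplified illustration under stated assumptions, not Cistome’s algorithm: the promoter sequences are synthetic, the function names are hypothetical, and the motif is matched as an exact consensus string on one strand only (which is sufficient for the G-box, because CACGTG is its own reverse complement).

```python
import random
from statistics import mean, stdev


def count_motif(promoters, motif):
    """Total number of (possibly overlapping) motif occurrences in a list
    of promoter sequences."""
    total = 0
    for seq in promoters:
        start = seq.find(motif)
        while start != -1:
            total += 1
            start = seq.find(motif, start + 1)
    return total


def enrichment_z(gene_set, background, motif, n_samples=1000, seed=0):
    """Z-score of the motif count in gene_set relative to random promoter
    sets of the same size drawn from the background."""
    rng = random.Random(seed)
    observed = count_motif(gene_set, motif)
    random_counts = [
        count_motif(rng.sample(background, len(gene_set)), motif)
        for _ in range(n_samples)
    ]
    return (observed - mean(random_counts)) / stdev(random_counts)


MOTIF = "CACGTG"  # G-box consensus from the text

# Synthetic 50-bp "promoters" (not real sequences): one background
# promoter in ten carries a single G-box.
plain = "AT" * 25
with_motif = "AT" * 11 + MOTIF + "AT" * 11
background = [with_motif if i % 10 == 0 else plain for i in range(100)]

# Hypothetical gene set in which every promoter carries the motif.
gene_set = [with_motif] * 10

z = enrichment_z(gene_set, background, MOTIF)
print("enriched" if z > 3 else "not enriched")
```

With a real background, the promoters would be the upstream sequences of all annotated genes, and a Z-score above 3 would be read as described in step 2.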
3.3.2 Athena
The Athena analysis tool [9] from the Wyrick Laboratory at
Washington State University integrates DNA sequence and Gene
Ontology (GO) data to facilitate the analysis of 30,077 predicted
Arabidopsis promoter sequences and 105 different transcription
factor binding sites (see Note 13). We will use Athena to identify
transcription factor binding sites present on the ABI3 promoter
and then will identify common TF binding sites of genes developmentally coexpressed with ABI3 (see Table 2).
Fig. 5 Graphical output from Athena showing potential transcription factor binding
sites in the ABI3 promoter. Callouts: Analysis box (allows access to the Motifs and
Frequency information tools; click Download Promoters to get the promoter
sequence(s) of the gene(s) of interest); diagrammatic representation of potential TF
binding sites present in the promoter of the gene(s) of interest (different TFBSs are
denoted with different colors); lists of non-enriched or enriched TF binding sites
present in the promoter of the gene(s) of interest (in this case, no TFBSs are
enriched in the promoter of the ABI3 gene)
Transcription Factor Binding Sites on the ABI3 Promoter
1. Go to http://www.bioinformatics2.wsu.edu/cgi-bin/Athena/
cgi/home.pl and click the Visualization tab in the menu bar
along the top.
2. Enter your gene of interest’s AGI ID, “At3g24650” in our case
(ABI3), into the Accessions box (see Note 14). Select
“Compact” for the visualization type (for more detail about the
structure of the promoter, select “Cartoon”), and choose the
Maximum bp upstream range of the promoter to be “3,000 bp”
in the Upstream Range box. Tick “Cut-off at adjacent genes”
to truncate the promoter when it overlaps with the next gene
upstream. Click Display.
3. The output window has three boxes (see Fig. 5):
– The “Analysis box” gives a text output of the analysis.
Click “Motifs” to get information on the sequences and
positions of all selected TF binding sites in the promoters.
Click “Frequency” to get the frequency of
promoters genome-wide containing the TF sites and
the calculated p-value. To get the promoter sequence of
the gene, click “download promoters.”
– The “Selected Promoters” box provides a graphical
representation of the predicted TF binding sites in the
promoter.
– The “TF box” lists the binding sites that are present and those
that are significantly overrepresented in the promoter, with the
p-value calculated using the hypergeometric probability distribution,
the name of each binding site, and the number (#S) of times
each specific binding site is present on the promoter.
The p-value can be additionally useful when one inputs a
group of genes that are coexpressed with the gene of
interest. This will let you identify putative common transcriptional
regulators by statistically significant overrepresentation
of binding sites in the promoters of the group
of genes.
4. Athena identifies 20 binding sites on the ABI3 promoter
(3,000 bp). The binding site with the lowest p-value is the
“Z-Box promoter motif” (p-value = 0.042, motif = ATACGTGT).
We recommend using a p-value threshold of less than 0.01,
in which case this motif is not significantly enriched in this
promoter given its overall presence in the promoters of the
genome. If it had passed this threshold, then we could
predict that a bZIP transcription factor might be a
transcriptional regulator of ABI3.
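For a list of genes, an enrichment p-value based on the hypergeometric distribution can be sketched as below. This is an illustration of the general test rather than Athena’s code: the function name is hypothetical, and apart from the 30,077 promoters quoted above, all counts are invented.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k): probability of seeing at least k promoters with the site
    when n promoters are drawn from N, of which K carry the site."""
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i)
                    for i in range(k, upper + 1))
    return favorable / comb(N, n)


N_PROMOTERS = 30077   # promoter count quoted for Athena in the text
K_WITH_SITE = 3000    # hypothetical: promoters genome-wide with the site
n_genes = 50          # hypothetical: size of the coexpressed gene list
k_hits = 15           # hypothetical: listed genes whose promoter has the site

p_value = hypergeom_tail(k_hits, n_genes, K_WITH_SITE, N_PROMOTERS)
print(f"p = {p_value:.3g}")
print("enriched" if p_value < 0.01 else "not enriched")
```

The 0.01 cutoff in the last line is the threshold recommended in step 4.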
Upstream Co-regulator Identification Using a Coexpressed Gene List
The Athena Analysis Suite tool is a powerful interface where you
can find genes with common TF binding sites in their promoters.
You can analyze the entire Arabidopsis genome or use your own
list of genes. For the purpose of this chapter, we will find genes
that contain the ABRE binding site motif (see Note 15) in the promoter, from the list of ABI3-coexpressed genes generated in
Subheading 3.2.1 (see Table 2).
1. Go to http://www.bioinformatics2.wsu.edu/cgi-bin/Athena/
cgi/home.pl and click Analysis Suite.
2. The “Accessions” box contains the options for gene list
selection. Tick “Use a subset,” and paste the list of genes from
Table 2. Select the ABRE binding site motif from the list of TF
motifs present in the “Transcription Factors” box, and click
“Add TFs.” You can include GO terms in the “Gene Ontology”
box. Multiple TF motifs and/or GO terms can be included in
the analysis.
3. We can constrain the TF search to specific positions in the
promoter sequence using the “Motif Positions” box. The Start
and End numbers indicate the beginning and end of the
positional search constraints, respectively. As done earlier in
this section using the Athena Visualization tool, choose the
Maximum bp upstream range of the promoter to be “3,000 bp”
in the “Range Selection” box, and select the “Cut-off at
adjacent genes” option. Click Submit.
4. Seven of the ABI3-coexpressed genes (At1g48130, At2g21490,
At4g16160, At5g10140, At5g24130, and At5g50360) contain the ABRE binding site motif. Click on the AGI ID to open
the TAIR gene information. At5g10140 encodes Flowering
Locus C (FLC), which functions as a repressor of floral transition. ABA has been previously demonstrated to delay flowering by affecting the transcript level of FLC [37].
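The search performed in the steps above, finding the genes in a list whose promoters contain a given motif, can be sketched offline as follows. This is an illustration under stated assumptions, not the Athena implementation: the sequences and gene names are toy values, and the motif string is a hypothetical ABRE-like hexamer (the motif actually used is defined in Note 15). The motif is assumed to be given in uppercase A/C/G/T.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def motif_hits(seq, motif):
    """Return sorted 0-based forward-strand start positions of the motif on either strand."""
    seq = seq.upper()
    hits = set()
    for pattern in {motif, reverse_complement(motif)}:
        # A lookahead lets overlapping occurrences be reported as well
        for match in re.finditer(f"(?={pattern})", seq):
            hits.add(match.start())
    return sorted(hits)

# Toy promoters with made-up names; real promoters would be up to 3,000 bp long
promoters = {
    "geneA": "TTGACACGTGGCAAT",
    "geneB": "AAAAAAAAAAAAAAA",
    "geneC": "GGGCCACGTGTTTTT",
}
ABRE_LIKE = "ACGTGG"  # hypothetical motif used only for this sketch

genes_with_motif = [gene for gene, seq in promoters.items() if motif_hits(seq, ABRE_LIKE)]
print(genes_with_motif)
```

Searching both strands is a design choice of this sketch; restrict the pattern set to the motif itself if only the forward strand is of interest.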
3.3.3 TAIR Motif Analysis
Athena and Cistome analyze promoters for overrepresented previously validated or characterized regulatory elements. Cistome also
provides access to five other prediction programs. The Motif
Analysis algorithm from TAIR provides an alternate source by
searching for overrepresented 6-mer oligos in upstream regions of
genes.
1. Go to http://www.arabidopsis.org/tools/bulk/motiffinder/.
2. Add your list of genes by typing the AGI ID or the sequences
in FASTA format. Here we will use the list of genes in Table 2.
3. Indicate the length of the regulatory sequence that will be
included in the analysis (e.g., 3,000 bp), and select the output
file (e.g., HTML). Click Submit.
4. Motif Analysis from TAIR identifies statistically overrepresented 6-mer oligos occurring in three or more sequences in
the gene set. The overrepresented 6-mers are sorted by p-value
determined by comparing against a binomial distribution, and
genes with that particular sequence are indicated.
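The logic of this analysis can be sketched in a few lines: count, for every 6-mer, the number of input sequences that contain it, keep those present in three or more sequences, and score them against a binomial distribution. This is a simplified illustration, not TAIR's code. The sequences are toy values, and the single background fraction is an assumption; a real analysis would use a per-6-mer frequency derived from all upstream regions in the genome.

```python
from collections import Counter
from math import comb

def kmers_per_sequence(sequences, k=6):
    """For each k-mer, count the number of sequences that contain it at least once."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts

def binomial_sf(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(x, n + 1))

# Toy upstream sequences (hypothetical)
upstream = ["AAACACGTGTTT", "GGCACGTGAACC", "TTTTCACGTGCA", "ACACACACACAC"]
BACKGROUND_FRACTION = 0.05  # assumed fraction of genomic upstream regions containing a given 6-mer

seq_counts = kmers_per_sequence(upstream)
results = sorted(
    (binomial_sf(count, len(upstream), BACKGROUND_FRACTION), kmer, count)
    for kmer, count in seq_counts.items()
    if count >= 3
)
for p_value, kmer, count in results:
    print(kmer, count, f"{p_value:.3g}")
```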
3.4 Functional Classification
Functional classification of gene lists is one of the basic methods in
bioinformatics for making sense of sometimes rather large gene
lists that arise from gene expression profiling experiments. Typically,
one might look at individual genes in such lists and “see if it fits
biologically,” but one might also like to have an overview of broad
functional categories that change in response to a given stimulus or
due to a specific mutation. One of the very useful large initiatives
of the past decade was the development of a Gene Ontology (GO)
for the “unification of biology” [38]. Basically, this system is a set
of categories, which are described using defined terms instead of in
a free-form manner, into which genes can be assigned. There are
three main super-categories: biological process (BP), molecular
function (MF), and cellular component (CC). TAIR has been the
main curator for GO annotations for Arabidopsis genes, with some
input from other groups. A gene may belong to several categories
and subcategories at once, which are arranged in a hierarchical
manner from very general to very specific terms (technically, the
relationships between categories and sub-categories are formalized
as a directed acyclic graph). It is possible to use statistical tests,
often a hypergeometric test, to assess whether or not the number
of genes observed associated with a given term (i.e., category)
from one’s list of interest is enriched relative to the number one
might expect to see by chance. Such a test can be used for any classification system which has categories into which things are classified. Another system of classification called MapMan Bins was
initiated by Björn Usadel and colleagues at the Max Planck Institute
for Molecular Plant Physiology in Germany [16]. A variation on this
approach is to examine genes whose expression is altered in response
to a perturbation in the context of the biological pathways to which
they belong.
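Because the categories form a directed acyclic graph, a gene annotated to a specific term is implicitly a member of every more general term above it. The sketch below illustrates this propagation. The small hierarchy and the gene name are toy values made up for illustration and are not taken from a GO release.

```python
def ancestors(term, parents):
    """Return all terms reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

# Toy hierarchy: a term may have more than one parent, so this is a DAG, not a tree
PARENTS = {
    "lipid transport": ["lipid localization", "transport"],
    "lipid localization": ["macromolecule localization"],
    "macromolecule localization": ["localization"],
    "transport": ["localization"],
    "localization": ["biological process"],
}

# Hypothetical direct annotation for a made-up gene
direct = {"geneX": {"lipid transport"}}

propagated = {
    gene: set(terms) | set().union(*(ancestors(t, PARENTS) for t in terms))
    for gene, terms in direct.items()
}
print(sorted(propagated["geneX"]))
```

Enrichment tests are then usually run on the propagated annotations, so that general terms collect the genes of all their descendants.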
3.4.1 AgriGO
AgriGO [12] out of Zhen Su’s laboratory at the China
Agricultural University is a user-friendly tool for analyzing whether
any particular GO terms are enriched in a given gene list from
Arabidopsis (or for many other agriculturally important species).
It provides a nice visualization in the same directed acyclic graph
structure on which the GO system was developed.
1. Go to http://bioinfo.cau.edu.cn/agriGO/ and select “Analysis
Tool” in the tab along the top.
2. In the first section for selecting the analysis tool, select “Singular
Enrichment Analysis (SEA).”
3. Select the species (the default is Arabidopsis thaliana).
4. Paste in the Query list as AGI IDs, gene aliases (e.g., ABI3),
GenBank IDs, etc. A large number of different identifiers are
supported.
5. Choose a reference—if the list comes from a microarray experiment—and then choose the appropriate microarray platform;
otherwise if the list comes from an experiment where it is possible to identify any of the AGI IDs present in the TAIR genome
annotation (such as the case with a proteomics experiment or an
mRNA-seq experiment), then choose the “Arabidopsis genome
locus (TAIR)” option—this aspect is a nice feature of AgriGO.
In this example, we will submit the top 50 genes coexpressed
with ABI3 in the AtGenExpress Tissue Set as discussed in
Subheading 3.2.1, step 1 (see Table 2). As the data used to
obtain the coexpressed genes come from the Affymetrix ATH1
platform, we use this platform as our reference.
6. Under “Advanced Options—optional” one can select one of
three methods for statistical enrichment (hypergeometric
distribution, Fisher’s exact, or chi-square) as well as one of seven
multiple hypothesis testing correction methods. We recommend the use of the Storey Q-value method.
Fig. 6 Graphical output from AgriGO for the top 50 ABI3-coexpressed genes in the AtGenExpress Tissue Set from Subheading 3.3.2, step 2. The GO term “Lipid localization” (red) is most significantly enriched among these genes (see Note 16). Figure annotations: the GO directed acyclic graph shows a partial GO structure with only the relevant enriched terms; enriched GO terms are coloured based on significance (red = more significant); click on a GO term to see all of the genes associated with that term (not just those in the input list)
7. In the output, a table of enriched GO categories for our list of
50 genes is displayed showing that four GO biological process
terms (lipid localization, response to abscisic acid stimulus,
macromolecule localization, postembryonic development) and
two GO molecular function terms (nutrient reservoir activity,
lipid binding) are significantly enriched. Examining these, they
seem to “make sense” in the context of the later stages of seed
development, when ABI3 and these genes are expressed, insofar as this is the time when lipid reserves are being accumulated
and the seed begins to desiccate, etc. There is also the possibility
to create “Graphical Results” or a “GO Flash Chart.” If we
click on the Generate Image button, the following output is
generated for enriched biological processes (see Fig. 6).
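Testing many GO terms at once requires a multiple hypothesis testing correction, as selected in step 6. As an illustration of the idea, the sketch below implements the Benjamini-Hochberg adjustment. Note that this is a simpler, swapped-in technique and not the Storey Q-value method we recommend; the raw p-values are invented for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

# Hypothetical raw p-values for five GO terms
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
for p, q in zip(raw, adj):
    print(f"{p:.3f} -> {q:.5f}")
```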
3.4.2 AmiGO
AmiGO [13] provides a generic interface for computing GO term
enrichments for all of the species annotated by the GO Consortium.
1. Go to http://amigo.geneontology.org/cgi-bin/amigo/
term_enrichment.
Fig. 7 Output from AmiGO shows the enriched GO terms in the data set, and the genes associated with each GO term. Figure annotations: click “view” to visualize the results as a “directed acyclic graph” diagram; mouse over an enriched GO term to get information about that particular term; the degree of confidence and the frequency of each term are shown; the genes associated with each GO term are listed, and clicking on a gene name gives more gene information
2. Paste your gene identifiers into the “Input your gene products”
box. Using Genevestigator, we have shown that ABI3 expression depends on the presence of LEC1. To better understand
this ABI3-LEC1 relationship, we will use in this instance genes
whose steady-state transcript level is increased in LEC1 overexpressor (OX) plants from [22]. This will allow us to determine
biological functions associated with genes that are also overexpressed and likely downstream of LEC1. Select TAIR as the
database filter, and then submit the query. It is possible to
exclude GO annotations that have been inferred electronically
(IEA annotations) when performing the enrichment analysis
using this tool. Click Submit. The output for this gene list is
shown in Fig. 7.
3. Mouse over the gene’s ID to get more information (protein
sequence, TAIR link, etc.). At the top right of the gene information page, we can explore the terms associated with that
particular gene by clicking on “Terms association.” 46 genes
from the genes upregulated in LEC1OX plants are grouped
into “lipid metabolic process,” with a p-value of 8.26e-13
and a frequency of 11.1 % (2.7 % is the frequency of this term
in the background). LEC1 is associated with this term, but it is
also associated with ten other terms, e.g., ABA-mediated signalling, blue light signalling pathway, and embryo
development.
3.4.3 Classification
SuperViewer
The BAR’s Classification SuperViewer [14] provides a different
way to view Gene Ontology and MapMan classifications for lists
of genes, using a barcode scheme. Classification SuperViewer
barcodes are also integrated into several of the BAR’s other output tools.
1. Go to http://bar.utoronto.ca/ntools/cgi-bin/ntools_classification_superviewer.cgi and input your list of genes.
2. Select the classification scheme you wish to use under the
second point, either GO (actually GO Slim in the case of this
tool) or MapMan.
3. Leave the other options as they are, and click Submit Query.
4. The output page is divided into three parts: an overview table
showing which categories are enriched (by a hypergeometric
test with a p-value cutoff of 0.05) in bold, a chart area summarizing the category information in a different way, and a
detailed table section, which is linked from the overview area
(see Fig. 8). In these areas the grey background sections are
GO biological process terms, those with a white background
are GO molecular function terms, while those with a yellow
background are GO cellular component terms (this shading
scheme does not apply for MapMan terms).
5. In the Overview section, categories that are overrepresented
relative to the total number of instances of the term in the
overall GO or MapMan database (see Note 17) are bolded.
The relative enrichment is shown on the left, while the absolute
number of counts in a given category is on the right. The color
scheme for the categories is also used in the chart section and for
the bar code in the table section. In the case of a list of the top
50 genes coexpressed with ABI3 in the Developmental Map,
the Developmental Processes and Transport categories are
overrepresented as might be expected for the number of genes
in this list involved in the process of dormancy as seeds mature
and in transporting lipids to provide reserves for the seed when
it germinates. These categories are also seen with AgriGO.
6. The Chart section shows the overrepresented categories relative
to the frequency in the overall Arabidopsis genome or in terms
of absolute counts on the left and right side, respectively.
7. The Table sections show details for every single gene in the
input list. A bar code system using the same color scheme as in
the other two sections shows that in many cases a given gene
Fig. 8 Output of Classification SuperViewer for a list of the top 50 ABI3-coexpressed genes from a query of the BAR’s Expression Angler tool in the AtGenExpress Tissue Set compendium (Table 2). Figure annotations: Overview Tables show GO Slim categories that are enriched with a bolded p-value; Charts summarize GO Slim information in another way (grey = process, white = function, yellow = location); the Detailed Table is linked from the Overview table, and genes in a particular category are grouped
falls into several GO categories. Genes are grouped by category,
with the final bar on the right being the category used for
grouping. A gene will appear in this table as often as the number
of bars in its bar code. Mousing over a particular bar will provide
information on the actual GO term.
3.5 Pathway
Visualization
One of the biggest issues working with large-scale data sets is to
represent the information generated in a mode that is easily
visualized and from which one can quickly generate hypotheses.
In the context of metabolic pathways this is particularly important.
If a series of enzymes in a pathway are upregulated or downregulated, there is a greater chance that the metabolism of the
compounds associated with this pathway will be perturbed in a
corresponding manner. Pathway visualization tools were generated
to integrate and analyze data from large-scale experiments and place
that information in an easy-to-interpret metabolic context. In this
section we will introduce two different visualization tools used to
describe a wide set of Arabidopsis metabolic pathways.
3.5.1 AraCyc
AraCyc 8.0 [15] is the most comprehensive Arabidopsis-specific
metabolic database (see Note 18). We can use its tools to visualize
individual metabolic pathways, to view the complete metabolic
Fig. 9 Overview of ABA biosynthesis in AraCyc as resulting from a query with “abscisic acid biosynthesis”. Figure annotations: related metabolic pathways, which show inputs and outputs for this particular pathway, are in green text; Arabidopsis gene names are coloured purple; corresponding enzyme names appear in orange text; compounds are denoted by red text
map of Arabidopsis, or to predict metabolic pathways from a list of
genes. We will demonstrate how to use these three options to
characterize the role of ABI3 as it pertains to plant metabolism.
As ABI3 is highly expressed after treatment with abscisic acid
(ABA), we may be interested in learning more about genes that
function to synthesize ABA.
1. Go to http://www.plantcyc.org/.
2. In the search box write the name (or a keyword) of the pathway in which you are interested. In our case we will write
“Abscisic.” Then choose AraCyc as the metabolic database.
Click search.
3. The search results contain a window with a list of pathways,
proteins, compounds, and reactions that match with our word.
We just need to click on the one we want to explore, in our
case “abscisic acid biosynthesis” (see Fig. 9).
4. AraCyc shows a diagram with the enzymes (orange), compounds (red), genes (purple), and related pathways (green) of
the abscisic acid biosynthesis pathway. If we click on “more
detail” the molecular structure of the compounds appears on
the diagram. Below the diagram, we can find information about
the chromosomal localization of the genes in the pathway, a
brief description of the biological context of the pathway, and
the references AraCyc used to generate the pathway.
5. To get information about the enzymatic reaction in which the
gene is involved, click on the enzyme name (not the AGI ID).
This will take you to a new window with more information.
For instance, clicking on the 9-cis-epoxycarotenoid dioxygenase
will list all interactions in which this enzyme could be involved,
as well as the enzymatic reactions of all closely related
homologs.
6. To get detailed information on the gene through TAIR, double
click on the gene name. For instance, ABA4 (At1g67080)
encodes a neoxanthin synthase involved in the conversion of
violaxanthin into trans-neoxanthin, which is an early step in
ABA biosynthesis. We can expect that mutants in ABA4 have
reduced levels of ABA; hence, the expression of ABI3 will be
reduced too since it is ABA responsive [25]. Transcriptome
analysis of ABA4 mutants will be useful to study the plant’s
behavior in the absence of ABA to determine any correlation
with loss of ABI3 function.
In Subheading 3.1.2 we determined that ABI3 was upregulated in LEC1 overexpression plants (pER8-LEC1) [22]. We will use
the list of genes upregulated in LEC1-OX plants [22] to predict
metabolic pathways that LEC1 overexpression modulated with the
OMICs Viewer tool of the AraCyc database. These genes may act
with ABI3 to influence plant form or function.
1. Go to http://pmn.plantcyc.org/ARA/expression.html.
2. In the left part of the window, the OMICs Viewer summarizes
the type of data we can analyze. The file must be in tab-delimited
text format and the first column must be the locus name (e.g.,
At3g24650) and the second the expression value (see Note 19).
Click on “Browse” to upload the file. Choose “Relative” or
“Absolute” values to display. As we have only one column of
expression data, tick “a single data column.” As our data are
log2-transformed, we will use the “0-centered scale.” We are
using locus names in our data, so choose “Gene names and/or
identifiers” as the items that appear in the first column of our
data file. In our data file we only have one experiment, so type
“1” in the data columns box (if your data has multiple sets of
values, type the numbers of the columns you want to display).
We can also play with color scheme options and display type.
We will leave the default options. Click “Submit.”
3. The output window shown in Fig. 10 shows a diagram with all
metabolic pathways of Arabidopsis. The OMICs viewer uses
red to represent highly expressed genes. Multiple genes involved
in GA biosynthesis appear to be highly upregulated and overrepresented in our
expression data which suggests that GA biosynthesis may be
upregulated in the LEC1-OX plants.
4. To see in detail the pathways represented in our expression
data, go back to the main website of the OMICs viewer, and
check “Generate a table of individual pathways exceeding
threshold” and select a threshold value, e.g., 1.5-fold. Clicking
on a pathway shows the detail for it (see Fig. 11).
5. LEC1-OX appears to promote gibberellin biosynthesis through
the activation of genes involved in that metabolic pathway,
Fig. 10 Output of AraCyc’s OMICs viewer summarizing the increases and decreases in transcript abundance in LEC1 overexpression plants. Figure annotations: each node represents a metabolite, and the shape of the node represents the type of metabolite; lines represent reactions, and line color represents expression level, as per the legend; the GA biosynthetic pathway is labeled; mouse over to identify the metabolite or the reaction; click on metabolites to navigate to the metabolite page
Fig. 11 Detail generated by clicking on an overrepresented pathway in the OMICs Viewer. Figure annotations: red lines represent the reactions that exceed the threshold; the Arabidopsis enzymes that catalyze each reaction are shown, and the colored ones are represented in our dataset
such as GA20 oxidases 3 and 7. LEC1 acts as a positive regulator
upstream of ABI3 [39], as ABI3 is upregulated in LEC1-OX
plants. As we have shown using Genevestigator, the GA
biosynthetic inhibitor paclobutrazol inhibits ABI3 expression.
It appears that LEC1 and ABI3 could play a role in the crosstalk
between ABA and GA pathways, which supports the known
influence of these genes in these pathways.
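The two-column input file described in step 2 can be prepared with a short script. The following is a minimal sketch: the function names are our own, the expression values are invented, and the locus identifiers other than At3g24650 are placeholders. Whether the OMICs Viewer applies its threshold on exactly the scale used in `exceeds_fold` is an assumption.

```python
import csv
import io
import math

def write_omics_table(log2_ratios, handle):
    """Write locus name and expression value as two tab-delimited columns."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for locus, value in sorted(log2_ratios.items()):
        writer.writerow([locus, f"{value:.3f}"])

def exceeds_fold(log2_value, fold=1.5):
    """True if a log2 ratio corresponds to at least `fold` change in either direction."""
    return abs(log2_value) >= math.log2(fold)

# Hypothetical log2 ratios; At1g00000 and At5g00000 are placeholder identifiers
example = {"At3g24650": 2.1, "At1g00000": 0.2, "At5g00000": -1.3}

buffer = io.StringIO()
write_omics_table(example, buffer)
print(buffer.getvalue(), end="")

changed = sorted(locus for locus, value in example.items() if exceeds_fold(value))
print(changed)
```

To write to disk instead of an in-memory buffer, pass a file opened with `open("lec1ox.txt", "w", newline="")`, where the file name is again only an example.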
3.5.2 MapMan
One of the most widely used software packages for pathway visualization is
MapMan [16]. This software classifies genes and metabolites in
ontologies based on metabolic pathway, cellular function, biological response, and gene families. The main advantage is that the user
can download the software and work offline. Also the databases
associated with MapMan are well annotated and are easily downloadable in a format that is useful for bioinformaticians.
1. Go to http://mapman.gabipd.org/web/guest/mapmandownload and download the latest version of MapMan
(see Note 20). Open MapMan.
2. Once open, the software shows the “get started” window, which
introduces the use of the tool. Basically, MapMan works by
combining a data file (experimental results) with diagrams
(pathways or chromosomal views) and mapping information.
Every file is stored in a specific folder (left side of the program).
Before starting the analysis, it is worth exploring the files available in MapMan (pathways and mapping files). To download
more pathways or mapping files from the MapManStore
server, click “File,” “Add pathway” or “Add mapping,” click
“Download,” and choose a pathway/map from the list, e.g.,
download the latest TAIR gene annotation.
3. Upload your data; go to “File,” and “Add data.” Data must be
in .xls or a tab-delimited .txt file; the first column should contain
the AGI ID or Affy ID numbers and the second column
the expression values. The data will be stored in the
“Experiments” folder. We will use the genes upregulated in the
LEC1-OX plants present in [22].
4. For visualization of the data, choose a pathway from the left
and double click, e.g., “Regulation overview.” Choose a mapping according to the data. If the data contain AGI IDs use
Ath_AGI_TAIR, and if they contain Affy IDs, use Ath_AFFY_
TAIR. For LEC1OX genes, click on the data file uploaded in
step 3.
5. MapMan shows a representation of the pathways and genes
showing altered regulation (see Fig. 12). Each gene is symbolized by a square and expression is color encoded (by default
red denotes downregulated, blue denotes upregulated). As we
are looking at overexpressed genes in the LEC1-OX, we only
see blue colors. We can see that LEC1 overexpression promotes
Fig. 12 Output of a MapMan pathway analysis using genes upregulated in LEC1-OX plants
the expression of transcription factors and of genes involved in protein
modification and degradation. Looking at hormone pathways,
we can see that LEC1 promotes the expression of genes
involved in auxin, brassinosteroid, and gibberellin metabolism.
Below the pathway representation, there is information about
the statistical enrichment (using the Wilcoxon rank-sum test)
performed in MapMan. Mouse over gene squares to see information about gene function, name, and expression value. More
information about how to use MapMan with experimental data
is provided in an online tutorial on the MapMan site.
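The Wilcoxon rank-sum test mentioned in step 5 asks whether the expression values of the genes in one category rank higher or lower than those of all other genes. The sketch below shows the principle using the normal approximation, without tie or continuity corrections. It is not MapMan's implementation, whose details are not described here, and the expression values are invented.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_sum_test(group, rest):
    """Two-sided rank-sum test (normal approximation); returns (z, p)."""
    n1, n2 = len(group), len(rest)
    ranks = average_ranks(list(group) + list(rest))
    w = sum(ranks[:n1])                      # rank sum of the category of interest
    mean = n1 * (n1 + n2 + 1) / 2            # expected rank sum under the null
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided normal tail probability
    return z, p

# Hypothetical log2 ratios for genes in one category versus all other genes
category_values = [2.4, 1.9, 2.8, 3.1]
other_values = [0.1, -0.3, 0.5, 0.2, -0.1, 0.4]

z, p = rank_sum_test(category_values, other_values)
print(f"z = {z:.3f}, p = {p:.4f}")
```

With groups this small an exact test would be preferable; the normal approximation is used here only to keep the sketch short.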
3.6 Protein Information
3.6.1 SUBA III
The subcellular location database for Arabidopsis proteins [17] at
http://suba.plantenergy.uwa.edu.au/ is a comprehensive resource
encompassing experimental (“direct assay”) data from more than
1,000 publications, in which 4,110 entries comprising 2,647 distinct proteins are based on chimeric fusion studies, and 24,142
entries comprising 7,893 distinct proteins are based on subcellular
proteomic studies. In addition, subcellular localization predictions
generated by 25 algorithms are also provided. It is possible to specify what you would like to retrieve from the SUBA database on the
Fig. 13 SUBA III input page showing various options
input page. Alternately, one can query in a general manner, either
for a single gene or for a list of genes as follows:
1. Go to http://suba.plantenergy.uwa.edu.au/. Click on the
“Search” tab.
2. In the input box at the bottom of the input page, enter your
AGI ID of interest. In this case we will enter ABI3’s AGI ID,
At3g24650. Click the “Add” button to “Arabidopsis Gene
Identifier” and ensure that the pull-down list is selected as “is in
list” to generate a SUBA query. The Gene Identifier will now
appear in the Query box at the bottom of the page (see Fig. 13).
Alternately, use the Quick Search function on the SUBA Home
page.
3. Click Query.
4. A Results page containing a list of the genes in your input will
be generated. Click on the desired AGI ID to see the data for
this gene product, in this case the At3g24650.1 link—the
resulting page is called SUBAIII flatfile for At3g24650.1.
5. On the flatfile page for the desired AGI ID, here At3g24650.1,
we see that there is no MS/MS or GFP data for ABI3’s subcellular localization but that SwissProt reports that it is in the
nucleus. Similarly, 10 of the 25 prediction programs, including
SubLoc [40] and WoLF PSORT [41], predict it to be
located in the nucleus. We can also see the predicted hydropathy
plots for the protein, along with other data. Given that ABI3 is
a transcription factor, we expect it to be located in the nucleus.
However, for proteins with unknown functions, it might be
useful to have a prediction or exact data regarding where they
might be located in the cell in order to predict function.
Fig. 14 Cell eFP output page for At3g24650, ABI3. Figure annotations: the pictograph shows subcellular compartments, and locations that are documented or predicted are colored depending on the confidence of localization in a given compartment (red = highest confidence); Data Source Options allows predicted locations to be masked; the link to SUBA allows easy access to the data used by the Cell eFP Browser
3.6.2 Cell eFP Browser
Data from SUBA III can be rendered onto a
pictograph of the parts of the cell using the Bio-Analytic Resource’s
Cell eFP Browser [7]. The Cell eFP Browser taps directly into the
SUBA III database and uses a simple heuristic algorithm that
weighs “direct assay” subcellular localization data higher than prediction programs to provide a visual representation of where the
protein is localized within the cell.
1. Go to http://bar.utoronto.ca/cell_efp/cgi-bin/cell_efp.cgi.
2. Enter the AGI ID for a gene of interest, for example, for ABI3
(At3g24650).
3. Click Lookup.
4. On the output page a pictograph will be displayed showing the
localization of the protein (see Fig. 14). A stronger red color
denotes that several direct assays have documented the protein
being at a particular location. Predictions receive a weighting
only one-fifth of that for direct assays.
5. It is possible to adjust the data sources used for display by using
the boxes on the right side of the Cell eFP output.
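The weighting heuristic described in step 4 can be sketched as follows. Only the one-fifth ratio between predictions and direct assays comes from the text above. The function name, the normalization to the strongest compartment, and the example evidence are assumptions made for illustration and are not the Cell eFP Browser's code.

```python
from collections import defaultdict

DIRECT_WEIGHT = 1.0
PREDICTION_WEIGHT = DIRECT_WEIGHT / 5  # predictions count one-fifth of a direct assay

def localization_scores(direct_assays, predictions):
    """Score each compartment and scale so that the best-supported one equals 1.0."""
    scores = defaultdict(float)
    for compartment in direct_assays:
        scores[compartment] += DIRECT_WEIGHT
    for compartment in predictions:
        scores[compartment] += PREDICTION_WEIGHT
    top = max(scores.values(), default=0.0)
    if not top:
        return {}
    return {compartment: score / top for compartment, score in scores.items()}

# Hypothetical evidence for a made-up protein
direct = ["nucleus", "nucleus"]
predicted = ["nucleus", "cytosol", "cytosol", "plastid"]

scores = localization_scores(direct, predicted)
for compartment, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{compartment}: {score:.2f}")
```

The scaled score could then be mapped onto a color gradient, with 1.0 drawn in the strongest red.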
3.7 Protein–Protein Interaction Networks
There are several databases to explore for Arabidopsis protein–protein interactions, notably the BAR’s Arabidopsis Interactions
Viewer (AIV) and TAIR’s NBrowse, described below. However, it
is advisable to examine other databases, such as IntAct (not specific
for Arabidopsis) at http://www.ebi.ac.uk/intact/ [42], BioGRID
(thebiogrid.org) [43], or AtPID (http://www.megabionet.org/
atpid/webfile/) [44], as literature curation efforts are by no means
complete for any of these databases.
3.7.1 Arabidopsis Interactions Viewer
The BAR’s Arabidopsis Interactions Viewer at http://bar.utoronto.ca/interactions/ [18] currently permits the exploration of
70,944 predicted and 28,505 experimentally determined protein–
protein interactions curated by BIND, the BAR, IntAct, TAIR,
etc. One may submit a list of gene (product) identifiers and the
AIV will return the interactors of the proteins. It is possible to
return only experimentally documented interactions or all interactions including those predicted through the use of the interolog
method (interacting ortholog) [18]. Attractive features of the AIV
include the ability to upload Cytoscape files (.cys files) as well as
the ability to color nodes by their expression level in different
tissues to help define subnetworks in different tissue types.
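The idea behind the interolog method is that if two proteins interact in a reference species, their respective Arabidopsis orthologs are predicted to interact as well. The sketch below illustrates this mapping. It is a simplified illustration and not the procedure of ref. 18 (which also assigns confidence values); all identifiers are hypothetical.

```python
def predict_interologs(known_interactions, orthologs):
    """Map interactions from a reference species onto Arabidopsis via ortholog lists."""
    predicted = set()
    for protein_a, protein_b in known_interactions:
        for ath_a in orthologs.get(protein_a, ()):
            for ath_b in orthologs.get(protein_b, ()):
                if ath_a != ath_b:  # self-interactions are skipped in this sketch
                    predicted.add(tuple(sorted((ath_a, ath_b))))
    return predicted

# Hypothetical reference-species interactions and ortholog assignments
reference_ppi = [("yeastP1", "yeastP2"), ("yeastP2", "yeastP3")]
orthologs = {
    "yeastP1": ["AtGeneA"],
    "yeastP2": ["AtGeneB", "AtGeneC"],
    # yeastP3 has no Arabidopsis ortholog, so its interaction cannot be transferred
}

predicted = predict_interologs(reference_ppi, orthologs)
print(sorted(predicted))
```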
1. Go to http://bar.utoronto.ca/interactions/.
2. Enter an AGI identifier, or a list of identifiers, and select any of
the options you wish. The default setting will return all experimentally determined and predicted interactions for your gene
products of interest. For this example we will not check any of
the additional options, and we’ll again use ABI3, At3g24650 to
search for proteins with which it interacts.
3. Click Submit.
4. On the output page, a network graph of ABI3 interactors
appears, plus a legend, some further options, and a table of
these interactors at the bottom of the page (see Fig. 15).
5. In the network graph, the smaller nodes represent the proteins
that interact with ABI3, and the edges denote the interactions
between the proteins. Node color indicates protein subcellular
localization. Edges colored in light blue indicate interactions for
which experimental evidence was obtained. We see that ABI3
interacts with ATSYP23 (At4g17730) and ABI5 (At2g36270),
as determined experimentally in both cases by yeast two-hybrid
assays [45, 46]—clicking on the links in the BIND/PubMed
column takes one to the published reference for a given interaction (see Note 21). In the
case of the interaction with ATSYP23, this was determined by
an experimental screen, so it may represent a worthwhile candidate for further investigation as it was not followed up on in
that publication in any great detail. Two other interactions,
with ATBZIP10 (At4g02640) and ATBZIP25 (At3g54620),
are predicted by the interolog method [18], and thus, the edges
are colored grey. These represent other potential candidates for
follow-up investigation, whereby it should be noted that the
level of support for all these predictions is low, with just a CV
(confidence value) of 1. See ref. 18 for further information on
the calculation of CV and coexpression scores—basically all of
the AtGenExpress data sets were used for the coexpression calculations in the AIV, about 1,000 data sets in total, similar to
ATTED II’s condition-independent calculation.
Fig. 15 Output page of an Arabidopsis Interactions Viewer query with At3g24650, ABI3. Figure annotations: edges are colored by coexpression score or experimental support, and vary in width depending on interolog support; mouse over nodes to see protein annotation and other information; the Gene Expression Option allows nodes to be colored with expression data from gene expression compendia instead of by subcellular localization; the table shows protein-protein interaction information including coexpression scores and subcellular localization, with links to publication details
6. The default output is for the nodes to be colored according to
their subcellular localization as documented in the SUBA III
database (see above). A useful feature is to color nodes according to their expression levels in a given tissue. Clicking the
Show Gene Expression Options box on the left-hand side of the
output screen under the Download to Excel button calls up two
drop downs, one for Data Source and one for Tissue/Condition.
The Data Source option allows you to explore different compendia (the same ones as visible in the various eFP Browser
views described earlier), while the Tissue/Condition allows
you to choose which tissue or condition within a given compendium you are interested in using to retrieve expression level
data for painting onto the nodes. In this case, we will examine
the expression levels in Seeds Stage 10 w/o Siliques in the
Developmental_Map data source by selecting these and clicking
Show expression view. These data are mostly from Schmid et al.
[23]. In this case, we see that ABI3 and ABI5 (but not the
other interactors) are both strongly expressed in the seeds at
later stages of development, consistent with their known biological roles. It is possible to explore the expression levels for
the corresponding nodes (genes) by selecting different data sets
and tissues/conditions to permit you to identify other tissues
in which other nodes are more strongly expressed (e.g., Tissue_
specific/Guard Cells no ABA). However, in all data sources and
tissue/conditions queried, there are no conditions where ABI3
Fig. 16 Output page of an NBrowse query with At3g24650. Figure annotations: the Main Output Panel shows interactions for the requested gene, where nodes are grey and edges are coloured by interaction detection method; the Edge Filter Panel permits filtering of edges (interactions) by the method used to detect the interaction, e.g., confocal imaging; the Information Panel shows details on the selected node or edge, including links to PubMed
and these other proteins are highly coexpressed. This indicates
that these interactions likely do not occur in planta, at least
insofar as can be determined from existing data sets.
3.7.2 NBrowse
TAIR’s NBrowse permits the exploration of 8,628
experimentally determined interactions curated by TAIR,
BioGRID, IntAct, and others. It offers the ability to specify the
type of experimental method for determining a given protein–
protein interaction.
1. Go to http://www.arabidopsis.org/tools/nbrowse.jsp and
enter your protein of interest’s AGI ID (e.g., At3g24650) or
symbol (ABI3), check the Launch with query checkbox, and
click Launch.
2. A Java applet will be started on your computer. The output is
shown in Fig. 16.
3. It is possible to filter the interactions (edges) by the type of
method used to determine a given protein–protein interaction
using the Edge Filter Panel.
4. Clicking on a specific node (protein) or edge (interaction) will
cause information on that protein or interaction to be shown
in the Information Panel, including links to the PubMed
reference for the given interaction.
5. It is also possible to upload your own interaction data in the format specified in the NBrowse help file and explore
them in the context of other documented interactions in the
NBrowse database.
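The edge filtering described in step 3 can also be sketched outside the applet. The following Python snippet is a minimal illustration, assuming the interactions are available as simple (protein A, protein B, detection method) records; the partner names PARTNER_1 and PARTNER_2 and the method labels are hypothetical placeholders, not NBrowse output or its upload format.

```python
from collections import namedtuple

# One interaction (edge) between two proteins (nodes), with the method used to detect it.
Edge = namedtuple("Edge", ["source", "target", "method"])


def filter_edges(edges, allowed_methods):
    """Keep only the edges whose detection method is in allowed_methods (case-insensitive)."""
    allowed = {m.lower() for m in allowed_methods}
    return [e for e in edges if e.method.lower() in allowed]


# Hypothetical records for illustration only; these are not curated interactions.
edges = [
    Edge("At3g24650", "PARTNER_1", "yeast two-hybrid"),
    Edge("At3g24650", "PARTNER_2", "confocal imaging"),
    Edge("PARTNER_1", "PARTNER_2", "co-immunoprecipitation"),
]

kept = filter_edges(edges, ["Confocal imaging"])
for edge in kept:
    print(edge.source, edge.target, edge.method)
```

The same pattern could be applied to any tabular interaction list once its columns have been mapped to the three fields assumed here.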
3.8 Integrated Tools
Integrated tools associate data from multiple heterogeneous
sources of genomic data to obtain more accurate predictions.
Most of the bioinformatic tools described in this section integrate protein and genetic interactions, pathways, coexpression,
co-localization, and protein domain similarity and allow the user to
generate hypotheses in a rapid and facile manner.
3.8.1 VirtualPlant
VirtualPlant [19] integrates genomic data from different sources
(see Note 22) and provides a set of tools to visualize and analyze
these data. One extremely useful attribute of VirtualPlant is that
data and analyses can be stored on the website.
1. Go to http://virtualplant.bio.nyu.edu/cgi-bin/vpweb/. If you
wish to store your data, click on “Login” to register. The dark-blue navigation bar at the top of the page contains the different
VirtualPlant tools.
2. Click on “Query.” To perform a query, select an option on
the type list (i.e., genes) and add a keyword (e.g., ABI3).
The results are displayed in a table; click on the gene that best
matches your query (i.e., ABI3, At3g24650). VirtualPlant
shows all the information available on the server about our
query, including annotation, gene models, and external links.
For additional data click on the “Gene Family” folder to see
more members of the ABI3/VP1 transcription factor family
(the ABI3/VP1 family has 11 members).
3. To analyze a list of genes, data must be uploaded. The user can
upload a list of genes or microarray experiments. One useful
feature of VirtualPlant is that for microarray analysis, .CEL
files can be uploaded and normalized (GCRMA or MAS5
methods) using VirtualPlant.
4. In the dark-blue toolbar click on “Upload Data” followed by
“Click here to upload one or more list of genes,” and then either paste
your list of genes or upload a file following the format described
at the top of the page. Here, paste the
AGI IDs from the list of genes upregulated in LEC1-OX
plants [22]. Click “Submit.” Our list of genes is now uploaded
in the “My Genes” folder. Click “Analyze” in the navigation bar.
The analysis window shows the gene sets in our folder; select
our data set (see Fig. 17). On the “Analysis” menu, select the
analysis you want to perform. One of the most elegant
analysis tools available on VirtualPlant is the “Network
Analysis” tool. Here, for a given gene list, one can select from
a variety of interactions including validated TF-target, microRNA-target mRNA, and metabolic and pathway interactions from
KEGG and AraCyc. An independent Cytoscape browser
(“VirtualPlant meets Cytoscape”) is launched (see Fig. 18).
One can explore the different interactions by coloring the
edges with different colors in Cytoscape via the VizMapper
tool. In this case we can determine that the majority of genes
upregulated in the LEC1-OX line are metabolic in nature.
Fig. 17 VirtualPlant workspace. Your cart: your data sets and the files generated during the analysis are stored here. Navigation bar: use this bar to upload your data sets and start the analysis. Analysis window: select a list of genes and an analysis tool. The list of analysis functions is also shown.
Fig. 18 A snapshot of the Cytoscape graph output from VirtualPlant. Metabolic interactions (blue edges) from
KEGG or AraCyc among the genes upregulated in a LEC1-OX line are visualized
5. VirtualPlant also allows the analysis of multiple gene lists at the
same time. We may be interested in finding common genes
between the two experiments. In our case we would like to
determine if there are any genes that are upregulated when
LEC1 is overexpressed and that are coexpressed with ABI3.
This would indicate that LEC1 is sufficient to regulate these
genes, which may also share functionality with ABI3. We will
additionally upload the list of the Top 50 ABI3 developmentally
coexpressed genes (see Table 2) to explore this functionality.
Click “Analyze.” Select both lists of genes (the LEC1-OX upregulated genes and the Top 50 ABI3 developmentally coexpressed
genes). On the “Analysis” menu, select “intersect.” VirtualPlant will generate a new file in the “My Genes”
folder with the genes common to both data sets. This
file can be used for further analysis.
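The “intersect” operation in step 5 amounts to a set intersection of two gene lists. A minimal Python sketch follows, assuming each list is plain text with one AGI ID per line; the IDs shown are made-up placeholders (there is no chromosome 0), not genes from the LEC1-OX or ABI3 coexpression lists, and the function names are hypothetical.

```python
def read_agi_ids(text):
    """Parse one AGI ID per line, normalising case and skipping blank lines."""
    return {line.strip().upper() for line in text.splitlines() if line.strip()}


def intersect(ids_a, ids_b):
    """Return the IDs present in both sets, sorted for reproducible output."""
    return sorted(ids_a & ids_b)


# Placeholder gene lists for illustration only.
upregulated_text = """
At0g00010
At0g00020
At0g00030
"""
coexpressed_text = """
AT0G00020
at0g00030
At0g00040
"""

common = intersect(read_agi_ids(upregulated_text), read_agi_ids(coexpressed_text))
print(common)
```

Normalising the case before comparison avoids missing matches when the two lists were exported from tools that write AGI IDs differently.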
3.8.2 GeneMania
The GeneMania [20] algorithm uses a Cytoscape plugin to integrate protein and genetic interaction data, coexpression, and colocalization information. We can use GeneMania to predict the
function of a single gene or to find new members of a pathway or
a protein complex. In this tutorial we will explore the relationship
between PIF1 and ABI3.
1. Go to http://www.genemania.org/.
2. GeneMania integrates data from seven different organisms.
Next to “Find Genes in”, select Arabidopsis thaliana and add
your gene or list of genes into the box next to “related to.” GeneMania
recognizes gene names and AGI IDs, but not Affy IDs. If
GeneMania does not recognize your query, it will tell you with
a yellow speech bubble. We will add ABI3 (At3g24650) and
PIF1 (At2g20180) in the second window (one gene per line)
to try to predict a mechanism for why ABI3 expression is
downregulated in pif1pif3pif4pif5.
3. On the left part of the window, a network graph visualized
using Cytoscape is displayed with colored edges to indicate
different interaction types between different genes. Brown
indicates predicted interactions, grey indicates coexpression,
dark blue indicates physical interactions, light blue indicates
co-localization, and green indicates genetic interactions. On
the right side of the window, there are four tabs. The “network”
tab gives the option to select the type of interactions we want
to see in the network diagram, e.g., we can check the physical
interactions box only. It looks like PIL5 (an alternative name for PIF1) could form a protein
complex with ABI3 and At5g61380. There are many examples
in which protein complexes have an autoregulatory function
on one or more of their members [47].
By clicking on the nodes that represent the genes, we get more
details regarding gene function. For instance, At5g61380 is a
two-component response regulator and possesses transcription
regulatory activity. The “gene” tab gives a list of interactors
with our query proteins, e.g., the DELLA protein interacts
with PIF1. It has been described that DELLAs repress PIF
activity and that they accumulate in the absence of GA [48,
49]. This could potentially be the mechanism by which negative crosstalk exists between ABA and GA. The “Functions”
tab shows the GO annotation of the genes in the network. We
can sort the list by GO annotation name, by the False Discovery
Rate, or by Coverage (number of genes in the network with a
given function divided by all the genes in the genome with that
function) (see Fig. 19).
4. Above the network diagram, there is a bar with more options
to save the data or to adjust the network graph visualization.
Fig. 19 GeneMania output. Callouts: “Choose your organism here”; “Write your gene names here”; interactive network visualization, in which the user can modify the network visualization and get gene information using the mouse.
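The Coverage value described in step 3 is a simple ratio, and the sorting options can be mimicked on any exported table. The sketch below is a minimal Python illustration, assuming a hypothetical table with one row per GO annotation; the term names, false discovery rates, and gene counts are invented for the example and are not GeneMania output.

```python
from dataclasses import dataclass


@dataclass
class FunctionRow:
    go_term: str      # GO annotation name
    fdr: float        # false discovery rate reported for the term
    in_network: int   # genes in the network annotated with the term
    in_genome: int    # genes in the genome annotated with the term

    @property
    def coverage(self):
        """Genes in the network with the function divided by genes in the genome with it."""
        return self.in_network / self.in_genome


# Invented rows for illustration only.
rows = [
    FunctionRow("term A", 0.01, 4, 40),
    FunctionRow("term B", 0.001, 2, 200),
    FunctionRow("term C", 0.05, 3, 6),
]

by_name = sorted(rows, key=lambda r: r.go_term)
by_fdr = sorted(rows, key=lambda r: r.fdr)                          # most significant first
by_coverage = sorted(rows, key=lambda r: r.coverage, reverse=True)  # highest coverage first

for row in by_coverage:
    print(row.go_term, row.fdr, round(row.coverage, 3))
```

Sorting by the two criteria can give quite different orders: a term with a very low false discovery rate may still cover only a small fraction of the genes that carry that annotation in the genome.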
3.8.3 ePlant
The easy-to-use ePlant website [21] integrates six essential tools
for plant biology research. With only a few mouse clicks the user
can find homologous genes and polymorphisms, visualize gene
expression in the whole plant and/or in different tissues, determine
the subcellular localization of a protein, find its interactors, and
predict protein structure. However, the user can only investigate
one gene at a time.
Fig. 20 ePlant output for ABI3. Callouts: insert the AGI ID into the box to start working with ePlant; visualization tools (manipulate the controls to zoom in and out, rotate, and change position); 3D representation of Arabidopsis, in which plant parts where the gene is expressed are in red; button to download numeric data.
1. Go to bar.utoronto.ca/eplant/.
2. Type the AGI ID of your gene of interest in the AGI ID
box at the top of the page, e.g., At3g24650 for ABI3.
3. Click on “Homologs and polymorphisms.” ePlant displays
homologous genes for our query gene. Homologs are computed
by using OrthoMCL. The amino acid sequences of the homologous proteins are aligned and represented in an interactive view
that provides information on conserved residues, amino acid
physicochemical properties, and single nucleotide polymorphisms. In the case of ABI3 there are no homologous genes,
and there is one synonymous polymorphism, at least in the
Nordborg et al. data set [50] that ePlant currently uses.
4. Click on “Plant expression,” “Tissue expression,” or
“Subcellular location” to explore expression levels in the whole
plant, in a specific tissue or developmental stage, or to determine where the protein is localized within the cell. For each analysis, ePlant uses a three-dimensional drawing that represents
the Arabidopsis plant, different plant tissues, or a plant cell
(Fig. 20). Expression levels are represented from yellow (low)
to red (high) in each drawing (see Note 23). On the left part
of each page, there are tools to manipulate the visualization.
The user can zoom in, zoom out, rotate the figure, and change
its position on the three-dimensional axes. Click on the “sample list” on the right side of the page to locate the different
parts represented in each drawing. In the “Plant expression”
and “Tissue expression” views, the buttons under the “sample list” switch
between “absolute” and “relative” expression levels. Click on
“Retrieve signal data” to view and/or download the numerical
gene expression information. ABI3 is highly expressed in seed
siliques, and it is not expressed in the root, leaves, stem, or
flowers (see Note 24). At the developmental and tissue-specific
level, ABI3 is expressed in dry and imbibed seeds. In the subcellular location tool, ABI3 is predicted at low confidence to
reside within the nucleus.
5. Click on “interactors” to view interactors with our gene. ePlant
uses the BAR’s AIV database to generate and graph a network
with edges and nodes. On the right side of the page, there is a
menu to work on the network properties, i.e., we can filter the
interactors (called neighbors) according to the confidence
value of the edges (CV). With a CV ≥ 2, there are two ABI3
neighbors, At4g02640 (At_bZIP10) and At2g36270 (ABI5).
We can also size the neighbors by coexpression values, as well
as represent their subcellular localization with different colors.
Right click on the network for visualization options.
6. Click on “protein model” to view a 3D structure of your protein. The next page provides a list of predicted models for our
protein. Choose the one with the lowest e-value. ePlant shows
a 3D model from the Protein Data Bank or one predicted by Phyre
(Protein Homology/analogY Recognition Engine; see Note 25).
The options on the right of the page allow the user to
highlight in red the polar and charged residues or to draw the
protein surface. Below the options menu, ePlant represents the
alignment between the sequence used for the 3D model and
the query protein, i.e., the ABI3 3D model represents amino
acids 566–678 of the protein. Right click on the model for
visualization options.
4 Notes
1. Different microarray platforms are able to detect varying numbers of transcripts. The ATH1 array from Affymetrix has probe
sets for 22,814 transcripts, some of which may come from
several genes. Other microarray platforms or next generation
sequencing technologies are more comprehensive, e.g.,
Arabidopsis Whole-Genome Tiling Array 1.0 or RNA-seq.
2. The Arabidopsis Genome Initiative identifier, AGI ID, is easily
found at TAIR; see Chapter 4.
3. It is useful to set the signal threshold to some value when comparing different genes or viewing a number of different data
sources. That way, the expression level that “red” denotes is
constant. The expression level distribution graph is also a handy
feature for determining if one’s gene of interest is a strong
expressor. The small graph shows the distribution of the average
expression level of all genes in the tissues depicted on the
output, while the red line shows where the maximum expression level of the gene of interest falls along that distribution.
4. The Bio-Analytic Resource does provide a bulk query tool
called “Expression Browser” which provides a Genevestigator-like ability to query many genes at once; see http://bar.utoronto.ca/affydb/cgi-bin/affy_db_exprss_browser_in.cgi.
5. Genevestigator has no control over experimental design, and
only a post-analysis is possible to check the quality of the array.
For more information about quality control criteria visit
https://www.genevestigator.com/userdocs/manual/qc.html.
6. On the open access version one can only analyze a maximum
of 50 genes simultaneously. To analyze more than 50 genes we
can create a Genevestigator account.
7. For experimental normalization, Genevestigator uses
Bioconductor’s RMA implementation.
8. A p-value under 0.06 indicates that the signal is reliably
detected.
9. It is often useful to examine condition-dependent data sets, as
genes may respond one way in a set of tissues and in an opposite
way in others. If one lumps these sets together, then these
correlations cannot be detected. This issue is described in greater
detail in the Usadel et al. review [6].
10. Given the number of samples in most of these data sets, even a
Pearson correlation coefficient of 0.3 can be considered “significant.” But with this r-value, only (0.3)² = 0.09, i.e., 9 %, of the variance
is shared between two genes. An r-value of 0.7 means that
coexpression explains 49 % of the variance in common between
two genes. This is the reason why 0.7–0.75 is often used as a
cutoff for coexpression analysis.
11. ATTED-II uses the MR (Mutual Rank) value to rank the
coexpressed genes; lower MR values mean stronger coexpression.
This method was determined by the authors to have higher
performance in the prediction of gene function than the
Pearson correlation coefficient (PCC).
12. PLACE (Plant Cis-acting Regulatory DNA Elements):
http://www.dna.affrc.go.jp/PLACE/. This database has not been
updated since 2006.
13. Athena has not been updated since 2005. The TAIR gene
annotations and cis-elements are not the most recent, but it is
still a useful site.
14. The user can analyze up to 100 genes at once using the compact
visualization type.
15. The ABRE binding site motif has the consensus motif (C/T)
ACGTGGC, and it is known that the ABF (ABA-responsive
element binding factor) family of transcription factors binds to
that motif [51].
16. GOrilla is another useful tool for such analyses and permits the
ability to upload a ranked list of genes for enrichment analysis.
It offers similar visualization of enriched categories. See
http://cbl-gorilla.cs.technion.ac.il/ [52].
17. Note that it is not possible to select a background data set for
Classification SuperViewer. This is not so much of an issue for
gene lists that are derived from relatively comprehensive
platforms but can be an issue for platforms that are less
comprehensive.
18. AraCyc is a part of the BioCyc metabolic databases. All the
metabolic databases present on BioCyc share the same software,
so the tutorial described in this section can be applied to the
other databases.
19. We can include more expression columns, each of which could
represent a different experiment or time point.
20. The MapMan version used in this tutorial is 3.1.1.
21. For BIND links it will be necessary to obtain a user account
with the BIND/BOND website to view the literature record.
22. VirtualPlant integrates information from Arabidopsis and
rice sources.
23. In the case of “subcellular location,” information comes from
the SUBA database. The red color represents the protein
localization.
24. Note that this expression data represents the whole organ for
roots and not the cell type-specific expression described in the
eFP browser (Subheading 3.1.1).
25. Protein Data Bank: http://www.rcsb.org/pdb/home/home.do.
Phyre website: http://www.sbg.bio.ic.ac.uk/~phyre/.
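The arithmetic behind Notes 10 and 11 can be illustrated with a short Python sketch. It assumes that the shared variance is the square of the Pearson correlation coefficient, as stated in Note 10, and that the Mutual Rank is the geometric mean of the two directional correlation ranks, which is our reading of the definition given by Obayashi and Kinoshita [29]. The function names and the example expression values and ranks are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def shared_variance(r):
    """Fraction of variance shared by two genes with correlation r (r squared)."""
    return r * r


def mutual_rank(rank_a_to_b, rank_b_to_a):
    """Geometric mean of the rank of B in A's coexpression list and of A in B's list.

    Assumed definition; lower values indicate stronger coexpression.
    """
    return math.sqrt(rank_a_to_b * rank_b_to_a)


# Cutoffs discussed in Note 10.
for r in (0.3, 0.7):
    print(r, round(100 * shared_variance(r)), "% shared variance")

# Hypothetical expression values for two genes across five samples.
gene_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_2 = [2.1, 3.9, 6.2, 8.0, 9.8]
print(round(pearson_r(gene_1, gene_2), 3))

# Hypothetical ranks: gene B is 1st in gene A's list, gene A is 4th in gene B's list.
print(mutual_rank(1, 4))
```

Because the Mutual Rank uses both directional ranks, a gene pair scores well only when each gene ranks highly in the other's coexpression list.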
References
1. Chory J et al (2000) National Science
Foundation-sponsored workshop report:
“The 2010 Project” functional genomics and
the virtual plant. A blueprint for understanding
how plants are built and how to improve them.
Plant Physiol 123:423–426
2. Alonso JM et al (2003) Genome-wide insertional mutagenesis of Arabidopsis thaliana.
Science 301:653–657
3. Rhee S et al (2003) The Arabidopsis Information Resource (TAIR): a model organism database providing a centralized, curated gateway to Arabidopsis biology, research materials and community. Nucleic Acids Res 31:224
4. Finkelstein RR, Somerville CR (1990) Three classes of abscisic acid (ABA)-insensitive mutations of Arabidopsis define genes that control overlapping subsets of ABA responses. Plant Physiol 94:1172
5. Brady S, Provart N (2009) Web-queryable large-scale data sets for hypothesis generation in plant biology. Plant Cell 21:1034
6. Usadel B et al (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32:1633–1651
7. Winter D et al (2007) An ‘Electronic Fluorescent Pictograph’ browser for exploring and analyzing large-scale biological data sets. PLoS One 2:e718
8. Hruz T et al (2008) Genevestigator V3: a reference expression database for the meta-analysis of transcriptomes. Adv Bioinformatics 420747
9. O’Connor TR, Dyreson C, Wyrick JJ (2005) Athena: a resource for rapid visualization and systematic analysis of Arabidopsis promoter sequences. Bioinformatics 21:4411–4413
10. Obayashi T et al (2011) ATTED-II updates: condition-specific gene coexpression to extend coexpression analyses and applications to a broad range of flowering plants. Plant Cell Physiol 52:213–219
11. Toufighi K et al (2005) The botany array resource: e-Northerns, expression angling, and promoter analyses. Plant J 43:153–163
12. Du Z et al (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38:W64–W70
13. Carbon S et al (2009) AmiGO: online access to ontology and annotation data. Bioinformatics 25:288–289
14. Provart N, Zhu T (2003) A browser-based functional classification SuperViewer for Arabidopsis genomics. Curr Comput Mol Biol 2003:271–272
15. Mueller LA, Zhang P, Rhee SY (2003) AraCyc: a biochemical pathway database for Arabidopsis. Plant Physiol 132:453–460
16. Thimm O et al (2004) Mapman: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. Plant J 37:914–939
17. Heazlewood JL et al (2007) SUBA: the Arabidopsis subcellular database. Nucleic Acids Res 35:D213–D218
18. Geisler-Lee J et al (2007) A predicted interactome for Arabidopsis. Plant Physiol 145(2):317–329
19. Katari MS et al (2010) VirtualPlant: a software
platform to support systems biology research.
Plant Physiol 152:500–515
20. Mostafavi S et al (2008) GeneMANIA: a real-time multiple association network integration
algorithm for predicting gene function.
Genome Biol 9(Suppl 1):S4
21. Fucile G et al (2011) ePlant and the 3D data
display initiative: integrative systems biology
on the World Wide Web. PLoS One 6:e15237
22. Mu J et al (2008) LEAFY COTYLEDON1 is a
key regulator of fatty acid biosynthesis in
Arabidopsis. Plant Physiol 148:1042–1054
23. Schmid M et al (2005) A gene expression map
of Arabidopsis thaliana development. Nat
Genet 37:501–506
24. Nakabayashi K et al (2005) Genome wide profiling of stored mRNA in Arabidopsis thaliana seed
germination: epigenetic and genetic regulation of
transcription in seed. Plant J 41:697–709
25. Brady SM et al (2003) The ABSCISIC ACID
INSENSITIVE 3 (ABI3) gene is modulated by
farnesylation and is involved in auxin signaling
and lateral root development in Arabidopsis.
Plant J 34:67–75
26. Laubinger S et al (2008) At-TAX: a whole
genome tiling array resource for developmental
expression analysis and transcript identification
in Arabidopsis thaliana. Genome Biol 9:R112
27. Zeller G et al (2009) Stress-induced changes in
the Arabidopsis thaliana transcriptome analyzed using whole-genome tiling arrays. Plant J
58:1068–1082
28. Brady SM et al (2007) A high-resolution root
spatiotemporal map reveals dominant expression patterns. Science 318:801–806
29. Obayashi T, Kinoshita K (2009) Rank of correlation coefficient as a comparable measure for
biological significance of gene coexpression.
DNA Res 16:249–260
30. Dubreucq B et al (2000) The Arabidopsis
AtEPR1 extensin-like gene is specifically
expressed in endosperm during seed germination. Plant J 23:643–652
31. Nole-Wilson S, Tranby TL, Krizek BA (2005)
AINTEGUMENTA-like (AIL) genes are
expressed in young tissues and may specify
meristematic or division-competent states.
Plant Mol Biol 57:613–628
32. Chattopadhyay S et al (1998) Arabidopsis
bZIP protein HY5 directly interacts with light-responsive promoters in mediating light control of gene expression. The Plant Cell Online
10:673–684
33. Higo K et al (1999) Plant cis-acting regulatory
DNA elements (PLACE) database: 1999.
Nucleic Acids Res 27:297–300
34. Liu B, Chen J, Shen B (2011) Genome-wide
analysis of the transcription factor binding preference of human bi-directional promoters and
functional annotation of related gene pairs.
BMC Syst Biol 5:S2
35. Ouyang X et al (2011) Genome-wide binding
site analysis of FAR-RED ELONGATED
HYPOCOTYL3 reveals its novel function in
Arabidopsis development. The Plant Cell
Online 23:2514–2535
36. Zhang H et al (2011) Genome-wide mapping
of the HY5-mediated gene networks in
Arabidopsis that involve both transcriptional
and post-transcriptional regulation. Plant J 65:
346–358
37. Razem FA et al (2006) The RNA-binding protein FCA is an abscisic acid receptor. Nature
439:290–294
38. Ashburner M et al (2000) Gene ontology: tool
for the unification of biology. Nat Genet
25:25–29
39. Baud S et al (2002) An integrated overview of
seed development in Arabidopsis thaliana ecotype WS. Plant Physiol Biochem 40:151–160
40. Hua S, Sun Z (2001) Support vector machine
approach for protein subcellular localization
prediction. Bioinformatics 17:721–728
41. Horton P et al (2007) WoLF PSORT: protein
localization predictor. Nucleic Acids Res 35:
W585–W587
42. Aranda B et al (2009) The IntAct molecular
interaction database in 2010. Nucleic Acids
Res 38:D525–D531
43. Stark C et al (2011) The BioGRID Interaction
Database: 2011 update. Nucleic Acids Res
39:D698–D704
44. Li P et al (2011) AtPID: the overall hierarchical
functional protein interaction network interface and analytic platform for Arabidopsis.
Nucleic Acids Res 39:D1130–D1133
45. Klopffleisch K et al (2011) Arabidopsis
G-protein interactome reveals connections to
cell wall carbohydrates and morphogenesis.
Mol Syst Biol 7
46. Nakamura S, Lynch TJ, Finkelstein RR (2001)
Physical interactions between ABA response
loci of Arabidopsis. Plant J 26:627–635
47. Cui H et al (2007) An evolutionarily conserved
mechanism delimiting SHR movement defines
a single layer of endodermis in plants. Science
316:421–425
48. De Lucas M et al (2008) A molecular framework for light and gibberellin control of cell
elongation. Nature 451:480–484
49. Dill A, Jung HS, Sun T (2001) The DELLA
motif is essential for gibberellin-induced degradation of RGA. Proc Natl Acad Sci 98:14162
50. Nordborg M et al (2005) The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol
3:e196
51. Choi H (2000) ABFs, a family of ABA-responsive element binding factors. J Biol
Chem 275:1723–1730
52. Eden E et al (2009) GOrilla: a tool for discovery and visualization of enriched GO terms in
ranked gene lists. BMC Bioinformatics 10:48
Part III
Genetic Techniques
Chapter 6
Exploiting Natural Variation in Arabidopsis
Johanna A. Molenaar and Joost J.B. Keurentjes
Abstract
Natural variation for many traits is present within the species Arabidopsis thaliana. This chapter describes
the use of natural variation to elucidate genes underlying the regulation of quantitative traits. It deals with
the development and use of mapping populations, the detection and handling of genetic markers, the
phenotyping of quantitative traits, and, finally, QTL analyses. The focus of the chapter is on the use and
development of recombinant inbred lines, but other types of segregating populations, including genome-wide association mapping in natural populations, are also discussed.
Key words Natural variation, Quantitative trait, QTL mapping, Recombinant inbred lines, Genome-wide association mapping
1 Introduction
For many properties of plants, natural variation exists between and
within species. Natural variation is defined as genome-encoded differences causal for phenotypic variation and is regarded as a major
driving force in adaptation and species formation. In addition, the
acknowledgement of heritable variation in specific traits has greatly
contributed to agricultural crop improvement. Ever since the
domestication of wild species, some 10,000 years ago, farmers have
sought optimal crop varieties to grow. Initially, natural varieties
of species were evaluated, and new crop varieties were developed
by stringent performance selection of founder lines used for breeding. This resulted in crops which were better adapted to local climates, more resistant to diseases, and yielding higher amounts of
harvestable product [1]. Since the discovery of the structure of DNA, however, knowledge of genome-encoded information has increased exponentially. This enabled
a shift from phenotypic towards genotypic selection methods,
greatly increasing the pace and accuracy of modern breeding practices.
Crucial here is the identification of the relationship between genotype and phenotype, for which a number of methods have been
developed. The notion that natural variation can be instrumental
in the identification of genetic regulation of quantitative traits has
also contributed substantially to our fundamental understanding of
key biological and evolutionary processes. Functional analysis of
natural variants enabled the detection of genetic factors controlling
essential steps in plant development and performance. For obvious
reasons, e.g., long generation times and complex genome structure, crop species are not ideal for the genetic and functional analysis of most traits. Since many traits are evolutionarily conserved,
model species are nowadays widely used for the elucidation of the
mechanistic basis of plant biology [2].
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_6, © Springer Science+Business Media New York 2014
Arabidopsis thaliana is perfectly suited as the reference species
in modern plant sciences. It combines rapid generation cycles with
high reproductive success and contains a small genome. Because it
is an autogamous species, homozygous inbred lines can be obtained
in which genotypes are fixed, allowing propagation and multiplication of isogenic lines. Nonetheless, it tolerates intraspecific crosses
yielding viable offspring with genomic and functional segregation
in subsequent generations. In addition, it allows interspecific crossing with some of its close relatives, although progeny of such combinations is often sterile. Importantly, Arabidopsis has a worldwide
geographic distribution covering a diversity of growing habitats
[3]. Adaptation of accessions to this variety of local environments
over the course of evolution has led to a wealth of natural variation
in many complex traits. These properties make Arabidopsis the
species of choice for genetic analyses of many life history traits.
Over the last decades, natural variation has been exploited to elucidate the
genetics underlying both qualitative and quantitative traits [4].
Where the genetic analysis of qualitative traits is quite straightforward, it is much more complicated in quantitative traits. Qualitative
traits are typically regulated by a limited number of genes resulting
in discrete phenotypic classes that can easily be associated to
genomic regions using simple Mendelian genetics. Quantitative
traits, however, often show a continuous distribution of trait values
over different genotypes, which makes it difficult to assign phenotypes to distinct classes. The reason for this quantitative nature of
phenotypic expression is the involvement of a multitude of genes
each contributing moderate to small effects. Gene-by-gene (epistasis) and genotype-by-environment (GxE) interactions further
complicate the genetic regulation of quantitative traits. To account
for the complexity of the genetic architecture underlying quantitative traits, more sophisticated statistical analysis methods are
required for the identification of quantitative trait loci (QTLs).
QTLs are defined as genomic regions involved in the genetic regulation of a specific trait and in which allelic variation explains a
significant part of the phenotypic variation observed in this trait.
The detection of QTLs, known as genetic linkage mapping, is
based on the principle of linkage disequilibrium (LD). Basic
Mendelian genetics teaches that genetic factors in close vicinity of
each other, i.e., located on the same chromosomal arm, are inherited
together. This linkage can only be broken by a recombination event during meiosis. The further apart two loci are on a chromosome, the larger the chance that a crossover occurs between
them. The relationship between recombination frequency and
genomic remoteness was first recognized by Thomas Hunt
Morgan, and hence, the genetic distance is expressed in centiMorgan (1 % rf = 1 cM). It can easily be deduced that if two loci inherit
independently, gametes have a 50 % chance of carrying a recombinant genotype. From this, it can be concluded that genetic distances above 50 cM cannot be discriminated from random
segregation of unlinked loci. Such loci are then referred to as being
in linkage equilibrium. If two loci are co-inherited to some extent
these are referred to as being in linkage disequilibrium.
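The relationship between recombination frequency and genetic distance can be made concrete with a small Python sketch. It uses the convention from the text (1 % recombination frequency = 1 cM), which is a good approximation only for closely linked loci. The Haldane mapping function is included as one standard correction for undetected double crossovers; it is not taken from this chapter. The gamete counts are hypothetical.

```python
import math


def recombination_frequency(n_recombinant, n_total):
    """Fraction of gametes carrying a recombinant genotype for two loci."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_recombinant / n_total


def map_distance_cm(rf):
    """Genetic distance using the convention 1 % recombination frequency = 1 cM."""
    return 100.0 * rf


def haldane_cm(rf):
    """Haldane mapping function: d = -50 * ln(1 - 2 * rf), in cM.

    Assumes crossovers occur independently (no interference); undefined for rf >= 0.5,
    where loci cannot be distinguished from unlinked loci.
    """
    if not 0.0 <= rf < 0.5:
        raise ValueError("rf must lie in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * rf)


# Hypothetical example: 12 recombinant gametes among 200 scored.
rf = recombination_frequency(12, 200)
print(rf, map_distance_cm(rf), round(haldane_cm(rf), 2))
```

For small recombination frequencies the two estimates are close. They diverge as the recombination frequency approaches 50 %, which is the point at which linked and unlinked loci can no longer be told apart.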
Although the genetic distance is related to the physical distance, this relationship is not always linear. While the physical distance between two loci is determined by a fixed number of
nucleotides, the genetic distance is estimated by the number of
crossovers between them. Because the frequency of recombination
is dependent on a number of different factors, the cM to bp ratio
is not constant over the genome. The highly heterochromatic centromeric regions, for instance, are almost completely devoid of
crossovers, resulting in large physical distances between genetically
closely linked loci. Fortunately, the gene density in heterochromatic regions is much lower than in euchromatin, where this relationship is much tighter. Genetic maps can therefore be a good
proxy for the physical position of QTLs.
Linkage mapping detects associations between the phenotype
and the underlying genotype in an indirect way. Genotypic differences between accessions are determined by sequence polymorphisms that can serve as genetic markers to identify the parental
descent of genomic regions. Although most polymorphisms will be
functionally neutral, some of them might be close enough to be in
LD with the causal factor explaining a QTL. Genetic linkage mapping thus requires genome-wide coverage of markers that are statistically tested for association with variation in the trait of interest.
Any significantly associated markers (QTLs) are in LD with allelic
variation responsible for the observed phenotypes and hence hint
at the position of the causal gene.
To identify the genetic factors underlying quantitative traits,
large collections of individuals showing natural variation must be
analyzed. Such collections can consist of wild accessions, but
Arabidopsis QTL mapping is most powerful in experimental mapping populations that segregate for the trait of interest. Although
many different types exist, biparental populations descending from
a cross between two distinct accessions are most popular. Widely
used are recombinant inbred lines (RILs) derived from F2
individuals, which are the progeny of a hybrid of two distinct
homozygous accessions, by single seed descent. Many such RIL
populations have been generated, and we will focus this chapter on
the development and analysis of these genetic resources. Because
RILs are inbred for several generations, they are homozygous and
can therefore be propagated and used indefinitely to study many
complex traits in various conditions. In addition, accurate phenotyping can be achieved efficiently because the genetic material
under investigation can be analyzed in isogenic replications.
For the sake of completeness, we will briefly discuss some
alternatives to RIL populations without addressing
their generation and use in detail. A faster alternative to developing RIL populations is to analyze F2 populations, in which
large parts of the genome of each individual are still heterozygous. The advantages here are the fast generation and the possibility to determine dominance effects. However, the lower number of
recombination events decreases mapping resolution, and the increased
complexity due to heterozygosity reduces mapping power. So, to
achieve reasonable statistical power and resolution, much larger
population sizes are needed. The largest disadvantage, however, is
the further segregation of the heterozygous regions in subsequent
generations. Such populations can therefore not be maintained,
while an equal genotyping investment is needed for their analysis.
Moreover, experimental replications of genotypes are not available
since each individual has a unique genetic makeup. As a consequence, phenotyping and genotyping must be carried out on the
same plant. A second frequently used alternative to RIL populations
is a set of near isogenic lines (NILs) or introgression lines (ILs). A NIL
has an identical genome (isogenic) to one of the parental lines (the
background line) except for a small region (an introgression) which
is derived from the donor parental line. NILs can be created from
an F1 via several rounds of backcrossing and selfing [5]. The generation of a set of NILs is very laborious but can be very useful,
because it allows a single QTL to be studied at a time, avoiding
complications from the segregation of multiple loci (e.g., epistasis).
Intensive marker evaluation over multiple generations is needed to
obtain a population with genome-wide coverage. In Arabidopsis, two
such populations have been developed [6, 7]. Because of inbreeding depression, NILs are often the only viable option for immortal
populations in many species. Lastly, doubled haploids (DHs) are
frequently used to identify QTLs in many crop species. Only
recently has it become possible to construct DH populations for Arabidopsis. For
this, F1 plants are crossed to a genome elimination mutant line,
which converts the recombinant F1 gametes to viable haploid
seeds. Occasionally, the resulting haploid plants spontaneously
undergo a whole-genome duplication, yielding viable homozygous
diploid seeds [8]. In this way, an immortal homozygous population
reminiscent of RILs can be obtained in only three generations.
Although the resolution of DH populations is lower due to reduced
recombination frequencies compared to RILs, this type of population can become an important tool in the near future.
The two major disadvantages of biparental mapping populations are the poor mapping accuracy and the limited genetic variation present between only two accessions, leaving genetic variants
present in other genotypes of the species undetected. To overcome
some of these limitations, several advanced RIL populations have been
developed for Arabidopsis in which multiple founder lines are used
to incorporate more natural variation, in addition to several generations of intercrossing to improve the mapping accuracy [9–12].
Alternatively, more natural variation can be investigated by jointly
analyzing multiple populations of different types and origin [13–15]. As noted above, a different approach to exploit the genetics
underlying natural variation is the use of collections of wild accessions in so-called genome-wide association studies (GWAS) [16].
Developed in human genetics, where generating experimental populations is ruled out for ethical reasons, this method uses a
subset of natural accessions available and aims to associate trait differences with specific genotypes. As such the principles of GWA
mapping do not differ much from classical linkage analysis but due
to the fast LD decay in natural Arabidopsis populations, often
within 10 kbp, a much denser genotype map is required [17]. This
fast decay is the result of the high number of historical recombination events accumulated during the evolutionary history of the species. Consequently, significant associations have very small support
intervals, which simplifies the detection of the causal gene underlying the QTL enormously [18]. Although the allelic variation analyzed and the acquired resolution are much higher in GWA mapping
of natural populations, the statistical power is much lower than in
experimental populations. Correction for population structure giving rise to false negatives, the presence of multiple small-effect or
rare large-effect alleles, and the co-segregation of many QTLs are
only a few of the many confounding factors, and no consensus has
yet been reached about the preferable statistical methods [19].
2 Materials
1. Seeds of Arabidopsis accessions and mapping populations
(www.arabidopsis.org, ABRC stock center; http://www.inra.fr/internet/Produits/vast/RILs.htm) (see Note 1).
2. Equipment to cross plants (tweezers, stereo microscope, and
labels).
3. Facilities to grow many plants simultaneously, under the assay
conditions, which is necessary in order to perform the quantitative analysis of whole accession collections or mapping populations. Specific requirements will depend on the particular
test conditions.
4. Equipment to genotype molecular genetic markers. This
might be as simple as oligonucleotides and PCR consumables,
a thermocycler and an agarose gel system, for standard PCR
markers; or apparatus and reagents for high-throughput genotyping of polymorphisms, such as microarray and sequencing
technology.
5. Equipment to measure the quantitative trait(s) of interest.
Depending on the biological parameter to be measured, this
might range, for instance, from a simple ruler to a luciferase
luminometer or a microarray scanner.
6. Software for general statistical analysis (e.g., SAS, SPSS, or
GENSTAT packages), for linkage mapping analysis (e.g.,
MAPMAKER or JoinMap), and for QTL analysis (e.g.,
MAPMAKER/QTL, Map Manager QTX, MapQTL,
MultiQTL, PlabQTL, QTL Cartographer, R/qtl, or QTL
Express).
3 Methods
As explained in the Introduction, the ultimate goal of QTL analysis
is to elucidate the genes that are causal for a certain phenotype. To
achieve this goal, a number of steps have to be followed. In this
chapter, we describe the different steps to perform a QTL analysis,
dealing with each step in turn. In
the first part, some basic knowledge required before starting a mapping experiment is discussed. We then discuss how to develop and
genotype a mapping population, and finally, the actual linkage
mapping will be explained. In addition, some theoretical/statistical
background will be given about data analysis and how to interpret
results. This section will focus on QTL analyses in RIL populations,
since this type of population is used most often, but the principles
are widely applicable to a range of different population types.
3.1 Natural Variation, Heritability, and Phenotyping Assays
Before performing a QTL analysis in Arabidopsis, a number of
things need to be considered. Importantly, natural heritable variation for the trait of interest should be present within the species,
and a phenotyping assay should be available to quantify the observed
variation. When these requirements have been met, a mapping
population segregating for the trait of interest should be available
or needs to be developed. To gain information about natural variation for the trait of interest, a selection of different natural accessions can be phenotyped. Such a selection ideally consists of the
most diverse accessions which can be determined by morphological
differences, geographical distribution, or genotypic information.
Many accessions show large differences in morphological properties which often have pleiotropic effects on many other traits. Much
of the morphological variation is the result of adaptation to local
environments, and therefore, selection for geographic origin can
increase the chance of detecting natural variation. The selection
pressure over time has been different between origins, and information about the (climate) conditions in the places of origin can be
helpful to select accessions that are expected to be different for the
trait of interest. Finally, for an increasing number of accessions,
genotypic information is publicly available, which can be mined to
select the genetically most diverse accessions (www.1001genomes.org; [20]). In general, these selection criteria are highly related,
since geographically distant accessions are often reproductively isolated, leading to distinct genotypic profiles. Most accessions can be
retrieved from the stock centers (ABRC, NASC). It might be
worthwhile though to include parental accessions of existing RIL
populations in the initial screen. This has the advantage that a mapping population is already available when phenotypic variation is
detected. A list of existing RIL populations is available at
http://www.inra.fr/internet/Produits/vast/RILs.htm.
To reliably estimate phenotypic trait values for each
selected accession, replicate measurements on different individuals
need to be performed. The number of replicates depends on the
robustness of the trait, but a minimum of five is advisable. The variance estimates of these initial experiments are informative for the
sources of variation and the inheritance of traits. An important part
of the total detected variation is non-genetic residual variation,
which can be broken down into technical and biological variation.
Technical variation includes sample treatment and measurement
error, which can be estimated and averaged out by replicate analytical
measurements of the same sample or individual. Biological variation, however, is defined as the variation observed between replicate individuals of the same genotype and is often the result of the
interaction with the environment. Small local differences during
seedling establishment or due to positional placement in the growing facility can strongly enhance phenotypic differences, and uniform growing conditions are therefore recommended. Residual
variation is random and as such introduces noise in the estimation
of trait means. However, when accurate estimates of mean trait
values can be obtained for different accessions, any observed differences can be attributed to genetic variation. The proportion of
genetic variation in relation to the total variation is referred to as
broad sense heritability, expressed as H² = Vg/(Vg + Ve), where Vg is
the genetic variation and Ve is the residual variation. Broad sense
heritability estimates indicate how much of the observed phenotypic
variation can be explained by genetic factors in a given experimental setup. In general, it is more likely to detect QTLs for traits with
high heritability values, especially if the genetic variation is explained
by a limited number of loci (see Note 2). When good heritabilities
can be obtained, two genotypes should be chosen as the parents of
the mapping population that will be used for further QTL analysis.
Although traits can segregate in progeny of phenotypically similar
parents, it is usually best to choose two opposing extremes as parents. These extremes are most likely to differ for genetic factors
controlling the trait of interest. Creating a new RIL population,
however, is laborious and time-consuming, and screening an existing population with less extreme parents may then be preferable.
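As an illustration of how H² = Vg/(Vg + Ve) might be estimated from a replicated accession screen, the following Python sketch uses variance components from a one-way analysis of variance. This estimator is one common choice and is an assumption here, not a method prescribed in this chapter; the function name, the data layout (a mapping from accession name to replicate measurements), and the example values are hypothetical.

```python
from statistics import mean


def broad_sense_heritability(replicates):
    """Estimate H2 = Vg / (Vg + Ve) from replicate measurements.

    replicates maps each genotype to a list of trait values measured on
    different individuals. Ve is the within-genotype mean square and Vg
    is derived from the between-genotype mean square (one-way ANOVA).
    """
    groups = [list(values) for values in replicates.values()]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two genotypes and some replication")

    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective number of replicates; equals r for a balanced design.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)

    v_e = ms_within
    v_g = max((ms_between - ms_within) / n0, 0.0)  # negative estimates set to 0
    if v_g + v_e == 0.0:
        raise ValueError("no variation in the data")
    return v_g / (v_g + v_e)


if __name__ == "__main__":
    # Hypothetical flowering-time data (days) for three accessions.
    data = {"acc1": [10, 12], "acc2": [20, 22], "acc3": [30, 32]}
    print(f"H2 = {broad_sense_heritability(data):.3f}")
```

The value obtained this way refers to single-plant measurements in the given experimental setup; it will change when growing conditions or the phenotyping assay change.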
3.2 Development of a Population of Recombinant Inbred Lines
When no suitable genetic resources are available for the trait of
interest, a novel mapping population can be created. The development of an RIL population is rather straightforward but laborious
and time-consuming. To reach full homozygosity, each line needs at
least eight generations of inbreeding. The time needed to complete
a population depends, therefore, on the life cycle of the individuals, which is largely determined by the time required to flower.
Some accessions, like the frequently used lab strains Columbia and
Landsberg erecta, flower within a month after germination under long-day conditions. Other accessions can flower much later or might
even need a vernalization treatment of several weeks to induce flowering. Many accessions also produce dormant seeds, which lengthens the
time between rounds of inbreeding because a certain period of after-ripening is required. Another feature to consider before starting
to develop RILs is the population size needed. The size of a RIL
population is an important factor that influences the detection of a
QTL. Larger population sizes increase the QTL detection power
and resolution. From various studies, it is clear that QTLs explaining approximately 10 % of the total variance have roughly an 80 %
chance of being significantly detected in a population of 200 individuals. The probability of detecting a QTL decreases more or
less linearly with smaller population sizes [6, 21]. Most existing RIL
populations consist of 100–200 individuals. Given the genome size
of Arabidopsis and inbreeding until full homozygosity, introgressions in individual RILs will span on average 6–12 Mb (~30–60 cM),
leading to a mapping resolution of 1–2 Mb (~5–10 cM) in such
medium-sized populations. It is advisable to develop a larger
number of RILs from which a core collection can be selected that is
optimized for recombination frequency and allele distribution. The
steps to create a RIL population are described below:
1. Grow the two parental accessions simultaneously so that they
flower at the same time. Use a stereo microscope to remove the anthers
from flowers of the female plant (emasculation) to prevent self-fertilization, and pollinate the stigma by hand with pollen of
the male plant (see Note 3). Harvest the F1 seeds when the
siliques become yellow. Seeds might be dormant, and it is better not to use freshly harvested seeds for the next round, but
to store them for at least 1 month. The residual dormancy can
be broken by incubating seeds in cold conditions for 3–5 days
before germinating.
2. Make sure to check whether the cross in step 1 was successful
by testing the F1 plants for heterozygosity with polymorphic
markers. F2 seeds are generated by selfing the obtained F1s.
Because the F2 seed is the result of a fusion of two recombined
F1 gametes, each germinated F2 plant consists of a 1:1 mosaic
of the two parental genomes. Since meiotic recombination
occurs at random, no two gametes, or F2 plants, are identical,
and the two parental genomes segregate independently.
3. From the F2 onward, individual plants have a unique genetic
makeup and are propagated by single seed descent. Grow as
many F2 plants as needed to reach the desired population size
and label each plant with a unique identifier. Make sure that
plants cannot cross-pollinate but are self-fertilized. Seeds need
to be harvested from each plant separately, and a few seeds are
used to grow the next generation. From these, a single plant is
randomly chosen to harvest seeds from. Be careful not to bias
the selection by favoring the best looking or earliest flowering
plant. To circumvent any unintentional selection bias one can,
for instance, always harvest the third replicate of a line in any
generation. In each generation, only a single plant is harvested
per line, and seeds from this plant are used for the next generation. Repeat this procedure until the F8 is reached. In every
generation of inbreeding, the amount of heterozygosity is
halved, reaching less than 0.5 % ((½)⁸) in the F8. From this generation onward, plants are almost completely homozygous,
and lines can be bulk propagated. The RILs can now be used
for genotyping and phenotyping studies (see Note 4).
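The halving of heterozygosity during single seed descent can be tabulated with a few lines of Python. This is a minimal sketch that counts rounds of selfing starting from a fully heterozygous F1 hybrid; the function names are hypothetical. Under this counting, eight rounds of selfing give (½)⁸, i.e., about 0.39 %.

```python
def expected_heterozygosity(rounds_of_selfing: int) -> float:
    """Expected heterozygous fraction after selfing a fully heterozygous F1.

    Each round of selfing halves the expected heterozygosity.
    """
    if rounds_of_selfing < 0:
        raise ValueError("rounds_of_selfing must be non-negative")
    return 0.5 ** rounds_of_selfing


def rounds_needed(max_heterozygosity: float) -> int:
    """Smallest number of selfing rounds that reaches the given level."""
    if not 0.0 < max_heterozygosity <= 1.0:
        raise ValueError("max_heterozygosity must be in (0, 1]")
    rounds = 0
    while expected_heterozygosity(rounds) > max_heterozygosity:
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in range(1, 9):
        print(f"{n} round(s) of selfing: "
              f"{100 * expected_heterozygosity(n):.2f} % heterozygous")
    print("rounds needed for < 0.5 %:", rounds_needed(0.005))
```

These are expectations averaged over the genome and over lines; individual lines can retain more or less residual heterozygosity, which is what makes the selection of HIFs from earlier generations possible.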
3.3 Development of a Linkage Map
In order to assign phenotypic variation to specific genomic differences between individuals of a population, they need to be genotyped. Each line of a RIL population consists of a mosaic of
maternal and paternal genomic introgressions. Genetic markers are
used to elucidate which regions descended from the mother or
father line, respectively. Mapping populations can be genotyped
with any marker technique available. The first genetic markers used
were morphological polymorphisms with an easily observable
(mutant) phenotype. Here, a single polymorphism is responsible
for a change in phenotype and therefore segregates in a
Mendelian fashion. The first published genome-wide linkage maps
of Arabidopsis consisted of artificially induced phenotypic mutant
markers [22]. With the introduction of PCR (polymerase chain
reaction) technology in the 1980s, it became possible to develop
markers based on sequence differences without a clearly related
phenotype [23]. Genomic polymorphisms, like deletions, insertions, and single-nucleotide polymorphisms (SNPs), are much
more abundant in natural accessions and can be detected at the
DNA level, independent of a phenotype or developmental stage.
PCR-based markers can be classified as dominant or codominant.
Dominant markers, e.g., AFLPs and RAPDs, only give information about the presence or absence of an allele. No distinction can
be made between individuals being heterozygous or homozygous
for the dominant allele. Because both parents of the population
will carry dominant alleles for specific markers, a separate map for
each parent often needs to be created. These maps can be integrated using codominant markers or specialized software which
recognizes male or female dominant markers. Codominant marker
technology, e.g., INDELs and microsatellites, also provides information about allele dosage and is the preferred method
nowadays. Currently, in Arabidopsis, molecular markers can be
detected by high-throughput genotyping technologies such as
hybridization arrays [24] and next-generation sequencing [25].
These technologies enable the detection of a large fraction of the
genomic variation, e.g., SNPs, INDELs, and genome rearrangements, between population individuals.
To be able to perform genetic mapping, accurate maps with
sufficient marker density are needed. A genome-wide linkage map,
i.e., the order and position of the markers in the genome, can be
created by determining the recombination frequencies between
markers. As outlined above, the smaller the physical distance
between two markers, the lower the recombination frequency
between them. Distances between markers are expressed in centiMorgan (cM); 1 cM corresponds to 1 recombination event per
100 meioses. In Arabidopsis, the markers are placed in five linkage
groups corresponding to the five chromosomes. A marker is
assigned to a particular linkage group if it shows significant linkage
to any marker belonging to that group. To determine which group
corresponds to which chromosome, information is needed about the physical position of at least one marker in each linkage group. For most sequence-based markers, physical information
is publicly available and can be obtained via TAIR. Many morphological markers have also been cloned, and the positions of their
corresponding mutations are known [26]. Linkage maps can nowadays be easily created using dedicated software packages such as
JoinMap or MAPMAKER. For proper QTL analysis, each position
in the genome needs to be in linkage disequilibrium with at least one
molecular marker. The number of markers needed to satisfy this condition depends on the LD decay. Populations with a fast LD decay
need more markers than populations with a slow LD decay. For RIL
population sizes smaller than 200 lines, a density of 1 marker per
5 cM is sufficient to detect the vast majority of crossovers. Unequal
distribution of markers over the genome leads to larger confidence
intervals and lower detection power than needed [27]. The subsequent steps to create a linkage map are described below:
1. Grow each individual RIL of the population to collect
plant material for DNA extraction.
Depending on the genotyping technology, various extraction
protocols are available. DNA samples are labelled according to
their respective line number.
2. Choose a preferred marker technology, and use the extracted
DNA of each individual for genotyping. Each individual should
be genotyped with the same markers. For medium-sized populations of Arabidopsis, 100 evenly spaced markers correspond
approximately to a 5 cM resolution. Score the genotype of all
individuals for each analyzed marker in a genotype file. Check
each individual for quality, and remove individuals with many
missing data or spurious genotype calls. Check marker quality
and remove low-quality markers (see Note 5).
3. Use a genetic mapping software package to determine the
linkage between markers and to assign them to one of the five
chromosomes of Arabidopsis. Such programs estimate the
recombination frequency and its statistical significance for all
pairwise combinations of markers. Markers are assigned to different linkage groups in a specific order. Determine the corresponding chromosome of each linkage group by checking
the physical position of some markers. Inspect the resulting
linkage map for gaps and include more markers where appropriate. Check each marker of the final map for segregation
distortion and determine the cause (see Note 6).
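Dedicated mapping software performs the calculations in step 3, but the two basic quantities involved can be sketched in Python. The genotype coding ("A" and "B" for the two parental alleles, "-" for a missing call) and all function names are hypothetical. The conversion from the fraction of recombinant RILs to a per-meiosis recombination frequency uses the Haldane and Waddington relation for selfed RILs, R = 2r/(1 + 2r); this relation is not given in this chapter and is added here as an assumption about how the sketch should treat RIL data.

```python
MISSING = "-"  # hypothetical code for a missing genotype call


def recombinant_fraction(marker_a, marker_b):
    """Fraction of lines whose parental genotype differs between two markers."""
    pairs = [(a, b) for a, b in zip(marker_a, marker_b)
             if MISSING not in (a, b)]
    if not pairs:
        raise ValueError("no lines with calls at both markers")
    return sum(a != b for a, b in pairs) / len(pairs)


def per_meiosis_rf(ril_fraction: float) -> float:
    """Invert R = 2r / (1 + 2r) for selfed RILs, capped at r = 0.5."""
    if ril_fraction >= 0.5:
        return 0.5
    return ril_fraction / (2.0 * (1.0 - ril_fraction))


def segregation_chi2(marker) -> float:
    """Chi-square statistic (1 df) for deviation from 1:1 segregation."""
    calls = [g for g in marker if g != MISSING]
    n_a, n_b = calls.count("A"), calls.count("B")
    n = n_a + n_b
    if n == 0:
        raise ValueError("no genotype calls for this marker")
    expected = n / 2.0
    return ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected


if __name__ == "__main__":
    # Hypothetical calls for ten RILs at two neighboring markers.
    m1 = "AAAABBBB-A"
    m2 = "AAABBBBA-A"
    big_r = recombinant_fraction(m1, m2)
    print(f"recombinant lines: {big_r:.3f}, "
          f"per-meiosis rf: {per_meiosis_rf(big_r):.3f}")
    # Values above 3.84 indicate distortion at the 5 % level (1 df).
    print(f"chi-square for m1: {segregation_chi2(m1):.2f}")
```

A marker that shows strong distortion while its neighbors do not is a candidate genotyping error, whereas a block of distorted neighboring markers points to a genetic cause (see Note 6).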
3.4 Linkage Mapping
When a genetic linkage map and the corresponding marker genotype data for each individual of the population are available, QTL
analyses can be performed. For this, each RIL needs to be analyzed
for a specific trait, and the segregation of trait values is then compared to the segregation of the two parental genotypes over all
marker positions. Significant co-segregation is then defined as a
QTL. The most basic QTL analysis is a Student's t-test
for each marker, in which the subset of RILs with the maternal
genotype is tested against the paternal subset (for populations that
contain heterozygous lines, see Note 7). For the use of dominant
markers, see Note 8. More sophisticated software packages have
automated this procedure for genome-wide analysis and use a
variety of different algorithms to optimize for speed and accuracy.
MapQTL and QTL Cartographer are most frequently used in
Arabidopsis, but other packages such as QTL express, PlabQTL,
and plugins available for the statistical platforms R and Genstat are
also in use. Such programs need three types of input files: a file
with the phenotype data for each line of the population, a second
one with the marker genotype data for each line, and a file with the
genome-wide linkage map. Most software packages allow the user
to choose which method will be used for the QTL analysis. The
simplest method is referred to as single-marker ANOVA in which
the mean values of the two genotype groups will be evaluated per
marker. This results in a t- or F-statistic for each marker position.
However, most methods used today apply interval mapping, in
which positions between markers can also be tested. At positions
within marker intervals, the QTL likelihood is estimated using the
recombination frequency between neighboring markers. The LOD
score (logarithm of the odds) or the deviance (D) is used to express
the significance of genotypic differences (see Note 9). The power
to detect multiple simultaneously segregating QTLs can be
increased by modifying the statistical model. In the modified
model, the presence of validated QTLs is taken into account when
testing for QTLs at other loci. Such analyses are known as composite interval (CIM) or multiple QTL model (MQM) mapping. A
marker most closely linked to a known QTL is added to the statistical model as a cofactor to correct for the effect of that QTL (see
Note 10). The output of mapping programs consists of graphical
displays of LOD scores along the genome, where significant scores
indicate QTL positions. Significance levels are usually determined
by a permutation test. More detailed information about the additive effect and explained variance for each genomic position tested
is given in result tables. The additive effect of a locus is defined as
the difference between the mean trait value of the two genotypic
classes. The explained variance indicates which part of the total
variance is explained by a particular locus or all loci. Because
LD decay is quite slow in RIL populations, the position of a QTL is
usually assigned to a confidence or support interval. Most commonly
used are 2-LOD support intervals, which span the region around the
QTL peak in which the LOD score is within 2 units of its maximum. Identified QTLs can be
further tested statistically for genetic interactions with other
genomic regions (epistasis). For this, standard statistical analyses,
such as ANOVA, will suffice.
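The single-marker analysis and the 2-LOD support interval described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the algorithm of any particular package: genotypes are coded "A" and "B" (a hypothetical coding), the LOD score is computed from the residual sums of squares of a normal model as (n/2) × log10(RSS0/RSS1), and the support interval is bounded by tested positions only.

```python
import math


def single_marker_test(trait, calls):
    """Compare the trait means of the two genotype classes at one marker."""
    a = [y for y, g in zip(trait, calls) if g == "A"]
    b = [y for y, g in zip(trait, calls) if g == "B"]
    if not a or not b:
        raise ValueError("both genotype classes must be present")
    values = a + b
    n = len(values)
    grand_mean = sum(values) / n
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    rss0 = sum((y - grand_mean) ** 2 for y in values)  # model without QTL
    rss1 = (sum((y - mean_a) ** 2 for y in a)
            + sum((y - mean_b) ** 2 for y in b))       # model with QTL
    if rss1 <= 0.0:
        raise ValueError("no residual variation left; LOD is undefined")
    return {
        "lod": max(0.0, 0.5 * n * math.log10(rss0 / rss1)),
        "effect": mean_b - mean_a,          # difference between class means
        "explained": 1.0 - rss1 / rss0,     # fraction of total variance
    }


def lod_support_interval(positions, lods, drop=2.0):
    """Return (left, peak, right) positions of the LOD support interval."""
    peak = max(range(len(lods)), key=lods.__getitem__)
    limit = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= limit:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= limit:
        right += 1
    return positions[left], positions[peak], positions[right]


if __name__ == "__main__":
    # Hypothetical trait values for six RILs and their calls at one marker.
    result = single_marker_test([1, 2, 3, 11, 12, 13], "AAABBB")
    print({key: round(value, 3) for key, value in result.items()})
    # Hypothetical LOD profile along one chromosome (positions in cM).
    print(lod_support_interval([0, 5, 10, 15, 20], [0.5, 2.5, 4.0, 3.0, 1.0]))
```

The effect is reported here as the difference between the two class means, following the definition used in this section; some programs report half this difference, so output conventions should be checked before comparing results.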
The effect of the identified QTLs can be validated in NILs or
HIFs. Such lines can also be used for fine-mapping purposes when
backcrossed to one of the parental lines. The ultimate goal here is
the molecular isolation of the genes underlying individual QTLs
(causal gene or quantitative trait gene) and the identification of the
DNA polymorphism altering the function of the gene and causing
the phenotypic variation (causal nucleotide or quantitative trait
nucleotide). Functional analysis of candidate genes can be started
if the support interval of the QTL is small enough to follow up all
underlying genes or if an obvious candidate gene is available.
Mutants, knockout or gene silencing (RNAi), and overexpression lines can be analyzed for an effect on the trait of interest (see
Note 11). Complementation of a mutant phenotype by transformation or crossing provides another line of evidence that natural
variation in a gene is causal for a certain phenotype. (Re)sequencing of the QTL region or candidate genes can give information
about possible causal nucleotide polymorphisms. The
steps to perform a QTL analysis are described below:
1. Grow each individual of the population in as many replicates
as possible to acquire the best estimate of a line’s trait value.
Although replicates are not strictly necessary, since genotypic
replication is present in the population structure, it often
improves mapping power. In addition, it allows for heritability
estimates. Make sure to grow the parental lines as well to
determine the parental differences. Quantify the trait of interest with an appropriate assay as accurately as possible. Standardize
measurements as much as possible, in terms of developmental
stage and environmental conditions, for all individuals.
2. Enter the quantitative trait data in a loading file of the appropriate format for the software used. Follow the manufacturer's instructions for loading trait data, genotypes, and genetic
map into the preferred program and run the QTL analyses.
Determine significance thresholds for each trait separately and
record QTLs, their additive effect, and explained variance.
3. Each detected QTL needs to be confirmed with independent
genetic resources such as NILs or HIFs. The effect of the
QTL can be tested in relation to other genomic regions (epistasis). Once the effect of the QTL and its genetic regulatory
mechanism is validated, fine mapping and cloning of the causal
gene is required. When relevant, experiments can be repeated
in different conditions to determine any genotype-by-environment interactions.
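Determining a trait-specific significance threshold by permutation, as mentioned in step 2, can be sketched in Python as follows. The sketch assumes a simple single-marker scan with genotypes coded "A" and "B"; the function names, the default of 1,000 permutations, and the example data are hypothetical, and mapping programs implement their own versions of this test.

```python
import math
import random


def single_marker_lod(trait, calls):
    """LOD score for a difference in trait mean between classes A and B."""
    a = [y for y, g in zip(trait, calls) if g == "A"]
    b = [y for y, g in zip(trait, calls) if g == "B"]
    if not a or not b:
        raise ValueError("both genotype classes must be present")
    values = a + b
    grand_mean = sum(values) / len(values)
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    rss0 = sum((y - grand_mean) ** 2 for y in values)
    rss1 = (sum((y - mean_a) ** 2 for y in a)
            + sum((y - mean_b) ** 2 for y in b))
    if rss0 <= 0.0:
        return 0.0          # constant trait: nothing to explain
    if rss1 <= 0.0:
        return math.inf     # classes separate the trait perfectly
    return max(0.0, 0.5 * len(values) * math.log10(rss0 / rss1))


def genome_scan(trait, markers):
    """LOD score at every marker; markers is a list of genotype strings."""
    return [single_marker_lod(trait, calls) for calls in markers]


def permutation_threshold(trait, markers, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide LOD threshold from shuffled trait values.

    Shuffling the trait over lines breaks any marker-trait association
    while keeping the marker data intact; the (1 - alpha) quantile of the
    genome-wide maximum LOD over all permutations is returned.
    """
    rng = random.Random(seed)
    shuffled = list(trait)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(genome_scan(shuffled, markers)))
    maxima.sort()
    index = min(n_perm - 1, max(0, math.ceil((1.0 - alpha) * n_perm) - 1))
    return maxima[index]


if __name__ == "__main__":
    # Hypothetical data: 20 RILs, two markers, QTL at the first marker.
    trait = [0.1 * i for i in range(10)] + [10 + 0.1 * i for i in range(10)]
    markers = ["A" * 10 + "B" * 10, "AB" * 10]
    print("observed LOD scores:", [round(x, 2) for x in genome_scan(trait, markers)])
    print("5 % threshold:", round(permutation_threshold(trait, markers), 2))
```

Because the threshold depends on the trait distribution, the marker map, and the population size, it has to be recomputed for each trait rather than copied from another analysis.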
4 Notes
1. Seeds from the stock center have most probably been stored for a
long time. So instead of using them directly for your mapping
experiment, it is advisable to first propagate them. This guarantees that all seeds have developed on plants grown in
the same conditions and prevents the detection of differences
caused by seed longevity.
2. The heritability of a trait can be increased by reducing the
variation within genotypes. Analysis of more replicates leads to
such a reduction. Also very controlled growing conditions and
an accurate phenotyping assay help to minimize this residual
variation.
3. For some traits, the cytoplasmic background (chloroplasts and
mitochondria) can be important. It might be helpful to map
QTLs to the cytoplasmic genome. For this, reciprocal crosses
need to be made, and the parental genotype of the cytoplasm
of the resulting progeny is a marker to be included as an extra
linkage group.
4. Sometimes seeds of the F5 or F6 generation are bulked and
genotyped. These generations still contain one to three percent of heterozygosity which can be used to generate heterogeneous inbred families (HIFs). A HIF is derived from a RIL
containing a small heterozygous region in an otherwise homozygous background. The heterozygous regions will segregate
in the next generation resulting in fixed parental genotypes in
half the number of progeny. These fixed lines are very similar
to NILs and can be used to confirm a QTL in that specific
region.
5. Erroneous marker data can inflate genetic maps tremendously,
especially at short distances. Therefore, wrong data are much
worse than missing data, and only high-quality data should be
used for mapping purposes.
6. Segregation distortion of markers can be due to genotyping
errors, which is usually the case when observed for isolated
markers. These markers should be removed from the analyses.
When distortions are caused by genetic incompatibilities, all
markers in LD with the incompatibility locus will show a
skewed segregation. This will result in lower mapping power
and can only be resolved by choosing different parental lines.
7. Depending on the population type, heterozygous regions may
be present in individual lines. In this case, three genotypic
classes occur: homozygous for the paternal allele, homozygous for the maternal allele, and
heterozygous. Most software packages can deal with this and
in addition offer the possibility to estimate dominance and
additive effects.
8. Dominant markers, such as AFLPs, do not allow distinguishing between homozygous and heterozygous loci. If lines carrying heterozygous regions are present in the population and
dominant markers are used for the genotyping, specific software is needed for the analysis.
9. LOD scores are calculated by comparing the likelihood of the data
in the presence of a QTL (H1) with the likelihood in its absence (H0). In short:
LOD = log10(L(data|H1)/L(data|H0)) and D = 2 ×
ln(L(data|H1)/L(data|H0)). LOD can be calculated from D
and vice versa: LOD = 0.217D and D = 4.605LOD.
10. Placing cofactors is a delicate task, because it can easily manipulate or overfit results. It is possible in most programs to use
automatic cofactor selection procedures, in which unbiased
selection of markers is applied.
11. Almost all publicly available mutant lines are in the Columbia
background. Note that the Columbia allele might differ
from the alleles of your parents. It may therefore be necessary to
create a mutant in the desired background, for example by RNAi.
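The small calculations behind Notes 2, 6, and 9 can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the heritability expression assumes the common line-mean form Vg/(Vg + Ve/r), and the segregation test assumes the 1:1 ratio expected for a marker in a RIL population.

```python
import math


def line_mean_heritability(var_genetic, var_residual, n_replicates):
    """Heritability of line means, assuming H2 = Vg / (Vg + Ve / r).

    More replicates (r) shrink the residual term, as described in Note 2.
    """
    return var_genetic / (var_genetic + var_residual / n_replicates)


def segregation_chi_square(n_parent_a, n_parent_b):
    """Chi-square test (1 df) of a marker against an assumed 1:1 segregation (Note 6)."""
    expected = (n_parent_a + n_parent_b) / 2
    chi2 = ((n_parent_a - expected) ** 2 + (n_parent_b - expected) ** 2) / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def lod_from_d(d):
    """Convert the likelihood-ratio statistic D to a LOD score (Note 9)."""
    return d / (2 * math.log(10))  # 1 / (2 ln 10) is about 0.217


def d_from_lod(lod):
    """Convert a LOD score to the likelihood-ratio statistic D (Note 9)."""
    return lod * 2 * math.log(10)  # 2 ln 10 is about 4.605


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(line_mean_heritability(1.0, 1.0, 1), line_mean_heritability(1.0, 1.0, 4))
    print(segregation_chi_square(60, 40))
    print(lod_from_d(4.605), d_from_lod(3.0))
```

Markers that fail the segregation test in isolation are candidates for the genotyping errors mentioned in Note 6, whereas runs of linked markers failing together point to a genetic cause.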
Exploiting Natural Variation
153
References
1. Doebley JF et al (2006) The molecular
genetics of crop domestication. Cell 127:
1309–1321
2. Izawa T et al (2003) Comparative biology
comes into bloom: genomic and genetic
comparison of flowering pathways in rice and
Arabidopsis. Curr Opin Plant Biol 6:
113–120
3. Hoffmann MH (2002) Biogeography of
Arabidopsis
thaliana
(L.)
Heynh.
(Brassicaceae). J Biogeogr 29:125–134
4. Alonso-Blanco C, Koornneef M (2000)
Naturally occurring variation in Arabidopsis:
an underexploited resource for plant genetics.
Trends Plant Sci 5:22–29
5. Kooke R et al (2012) Backcross populations and
near isogenic lines. In: Methods in Molecular
Biology: Quantitative Trait Loci (QTL) Analysis,
Methods and Protocols (S.A. Rifkin ed),
Humana press inc., Totowa, NJ. Methods Mol
Biol 871:3–16
6. Keurentjes JJB et al (2007) Development of a
near-isogenic line population of Arabidopsis
thaliana and comparison of mapping power
with a recombinant inbred line population.
Genetics 175:891–905
7. Törjék O et al (2008) Construction and analysis of 2 reciprocal arabidopsis introgression line
populations. J Hered 99:396–406
8. Ravi M, Chan SW (2010) Haploid plants produced by centromere-mediated genome elimination. Nature 464:615–618
9. Liu SC et al (1996) Genome-wide highresolution mapping by recurrent intermating
using Arabidopsis thaliana as a model. Genetics
142:247–258
10. Kover PX et al (2009) A multiparent advanced
generation inter-cross to fine-map quantitative
traits in Arabidopsis thaliana. PLoS Genet
5:e1000551
11. Huang X et al (2011) Analysis of natural allelic
variation in Arabidopsis using a multiparent
recombinant inbred line population. Proc Natl
Acad Sci 108:4488–4493
12. Balasubramanian S et al (2009) QTL mapping
in new Arabidopsis thaliana advanced
intercross-recombinant inbred lines. PLoS
One 4:e4318
13. Brachi B et al (2010) Linkage and association
mapping of Arabidopsis thaliana flowering
time in nature. PLoS Genet 6:e1000940
14. Bentink L et al (2010) Natural variation for
seed dormancy in Arabidopsis is regulated by
additive genetic and molecular pathways. Proc
Natl Acad Sci 107:4264–4269
15. McMullen MD et al (2009) Genetic properties
of the maize nested association mapping population. Science 325:737–740
16. Nordborg M et al (2002) The extent of linkage disequilibrium in Arabidopsis thaliana.
Nat Genet 30:190–193
17. Kim S et al (2007) Recombination and linkage
disequilibrium in Arabidopsis thaliana. Nat
Genet 39:1151–1155
18. Atwell S et al (2010) Genome-wide association
study of 107 phenotypes in Arabidopsis thaliana inbred lines. Nature 465:627–631
19. Filiault DL, Maloof JN (2012) A genomewide association study identifies variants
underlying the Arabidopsis thaliana shade
avoidance response. PLoS Genet 8:e1002589
20. Weigel D, Mott R (2009) The 1001 genomes
project for Arabidopsis thaliana. Genome Biol
10:107
21. Ooijen JW (1992) Accuracy of mapping quantitative trait loci in autogamous species. Theor
Appl Genet 84:803–811
22. Koornneef M et al (1983) Linkage map of
Arabidopsis-thaliana. J Hered 74:265–272
23. Semagn K et al (2006) An overview of
molecular marker methods for plants. Afr J
Biotechnol 5:2540–2568
24. Borevitz JO et al (2003) Large-scale identification of single-feature polymorphisms in
complex genomes. Genome Res 13:513–523
25. Mardis ER (2008) The impact of nextgeneration sequencing technology on genetics. Trends Genet 24:133–141
26. Meinke DW et al (2003) A sequence-based
map of Arabidopsis genes with mutant phenotypes. Plant Physiol 131:409–418
27. Cornforth TW, Long AD (2003) Inferences
regarding the numbers and locations of QTLs
under multiple-QTL models using interval
mapping and composite interval mapping.
Genet Res 82:139–149
Chapter 7
Grafting in Arabidopsis
Katherine Bainbridge, Tom Bennett, Peter Crisp, Ottoline Leyser,
and Colin Turnbull
Abstract
Grafting provides a simple way to generate chimeric plants with regions of different genotypes and thus to
assess the cell autonomy of gene action. The technique of grafting has been widely used in other species,
but in Arabidopsis, its small size makes the process rather more demanding. However, there are now several well-established grafting procedures available, which we describe here, and their use has already
contributed greatly to understanding of such processes as shoot branching control, flowering, disease
resistance, and systemic silencing.
Key words Arabidopsis thaliana, Grafting, Graft-transmissible signal
1
Introduction
The assessment of the cell autonomy of a signaling molecule or
mutant phenotype can provide highly valuable information
about gene function. This kind of analysis requires the construction of chimeric plants with cells of different genotypes. There are
several ways to achieve this, including the tissue-specific expression
of a wild-type gene in a mutant background [1] and the generation
of sectors of different genotypes following somatic recombination
or chromosome breakage [2] or transposition [3] or site-specific
homologous recombination [4] to remove an insertional mutagen.
These methods are versatile in allowing different amounts and
positions of the tissues of each genotype to be generated. However,
they are all very time consuming, requiring transgenesis and/or
construction of lines of particular genotypes and a system to mark
the different sectors and thus identify their genotypes.
In contrast, grafting is an extremely simple method for making
a chimeric plant. In some ways, it is more restricted in its applications than those mentioned above, because only a limited number
of options are available for connecting tissues of different genotypes. However, the methods are straightforward, do not require
construction of complex transgenics or other genotypes, and
enable an almost infinite number of genotype combinations to be
tested. Grafting experiments are particularly amenable to demonstrating spatial separation of source and target, including genetic
complementation of mutant phenotypes across a graft union, direct
detection of molecules translocated in vascular sap or arriving in
receiving tissue, and/or altered expression of molecular targets
due to signal transmission.
Fig. 1 Details of grafting procedures. Top row shows newly made grafts, from left to right: root-shoot graft
without collar; root-shoot graft without collar, with cotyledons removed; root-shoot wedge graft; and two-shoot
Y-graft. Bottom row shows later stages, from left to right: root-shoot collar graft 10 days after grafting; mature
plant showing dark-colored scar at union; and graft verification by GUS staining of Y-graft where one shoot
carried CaMV35S::GUS gene. Arrow in all pictures shows position of graft union
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_7, © Springer Science+Business Media New York 2014
It is now 20 years since Arabidopsis grafts were first reported
[5]. However, the most commonly adopted methods in recent
years are based on simple root-shoot grafts, performed on young
seedlings, to generate plants where the genotype of the root differs
from that of the shoot [6]. This method, with variations (Fig. 1),
is described below. In addition, it is possible to graft a seedling
shoot into the hypocotyl of a second seedling, a so-called Y-graft,
to generate a plant with two genetically different shoot systems
[6]. There are also reports of success with mature rosette grafts
[7], and there is no reason why other versions should not be equally
successful. To date, Arabidopsis grafting has been reported in relation to a multitude of diverse biological processes including shoot
branching [1, 6, 8, 9], flowering time [7, 10], leaf development
[11], vascular development [12], nutrient transport [13–16], disease resistance [17], small RNA movement [15, 16, 18, 19], systemic silencing [18, 19], and wounding [20], indicating that it is
an approach with wide applicability in this species.
2
Materials
1. Sterilized, cold-treated, good-quality Arabidopsis seed of
appropriate genotypes.
2. 0.3 mm internal diameter silicone tubing (e.g., SF Medical,
Cat No. SMF3-1050, available through VWR International),
cut into 2–3 cm sections and autoclaved (see Note 1).
3. Razor blades or No. 15 scalpel blades (see Note 2).
4. Microsurgery knife: No. 15 disposable stab knife (e.g., Fine
Science Tools, cat. No. 10315-12).
5. Fine forceps.
6. 10 cm square Petri dishes.
7. ATS (Arabidopsis thaliana salts [21]) or half-strength
Murashige-Skoog salts [22] or equivalent, agar (0.8 %) or gellan gum (e.g., Phytagel, Gelrite) type gel (0.6 %), and sucrose
(1 %).
8. Dissecting microscope.
9. 22/18 °C growth cabinet.
10. 27 °C growth cabinet.
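As a worked example of item 7, the percentages can be converted to weights for a given batch of medium. The sketch below assumes the percentages are weight per volume (1 % = 1 g per 100 mL), which the text does not state explicitly; the batch volume and the function name are hypothetical.

```python
def grams_needed(percent_w_v, volume_ml):
    """Grams of solid for a given % (w/v), assuming 1 % = 1 g per 100 mL."""
    return percent_w_v * volume_ml / 100


# Percentages taken from item 7; use either agar or gellan gum, not both.
recipe_percent = {
    "agar": 0.8,
    "gellan gum (alternative to agar)": 0.6,
    "sucrose": 1.0,
}

volume_ml = 500  # hypothetical batch size

for component, percent in recipe_percent.items():
    print(f"{component}: {grams_needed(percent, volume_ml):.1f} g per {volume_ml} mL")
```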
3
Methods
3.1 Root-Shoot Grafts
1. Under sterile conditions, sow the seed onto square Petri dishes
containing ATS-agar (or equivalent), with a spacing of
7–10 mm between seeds. Place the sealed plates vertically in a
growth cabinet under standard axenic growth conditions (see
Note 3). Leave the seedlings to germinate and grow for 3 days.
2. After 3 days, move the seedlings to a growth cabinet set at
27 °C (see Note 3) for a further 2 days.
3. Cut the sterile silicone tubing into lengths of roughly 2 mm
(see Note 4).
4. Under sterile conditions, grafting can now be performed. Cut
selected seedlings transversely across the hypocotyl (see Note 5)
while on the agar plates. The root should not be disturbed and
should essentially remain in place. Remove the apical part of the
seedling and place a collar (see Note 4) over the cut hypocotyl
of the rootstock. The top of the rootstock should be about
halfway along the length of the collar. Feed the hypocotyl of a
suitably excised scion (see Note 5) into the collar such that the
base of the scion meets the rootstock. Thus a whole seedling is
reconstituted. Note that as well as the reciprocal genotype
combinations, it is necessary to include appropriate controls in
which self-grafted plants are used to reconstruct the original
genotypes, to ensure that the grafting process itself does not
affect the phenotype of interest.
5. Using a dissecting microscope, inspect the graft junctions.
The two graft parts should be in contact across the whole of
the grafting surface with no gaps. If this is not the case, the
scion should be pushed further into the collar until it does
meet the rootstock. As the success rate of the protocol is
50–70 %, it is recommended to graft twice as many seedlings
as needed for the experiment.
6. When grafting is complete, if suitably moist (see Note 6),
return the plates to the 27 °C growth cabinet for 3–4 days.
7. After this time, grafts can be assessed for healing using a dissecting microscope (see Note 7). Transfer successful grafts to
soil (see Note 8) and use a propagator lid to keep humid for
about a week.
8. At an appropriate time thereafter, plants can be phenotypically
assessed. When appropriate phenotypic data have been
recorded, the plants can be assessed for graft integrity, thus
allowing confirmation of the validity of the results (see Note 9).
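The advice in step 5 to graft twice as many seedlings as needed can be checked with a simple binomial calculation. The sketch below assumes that each graft succeeds independently with the same probability, which is a simplification; the function names and the confidence level are hypothetical and not part of the protocol.

```python
from math import comb


def prob_at_least(n_needed, n_grafted, p_success):
    """Probability of obtaining at least n_needed successful grafts.

    Assumes independent grafts with a common success probability.
    """
    return sum(
        comb(n_grafted, k) * p_success**k * (1 - p_success) ** (n_grafted - k)
        for k in range(n_needed, n_grafted + 1)
    )


def grafts_required(n_needed, p_success, confidence=0.95):
    """Smallest number of grafts giving n_needed successes with the stated confidence."""
    n_grafted = n_needed
    while prob_at_least(n_needed, n_grafted, p_success) < confidence:
        n_grafted += 1
    return n_grafted


if __name__ == "__main__":
    # Hypothetical experiment needing 10 grafted plants per combination.
    for p in (0.5, 0.7):
        print(p, prob_at_least(10, 20, p), grafts_required(10, p))
```

Under these assumptions, doubling the number of grafts meets the target on average, but at a 50 % success rate it gives only about a 59 % chance of obtaining 10 successful grafts from 20 attempts, so a larger margin may be worthwhile for critical combinations.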
3.2 Wedge Grafts and Y-Grafts
3.2.1 Single Wedge Graft
Instead of cutting the hypocotyl transversely, grafting can also be
achieved with V-shaped “wedge-slit” connections. These are similar to many horticultural graft types. Precise cuts are essential and
are best made under well-lit dissecting microscope conditions;
magnification of 5× to 40× is ideal.
1. Make the rootstock by cutting the hypocotyl transversely (with a
razor blade or No. 15 scalpel blade) about one-quarter of the way down from the
top, then slit down the middle of the hypocotyl with the microsurgery
knife (see Note 2).
2. Make the scion by cutting a very shallow-angled V shape with
microsurgery knife. The first cut should extend more than
halfway across the hypocotyl, but do not sever the root completely; otherwise the shoot moves around a lot when making the
second cut. This second cut should result in a symmetrical wedge.
3. Push the scion wedge gently into the slit (which should be
the same length as the wedge) in the rootstock (Fig. 1). Tissue
elasticity and surface tension will keep these grafts together
without the aid of a collar.
Some practice is needed for these cutting procedures—mainly
to achieve a very fine sawing action with the knife, rather than
pushing down with large strokes.
3.2.2 Two-Shoot Y-Graft
This is a modification of the procedure above: a wedge-shaped scion is connected
into a cut in the side of an otherwise intact rootstock plant, to generate a graft with two shoots on a single root system (Fig. 1). The
rootstock plant keeps its roots. Y-grafts can be easier to cut and
assemble if hypocotyls are curved: rotate pairs of vertical plates 60°
left and right 1 day before grafting. The two shoots are then
aligned with curves facing away from each other. It is often also
necessary to trim off the majority of one cotyledon on each shoot,
to allow the two shoots to sit close together:
1. Make a shallow-angled slit into the side of the hypocotyl, starting
about one-third of the way from the top and extending no
more than halfway across the diameter so that the central vascular tissue is penetrated but not severed.
2. Make a wedge-shaped scion as above (Subheading 3.2.1).
3. Assemble by aligning the shoots as well as possible for maximum contact area.
4
Notes
1. Collars are used to support the graft and hold the rootstock
and scion together during graft healing. We have found that
they increase the proportion of successful grafts. However, it
is possible to perform hypocotyl grafting without collars.
Although this is a less efficient process, it allows greater flexibility. The protocol is essentially the same as grafting with collars, with only slight alteration. For single grafts, a normal
transverse cut can be used, but a “slit and wedge” graft (see
Subheading 3.2.1) can give better results, since it holds the
scion and rootstock together more effectively. It is also possible to remove completely both cotyledons prior to grafting
(Fig. 1). This facilitates alignment of scion and rootstock lying
flat on the media, does not require use of collars, and does not
appear to reduce success rates. Another major advantage of
collarless grafting is the ability to perform two-shoot
“Y”-grafts, to test shoot-to-shoot signaling, which is not possible when a collar is used (see Subheading 3.2.2).
Grafting can also be performed on short-day grown seedlings. The seedlings should be grown at a constant 23 °C
(~100 μmol/m2/s) for 7–9 days and then grafted. After grafting, they should be returned to this temperature regime for at
least 1 week but up to 6 weeks, at which point successfully
grafted plants can be transferred to soil.
2. The razor blades should have very fine edges in order to make
clean cuts and avoid squashing the hypocotyl. Standard industrial razor blades are not appropriate. Number 15 scalpel
blades may be used, but we find the best results are given by
Wilkinson Sword “Classic” double-sided razor blades (or
equivalent). The razor blades must be sharp at all times, and so
should be changed frequently. A different blade should be
used for cutting the collars (see Note 4). For wedge-shaped
single and Y-graft connections, a disposable microsurgery
knife is ideal because of its thin and ultrasharp blade (but care
is needed to avoid damage to the delicate cutting edge).
3. For the initial 3-day period, a standard regime of 16 h light/8 h
dark, 22/18 °C, and 100 μmol/m2/s should be used. For the
second 2 days and the graft-healing period, a regime of 16/8 h
light/dark, constant 27 °C temperature, and 60 μmol/m2/s
should ideally be used. Growing the seedlings at 27 °C
increases the levels of endogenous auxin in the plant, which in
the first instance increases hypocotyl length [23], allowing
easier grafting, and in the second instance promotes callus formation and healing. The reduction in light intensity reduces
twisting of the hypocotyls. Such twisting makes grafting more
difficult and disrupts graft healing.
4. The collars used to hold grafts together are made from sterile
0.3 mm i.d. silicone tubing by slicing the tubing into ~2 mm
sections. Difficulties will be experienced in fitting the rootstock and scion together if the collars are too long. The collars
can also be slit longitudinally before use, which allows the collar to open up as the plant grows, or for the collar to be
removed after the graft has healed fully. A pointed scalpel
blade (e.g., No. 11) is best for this, and slitting can be facilitated by first inserting the point of the scalpel into uncut 3 cm
lengths of tubing then pulling the tubing over the blade cutting surface using fine forceps.
5. There are two key points to assembling successful grafts. The
first is to select the most appropriate seedlings on each plate.
For most situations, the selected seedlings should have long,
straight hypocotyls and strong root growth. There is a small
range of hypocotyl thicknesses that can be used. It is often difficult to distinguish which seedlings have the correct dimensions; trial and error is required to some degree. Seedlings
with hypocotyls that do not fit into the collars easily should be
discarded, as forcing them in will damage the seedling.
Similarly, seedlings that are a very loose fit in the collar should
also be discarded as the graft will not be held together effectively. If seedlings of the correct size are used, the graft should
fit effortlessly together.
The second key element is in the cutting of the hypocotyl.
Cuts should be as clean and straight as possible. The hypocotyl
should not be squashed during cutting, and it should not be
necessary to cut into the agar to cut through the seedling.
These problems can be avoided by use of a new blade. A sharp
razor should slice through the hypocotyl with almost no resistance. In addition, preventing seedlings sinking into the gel
can be achieved by using double strength gel in the media
and/or growing the seedlings on a “raft” of cellulose membrane filter (Millipore type) on the surface of the gel. Initially,
it may take some practice to be able to cut the hypocotyls in
the correct way, and it is advisable to have a few hours’ experience on other seedlings before attempting grafting itself.
It is also important to cut the seedlings in the correct place.
Best results appear to be produced if the rootstock donor is
cut three-quarters of the way up the hypocotyl, and the scion
donor is cut halfway up the hypocotyl. In this case, both the
root of the scion donor and the shoot of the rootstock donor
cannot be used for further grafts and should be discarded. It is
possible to use all excised parts by cutting all seedlings halfway
up the hypocotyl and simply swapping scions between rootstocks, but this may increase the risk of adventitious rooting
and make insertion into the tubing more difficult.
6. Plates used for grafting should be as moist as possible at all
times, since high humidity aids the graft-healing process. It
may, however, be necessary to remove excess surface water
before grafting. If this is the case, or if the plants appear to be
drying out (e.g., indicated by dull, soft, or wilting cotyledons),
a small amount of sterile water can be added to the plates as
needed during grafting and before they are sealed up at the
end of the procedure.
7. Only truly grafted seedlings should be used; otherwise results
may be erroneous. This can only be shown definitively when
the plants are harvested (see Note 9), at which stage it is generally obvious if a graft has succeeded. Visual inspection using
a dissecting microscope should show if the scion and rootstock
have fused. However, if further confirmation is needed, a very
light pull of the scion with forceps will determine whether the
graft has united. Grafts are often connected by 4 days but
obviously strengthen further with time. Normally transfer of
successful grafts can be done 6–7 days after grafting or a little
longer for Y-grafts which need to be stronger.
Usually a proportion of the scions will have produced
adventitious roots from hypocotyl tissue within the collar,
which displace the rootstocks. These seedlings should clearly
be discarded. Scions which produce adventitious roots above
the level of the collar, but which have also joined to the rootstock, can theoretically be used, as long as the adventitious
roots are excised. However, adventitious root formation is
often a sign of poor graft connection, so rescuing grafts by
root excision may be futile.
8. Transfer plants to soil as soon as shoot (and root) growth
seems reestablished, usually 6–7 days after grafting. To minimize stress, keep everything wet during transfer—add extra
water to plates, saturate potting mix, spray plants with fine mister, and cover the tray as soon as it has been filled with plants.
Pick plants off plates carefully—“hook up” with fine forceps or
grab edge of cotyledon. With Y-grafts, be careful not to bend
the graft union—it will probably break. Drop roots into a prebored hole and gently push potting mix across to hold roots in
place. Do not bury the graft union; otherwise it is hard to
inspect and adventitious rooting will be promoted.
Keep tray vents closed for the first 3 days or so, then open
vents for another 3 days. Remove lid after about a week. Keep
growth cabinet humidity high if possible. Often a few casualties are seen soon after the lid is removed—these have poor
root systems (poor grafts or adventitious root removal was too
much for them).
9. Confirming that the plants have grafted successfully, and can
therefore be included in the dataset, is normally a destructive
process and is thus best performed after phenotypic assessment. Plants should be removed from the growth medium
intact and the graft union found. Often the silicone collar is
split by the broadening of the stem and may be absent, but the
union is usually identifiable by the clear scarring at the site
(Fig. 1). Depending on the nature of the experiment, either
the majority or all of the root tissue must originate beneath
the level of the union. Otherwise, the plants are essentially in
an “ungrafted” state.
Use of a GUS reporter gene can aid in the verification of
graft integrity. If one of the genotypes of plant carries a broadly
expressed promoter-GUS transgene (e.g., CaMV 35S::GUS;
Fig. 1), then it is possible to use GUS activity to verify the
correctly grafted plants and also to identify adventitious roots
of the “wrong” genotype.
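The timings given in Subheading 3.1 and in Notes 3, 7, and 8 can be laid out as a calendar for planning. The sketch below uses the earliest day of each stated range (3 days of standard growth, 2 days at 27 °C, healing check 3 days after grafting, transfer to soil 6 days after grafting, lid removal about a week later); the function and key names are hypothetical, and the later end of each range may suit a given experiment better.

```python
from datetime import date, timedelta


def grafting_schedule(sowing_day):
    """Earliest dates for each stage of a root-shoot graft, counted from sowing."""
    move_to_27c = sowing_day + timedelta(days=3)   # after 3 days at 22/18 °C
    graft = move_to_27c + timedelta(days=2)        # after a further 2 days at 27 °C
    return {
        "sow": sowing_day,
        "move_to_27C": move_to_27c,
        "graft": graft,
        "assess_healing": graft + timedelta(days=3),     # 3-4 days after grafting
        "transfer_to_soil": graft + timedelta(days=6),   # 6-7 days after grafting
        "remove_lid": graft + timedelta(days=13),        # about a week after transfer
    }


if __name__ == "__main__":
    # Hypothetical sowing date, for illustration only.
    for stage, day in grafting_schedule(date(2024, 1, 1)).items():
        print(f"{stage}: {day.isoformat()}")
```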
References
1. Booker JP, Chatfield SP, Leyser O (2003)
Auxin acts in xylem-associated or medullary
cells to mediate apical dominance. Plant Cell
15:495–507
2. Furner IJ et al (1996) Clonal analysis of the
late flowering fca mutant of Arabidopsis thaliana: Cell fate and cell autonomy. Development
122:1041–1050
3. Jenik PD, Irish VF (2000) Regulation of cell
proliferation patterns by homeotic genes during
Arabidopsis floral development.
Development 127:1267–1276
4. Woodrick R et al (2000) Arabidopsis embryonic shoot fate map. Development 127:813–820
5. Rhee SY, Somerville CR (1995) Flat-surface
grafting in Arabidopsis thaliana. Plant Mol Biol
Rep 13:118–123
6. Turnbull CGN, Booker JP, Leyser HMO
(2002) Micrografting techniques for testing
long-distance signalling in Arabidopsis. Plant J
32:255–262
7. Ayre BG, Turgeon R (2004) Graft transmission of a floral stimulant derived from
CONSTANS. Plant Physiol 135:2271–2278
8. Sorefan K et al (2003) MAX4 and RMS1 are
orthologous dioxygenase-like genes that regulate shoot branching in Arabidopsis and pea.
Genes Dev 17:1469–1474
9. Booker J et al (2004) MAX3/CCD7 is a
carotenoid cleavage dioxygenase required for
the synthesis of a novel plant signaling molecule. Curr Biol 14:1232–1238
10. An HL et al (2004) CONSTANS acts in the
phloem to regulate a systemic signal that
induces photoperiodic flowering of
Arabidopsis. Development 131:3615–3626
11. Van Norman JM, Frederick RL, Sieburth LE
(2004) BYPASS1 negatively regulates a root-derived signal that controls plant architecture.
Curr Biol 14:1739–1746
12. Ragni L et al (2011) Mobile gibberellin
directly stimulates Arabidopsis hypocotyl
xylem expansion. Plant Cell 23:1322–1336
13. Green LS, Rogers EE (2004) FRD3 controls
iron localization in Arabidopsis. Plant Physiol
136:2523–2531
14. Widiez T et al (2011) HIGH NITROGEN
INSENSITIVE 9 (HNI9)-mediated systemic
repression of root NO3− uptake is associated
with changes in histone methylation. Proc Natl
Acad Sci USA 108:13329–13334
15. Lin SI et al (2008) Regulatory network of
microRNA399 and PHO2 by systemic signaling. Plant Physiol 147:732–746
16. Pant BD et al (2008) MicroRNA399 is a long-distance signal for the regulation of plant
phosphate homeostasis. Plant J 53:731–738
17. Xia YJ et al (2004) An extracellular aspartic
protease functions in Arabidopsis disease resistance signaling. EMBO J 23:980–988
18. Brosnan CA et al (2007) Nuclear gene silencing directs reception of long-distance mRNA
silencing in Arabidopsis. Proc Natl Acad Sci
USA 104:14741–14746
19. Melnyk CW et al (2011) Mobile 24 nt small
RNAs direct transcriptional gene silencing in
the root meristems of Arabidopsis thaliana.
Curr Biol 21:1678–1683
20. Mugford S et al (2007) The Arabidopsis transmissible wound signal. Comp Biochem Physiol
Part A Mol Integr Physiol 146:S242
21. Wilson AK et al (1990) A dominant mutation in
Arabidopsis confers resistance to auxin, ethylene
and abscisic acid. Mol Gen Genet 222:377–383
22. Murashige T, Skoog F (1962) A revised medium
for rapid growth and bioassays with tobacco tissue cultures. Physiol Plantarum 15:473–497
23. Gray WM et al (1998) High temperature promotes auxin-mediated hypocotyl elongation in
Arabidopsis. Proc Natl Acad Sci USA
95:7197–7202
Chapter 8
Agrobacterium tumefaciens-Mediated Transient
Transformation of Arabidopsis thaliana Leaves
Silvina Mangano, Cintia Daniela Gonzalez, and Silvana Petruccelli
Abstract
Transient assays provide a convenient alternative to stable transformation. Compared to the generation of
stably transformed plants, agroinfiltration is more rapid, and samples can be analyzed a few days after inoculation. Nevertheless, unlike tobacco and other plant species, Arabidopsis thaliana remains recalcitrant to routine transient assays. In this chapter, we describe a transient expression assay using simple
infiltration of intact Arabidopsis leaves with Agrobacterium tumefaciens carrying a plasmid expressing a
reporter fluorescent protein. In this protocol, Agrobacterium aggressiveness was increased by a prolonged
treatment in an induction medium deficient in nutrients and containing acetosyringone. In addition, Arabidopsis
plants were cultivated in intermediate photoperiod (12 h light–12 h dark) to promote leaf growth.
Key words Transient gene expression, Arabidopsis thaliana, Agrobacterium tumefaciens, Leaf
agroinfiltration, Fluorescent proteins
1
Introduction
Stable transgenic Arabidopsis plants offer advantages in terms of a sustainable supply of plant material with homologous protein expression, the potential for mutant complementation, and a global
examination option throughout all tissues and cell types. Although
the often used floral dip procedure [1] generates transgenic
Arabidopsis plants with minimal labor, plants must still be grown
to maturity over several weeks. The need to harvest seed and perform selection also makes it impractical to test large numbers of
different transgene constructs. Moreover, transgene expression in
some cases could interfere with normal plant growth and development due to an overdose of the functional proteins or dominant
negative effect of nonfunctional products. Transient gene expression provides a convenient alternative to stable transformation in
analyzing gene function by virtue of its time and labor efficiency.
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_8, © Springer Science+Business Media New York 2014
It only takes one to several days to perform the assay in its entirety,
which allows many constructs to be assayed in parallel within a
short time and dramatically speeds up the pace of research.
Transient infiltration assays with Agrobacterium carrying a
construct of interest are a powerful tool to gain insight into gene
function, protein-protein interaction analysis, and promoter analysis [2–4]. Agrobacterium-mediated transient transformation is an
easy, routine, and consistent operation in Nicotiana benthamiana
leaves [2], and the procedure has also been adjusted to lettuce and
tomato leaves [3, 5], as well as tomato fruits [6], roots [7],
Antirrhinum floral tissues [8], and whole seedlings [9]. Unlike tobacco and other plant species, Arabidopsis remains
recalcitrant to routine transient assays, and high transient expression levels are obtained only in some ecotypes [3, 9–12]. However,
when used as a heterologous system to express genes from the
model species Arabidopsis, tobacco may not reflect the native activity or subcellular distribution of the corresponding proteins [10].
Pilot efforts to explore an Arabidopsis equivalent of tobacco leaf
infiltration have demonstrated low-frequency success with great
variation [3, 4, 13, 14].
Efforts to increase the frequency of Arabidopsis transient transformation success and also to decrease variation using young seedlings [10], as well as transient transformation of root epidermal
cells by cocultivation with Agrobacterium rhizogenes [15], have
been described. Difficulties in Arabidopsis transient transformation
have been attributed to plant immune responses triggered by perception of Agrobacterium [12]. Using transgenic Arabidopsis
expressing AvrPto (a suppressor of plant immunity from
Pseudomonas syringae) under the control of a dexamethasone
inducible promoter, an efficient Agrobacterium-mediated transient
transformation method of Arabidopsis has been developed [12].
Nevertheless, this assay is limited to the use of transgenic plants
expressing AvrPto.
In this chapter, we describe a transient expression assay using
simple infiltration of intact Arabidopsis leaves with Agrobacterium
tumefaciens GV3101 cells carrying appropriate plasmid constructs.
This protocol increases Agrobacterium aggressiveness through a prolonged treatment with acetosyringone (AS) in a nutrient-poor medium (the induction medium). In addition, the bacterial density used is higher than that used to
infiltrate Nicotiana benthamiana leaves. Finally, Arabidopsis growth conditions are controlled in order to obtain healthy plants with
an adequate leaf size to facilitate infiltration. We show that a
fluorescent reporter gene is easily introduced into Arabidopsis leaves
and that most of the epidermal cells show fluorescence when examined by fluorescence microscopy and confocal laser scanning microscopy
(CLSM).
Transient Transformation of Arabidopsis Leaves
2 Materials
1. Seeds of Arabidopsis thaliana Columbia (Arabidopsis Stock
Center).
2. Pots (≈6 cm in diameter), compost, and perlite.
3. Agrobacterium tumefaciens GV3101 (a strain derived from the nopaline-type strain C58 that carries a rifampicin resistance gene on the chromosome and the disarmed helper plasmid pMP90 (pTiC58ΔT-DNA), which carries a gentamicin resistance gene) [16].
4. Binary vector carrying the gene of interest (Gi) (e.g., cloned
into pGWB, a group of vectors designed to facilitate fusions to
different reporter proteins and also purification and detection
tags [17]).
5. Kanamycin (Sigma-Aldrich) 1,000× stock solution: 100 mg/mL in water.
6. Gentamicin (Sigma-Aldrich) 1,000× stock solution: 30 mg/mL in water.
7. Rifampicin (Sigma-Aldrich) 1,000× stock solution: 10 mg/mL in methanol.
8. Bacterial culture medium: YEB (yeast extract and beef) medium
(Sigma-Aldrich). Add 18 g/l agar–agar for solid medium.
9. Glycerol solutions: 10 % and 80 % v/v in water.
10. Induction medium: 0.1 % (NH4)2SO4, 0.45 % KH2PO4, 1 %
K2HPO4, 0.05 % sodium citrate, 0.2 % sucrose, 0.5 % glycerol,
1 mM MgSO4, pH 5.7.
11. Infiltration medium: 10 mM MES (Sigma-Aldrich), 10 mM MgSO4, pH 5.7.
12. Acetosyringone (Sigma-Aldrich): 200 mM in dimethyl sulfoxide (DMSO).
13. Perfluorodecalin 95 % (Sigma-Aldrich).
14. Syringes 1 mL.
15. Shaker.
16. Spectrophotometer.
17. Refrigerated centrifuge.
18. Gene Pulser II with the Capacitance Extender (Bio-Rad).
19. Microcentrifuge.
20. Fluorescence stereomicroscope equipped with GFP Plant
(excitation 470/40 nm, emission 525/50 nm) and DsRed
(excitation 545/30 nm, emission 620/60 nm) filter sets and a
CCD camera.
21. Confocal laser scanning microscope with a 63× (NA 1.4) oil
immersion objective.
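The 1,000× antibiotic stocks and the 200 mM acetosyringone stock listed above are diluted into working media later in the protocol. The following is a minimal Python sketch of that arithmetic. The function name, the dictionaries, and the 5 mL example volume are illustrative assumptions and are not part of the original protocol.

```python
def stock_volume_ul(final_conc, stock_conc, final_volume_ml):
    """Return the stock volume (µL) needed to reach final_conc in final_volume_ml.

    final_conc and stock_conc must be in the same unit (µg/mL or µM).
    The volume added by the stock itself is neglected, which is reasonable
    for dilutions of about 1,000-fold or more.
    """
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    return final_conc / stock_conc * final_volume_ml * 1000.0


# Stock concentrations taken from the Materials list.
STOCKS = {
    "kanamycin": 100_000.0,       # 100 mg/mL, expressed in µg/mL
    "gentamicin": 30_000.0,       # 30 mg/mL, expressed in µg/mL
    "rifampicin": 10_000.0,       # 10 mg/mL, expressed in µg/mL
    "acetosyringone": 200_000.0,  # 200 mM, expressed in µM
}

# Working concentrations used in the Methods (µg/mL; µM for acetosyringone).
WORKING = {
    "kanamycin": 100.0,
    "gentamicin": 30.0,
    "rifampicin": 10.0,
    "acetosyringone": 200.0,
}

if __name__ == "__main__":
    culture_ml = 5.0  # hypothetical culture volume
    for name, working in WORKING.items():
        volume = stock_volume_ul(working, STOCKS[name], culture_ml)
        print(f"{name}: add {volume:.1f} µL of stock to {culture_ml} mL")
```

Because every stock is 1,000× (or 1,000× and 2,000× for acetosyringone at 200 μM and 100 μM), the same function covers all of the additions in Subheadings 3.2 and 3.3.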
3 Methods
3.1 Growing Arabidopsis Plants
1. Fill the 6 cm pots with a mix of compost and perlite (3:1), compress very lightly to give a firm bed, and water.
2. Sow the seeds onto the surface of the compost/perlite mix by
scattering them carefully.
3. Place the pots in a tray, cover with transparent PVC film to keep
them in a high-humidity environment, and transfer to 4 °C for 2–3
days in the dark.
4. Transfer the pots to a growth room under 90 μE light with a
12 h light/12 h dark cycle at 22–24 °C (see Note 1).
5. After 4 weeks, Arabidopsis plants are generally in good
condition for the transient expression assay (Fig. 1a) (see Note 2).
3.2 Transformation of A. tumefaciens with Binary Plasmid DNA by Electroporation
3.2.1 Preparation of Competent Cells of Agrobacterium
1. Pick a single colony of A. tumefaciens GV3101 and inoculate 3 mL of YEB with gentamicin 30 μg/mL and rifampicin
10 μg/mL in a 15 mL sterile tube. Grow at 28 °C overnight
in a shaker at 200 rpm in the dark.
2. Inoculate 500 mL flasks each containing 100 mL of YEB with
0.5 mL (1/200 volume) of the overnight culture and grow at
28 °C with vigorous shaking until the culture reaches an OD600nm of 0.5–0.6. It takes
~4–5 h to get the cells to this stage.
3. Spin 5 min at 5,000 × g at 4 °C. Pour off supernatant.
4. Resuspend cells in 50 mL (~1/2 volume) ice-cold 10 % glycerol. Repeat spin.
5. Resuspend cells in 25 mL of ice-cold 10 % glycerol. Repeat spin.
6. Resuspend cells in 12 mL of ice-cold 10 % glycerol. Repeat spin.
7. Resuspend final pellet in 1.5 mL ice-cold 10 % glycerol.
Fig. 1 (a) Arabidopsis 4-week-old plants. (b) Using a yellow tip, create small holes in the leaves. (c) Press the
nozzle of a 1 mL syringe against the lower (abaxial) epidermis of Arabidopsis leaf
8. Dispense 100 μL aliquots into fifteen 1.5 mL microfuge tubes
pre-chilled on ice. Each tube will have enough cells for 2
transformations.
9. Quick-freeze the tubes in liquid nitrogen and store at −80 °C.
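The volumes in the competent cell preparation imply a concentration factor and a fixed number of aliquots per 100 mL culture. The short Python sketch below makes that bookkeeping explicit. The function name and its defaults are illustrative assumptions based on the volumes stated in the steps above.

```python
def competent_cell_yield(culture_ml=100.0, final_ml=1.5, aliquot_ul=100.0,
                         transformations_per_aliquot=2):
    """Summarize the yield of one competent cell preparation.

    Defaults follow the protocol: 100 mL of culture concentrated into 1.5 mL
    of 10 % glycerol, dispensed as 100 µL aliquots, two transformations each.
    """
    concentration_factor = culture_ml / final_ml
    # Work in whole microliters to avoid floating-point rounding surprises.
    n_aliquots = int(round(final_ml * 1000.0)) // int(round(aliquot_ul))
    return {
        "concentration_factor": concentration_factor,
        "aliquots": n_aliquots,
        "transformations": n_aliquots * transformations_per_aliquot,
    }


if __name__ == "__main__":
    print(competent_cell_yield())
```

With the default values, one flask yields fifteen aliquots, which matches the fifteen tubes called for in step 8.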
3.2.2 Electroporation
1. Remove one tube of competent cells from the freezer and
place it on ice. Allow to thaw slowly on ice.
2. Add 1–2 μL of DNA (50–100 ng in water) and wait for 1 min.
3. Transfer cells plus DNA to pre-chilled (on ice) electroporation
cuvettes with either 1 or 2 mm gap sizes. Make sure the white
cuvette holder from the Bio-Rad Gene Pulser II is also prechilled on ice.
4. Take the ice bucket with the cuvettes and cuvette holder to the
Gene Pulser. For cuvettes with a 2 mm gap size, adjust the
Gene Pulser II unit “Set Volts” setting to 2.5 kV and the
capacitance setting to 25 μFD. Set the resistance to 200 Ω on
the Pulse Controller Unit.
5. Place the cuvette in the cuvette holder, slide down to engage
the electrodes, and push both buttons on the Gene Pulser,
holding them until the tone sounds.
6. Add 500 μL of YEB medium directly to the cuvette immediately after the pulse and incubate in a shaker at 200 rpm and
28 °C overnight.
7. Plate 100–200 μL on selective media (i.e., antibiotic selection
for both the bacterial host strain and the plasmid).
8. Incubate the plates for 2 days at 28 °C, after which colonies should be
visible.
9. Check for the presence of the introduced vector by colony PCR
(see Note 3).
10. Grow a single colony in 5 mL YEB with gentamicin (30 μg/mL), rifampicin (10 μg/mL), and kanamycin (100 μg/mL) in
the dark at 28 °C and 200 rpm (see Note 4).
11. Store as glycerol stock (800 μL of fresh overnight
culture + 200 μL sterile 80 % glycerol) at −80 °C (see Note 5).
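The electroporation settings in step 4 can be translated into a field strength and a nominal pulse time constant. The Python sketch below does this under the assumption that the sample resistance is much higher than the 200 Ω set on the Pulse Controller, so that the time constant is approximately R × C; the value displayed by the instrument after a pulse is usually somewhat lower. The function names are illustrative.

```python
def field_strength_kv_per_cm(voltage_kv, gap_mm):
    """Field strength across the cuvette gap in kV/cm."""
    if gap_mm <= 0:
        raise ValueError("gap must be positive")
    return voltage_kv / (gap_mm / 10.0)


def nominal_time_constant_ms(resistance_ohm, capacitance_uf):
    """Nominal time constant tau = R x C.

    Ohm x microfarad gives microseconds, so divide by 1000 to obtain ms.
    This is an upper bound that ignores the resistance of the sample itself.
    """
    return resistance_ohm * capacitance_uf / 1000.0


if __name__ == "__main__":
    # Settings from step 4 for a 2 mm cuvette.
    print(field_strength_kv_per_cm(2.5, 2.0))   # kV/cm
    print(nominal_time_constant_ms(200.0, 25.0))  # ms
```

For the 2 mm cuvette settings given above, this corresponds to 12.5 kV/cm and a nominal time constant of 5 ms. The protocol does not state settings for 1 mm cuvettes, so none are assumed here.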
3.3 Growing Agrobacterium for Infiltration
1. Plate 100–200 μL of a glycerol stock on YEB medium with
30 μg/mL gentamicin, 10 μg/mL rifampicin, and 100 µg/mL
kanamycin (if the Gi is in a kanamycin resistance binary vector
such as pGWB [17]). After incubation at 28 °C, pick a single
colony of the Agrobacterium tumefaciens GV3101 containing
the plasmid of interest and inoculate 5 mL of YEB with antibiotics. Grow at 28 °C overnight in a shaker at 200 rpm in the dark.
2. Dilute the overnight culture in YEB with antibiotics to reach
an OD600nm of approximately 0.3 and add acetosyringone to 100 μM for virulence gene induction. Incubate at
28 °C and 200 rpm until the culture reaches an OD600nm of 0.6.
3. Spin the culture at 5,000 × g for 5 min.
4. Resuspend in 5 mL induction medium supplemented with
antibiotics and acetosyringone at 200 μM. Incubate at 30 °C
and 200 rpm for 3–4 h.
5. Pellet the culture at 5,000 × g for 5 min in a microcentrifuge at
room temperature.
6. Resuspend the pellet in 5 mL of infiltration medium and centrifuge as above. Repeat once.
7. Dilute the bacterial suspension with infiltration medium
supplemented with acetosyringone at 200 μM to adjust the
inoculum to an appropriate concentration (see Note 6).
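The dilution steps above (to an OD600nm of about 0.3 before induction, and to the final inoculum density) follow the usual C1V1 = C2V2 relation. The Python sketch below is one way to compute the volumes and to check an inoculum against the density ranges given in Note 6. It assumes that OD600nm is proportional to cell density, which holds only approximately for dense cultures. The function names and the example readings are illustrative assumptions.

```python
def dilute_to_od(od_measured, od_target, final_volume_ml):
    """Return (culture_ml, medium_ml) needed to reach od_target.

    Assumes OD600 scales linearly with cell density (C1 * V1 = C2 * V2).
    """
    if od_measured <= 0 or od_target <= 0:
        raise ValueError("OD values must be positive")
    if od_target > od_measured:
        raise ValueError("culture is below the target OD; concentrate it instead")
    culture_ml = od_target / od_measured * final_volume_ml
    return culture_ml, final_volume_ml - culture_ml


def assess_inoculum(od600):
    """Classify an inoculum density using the ranges stated in Note 6."""
    if od600 < 0.1:
        return "weak expression expected"
    if od600 > 1.0:
        return "risk of yellowing or wilting"
    if 0.4 <= od600 <= 0.6:
        return "recommended range"
    return "usable, outside the recommended range"


if __name__ == "__main__":
    # Hypothetical overnight culture reading of 2.4, diluted to 0.3 in 20 mL.
    culture_ml, medium_ml = dilute_to_od(2.4, 0.3, 20.0)
    print(f"{culture_ml:.2f} mL culture + {medium_ml:.2f} mL medium")
    print(assess_inoculum(0.5))
```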
3.4 Transient Gene Expression
1. Agroinfiltration is conducted by infiltrating the agrobacterial
suspension into the abaxial surface of fingernail-sized leaves
attached to the intact plant (see Note 7). Using a yellow tip,
make small holes in the leaves (Fig. 1b).
2. Load the inoculum into a 1 mL plastic syringe and press the nozzle of the syringe (no needle) against the lower (abaxial) epidermis of an Arabidopsis leaf, covering the small hole with the
nozzle and holding the leaf with a gloved finger on the adaxial
face. Introduce the Agrobacterium in infiltration medium by
slow injection (Fig. 1c) (see Note 8).
3. Using a permanent marker, mark the infiltrated region.
4. Place the infiltrated Arabidopsis plants in the growth room
(light cycle 12 h light–12 h dark at 22–24 °C) for 2–5 days.
5. If the plants were infiltrated with Agrobacterium carrying a
fluorescent reporter, check for the presence of the fluorescent
protein (FP) using a fluorescence stereomicroscope equipped
with the appropriate filters (Fig. 2). The exposure time should
be adjusted with a nontransformed leaf (Fig. 2a) to distinguish
the FP signal from autofluorescence (Fig. 2b) (see Note 9).
3.5 Confocal Imaging
1. Excise a marked area of the leaf and mount it on a glass microscope slide containing a few drops of water.
2. Fill a 1 mL plastic syringe fitted with a needle with perfluorodecalin,
apply a few drops onto the leaf, and place the cover glass over the leaf (see
Note 10).
3. Examine with a confocal laser scanning microscope, using a
63× (NA 1.4) oil immersion objective (see Note 11). GFP was
excited at 488 nm (Ar 100 mW Laser) and detected in the
496–532 nm range. YFP was excited at 514 nm (Ar 100 mW
Laser) and detected in the 525–559 nm range (Fig. 3a).
mCherry and RFP were excited at 543 nm (HeNe 1.5 mW
laser) and detected in the 570–630 nm range (Fig. 3b). To
analyze colocalization, combine both channels (Fig. 3c) (see
Notes 12 and 13).
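When two fluorescent proteins are imaged in the same sample, it helps to know which detection windows overlap, since those are the pairs where sequential scanning matters most. The Python sketch below compares the detection ranges given in step 3. It is only an illustration: overlapping windows are one indicator of possible bleed-through, and the emission spectra of the proteins themselves are not modeled. The dictionary and function names are assumptions.

```python
from itertools import combinations

# Detection windows (nm) as stated in the confocal imaging step.
DETECTION_WINDOWS_NM = {
    "GFP": (496, 532),
    "YFP": (525, 559),
    "mCherry/RFP": (570, 630),
}


def window_overlap_nm(a, b):
    """Width in nm shared by two (low, high) detection windows, 0 if disjoint."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def overlapping_pairs(windows):
    """Map each pair of channel names to the width of their shared window."""
    result = {}
    for (name_a, win_a), (name_b, win_b) in combinations(windows.items(), 2):
        result[(name_a, name_b)] = window_overlap_nm(win_a, win_b)
    return result


if __name__ == "__main__":
    for pair, width in overlapping_pairs(DETECTION_WINDOWS_NM).items():
        print(pair, f"{width} nm shared")
```

With the settings above, only the GFP and YFP windows share a range (525–532 nm); the YFP and mCherry/RFP windows used for the colocalization in Fig. 3 do not overlap.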
Fig. 2 Fluorescence micrographs of Arabidopsis leaves 3 days post-agroinfiltration. (a) Control leaf infiltrated
with Agrobacterium without the plasmid containing the FP. (b) Leaf infiltrated with Agrobacterium with the
plasmid containing the gene of interest fused to RFP (red fluorescent protein). Scale bar 2 mm
Fig. 3 Confocal laser scanning micrographs of Arabidopsis leaves agroinfiltrated with ER-YFP and GI-RFP. (a) Yellow
channel. (b) Red channel. (c) Merged channels. Scale bar 10 μm
4 Notes
1. Arabidopsis is a facultative long-day plant whose flowering is
delayed as the daily light period shortens.
This photoperiod was chosen to promote leaf growth without
drastically altering the flowering time. Arabidopsis plants are
usually watered every 2 days.
2. Older plants with larger leaves also work, but the transformation efficiency decreases rapidly with the increase of plant age.
3. When colony PCR is performed using Agrobacterium cells,
the initial step at 94 °C should last 10 min instead of 4 min to
promote lysis of the cells. After this step, add the mix containing dNTPs, primers, and Taq DNA polymerase.
4. Agrobacterium tumefaciens GV3101 is resistant to gentamicin
(30 μg/mL) and rifampicin (10 μg/mL) and is sensitive to
kanamycin, so it is a good strain for use with binary vectors that
contain the nptII gene.
5. Store several colonies for each vector, since there are differences in the expression levels of different colonies carrying the
same binary vector.
6. The density of the bacterial suspension is also important for
infiltration. Suspensions with an OD600nm below 0.1 result in
weak transgene expression. Infiltrations with bacterial suspensions with an OD600nm above 1.0 often result in tissue yellowing
or wilting. The best results are obtained with suspensions at an
OD600nm between 0.4 and 0.6.
7. Agroinfiltration is preferably conducted during late afternoon
or evening so that T-DNA transfer occurs overnight.
8. Plants of similar size should be selected for optimal comparisons of experimental controls and tests. In addition, infiltration should be performed with leaves of the same age. Usually,
leaves 6–8 are chosen for infiltration.
9. Observation can be performed on the whole plant without
cutting the leaf, which allows time-course analysis.
10. Perfluorodecalin has a low surface tension [18]; therefore,
it penetrates leaf stomatal pores and fills the intercellular air
spaces of the mesophyll. Treatment with perfluorodecalin
increases sensitivity and improves the quality of the images.
11. The fluorescence is detected only in cells of the epidermis of
the leaf. No fluorescence is found in leaf mesophyll cells, indicating that Agrobacterium was only able to transfer the T-DNA
to cells of the leaf outer layers.
12. Simultaneous detection of RFP/mCherry and YFP or GFP is
performed by combining the settings indicated above in the
sequential scanning as instructed by the manufacturer.
13. When working with fusion proteins, the size of the protein of
interest (Pi) fused to the FP reporter should be analyzed by
Western blot to make sure that the Pi has not been separated from the FP by proteolytic cleavage.
Acknowledgements
This research was supported by the Agencia Nacional de Promoción
Científica y Tecnológica (ANPCyT) through the grants PICT2007-0479 and PICT2010-2366 to Silvana Petruccelli and by
Universidad Nacional de La Plata (project 11X/498). Silvana Petruccelli
is a member of the Consejo Nacional de Investigaciones
Científicas y Técnicas de Argentina (CONICET). Silvina Mangano
is a researcher at the Departamento de Ciencias Biológicas, Facultad
de Ciencias Exactas, Universidad Nacional de La Plata.
References
1. Clough SJ, Bent AF (1998) Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant J 16:735–743
2. Yang Y, Li R, Qi M (2000) In vivo analysis of plant promoters and transcription factors by agroinfiltration of tobacco leaves. Plant J 22:543–551
3. Wroblewski T, Tomczak A, Michelmore R (2005) Optimization of Agrobacterium-mediated transient assays of gene expression in lettuce, tomato and Arabidopsis. Plant Biotechnol J 3:259–273
4. Lee MW, Yang Y (2006) Transient expression assay by agroinfiltration of leaves. Meth Mol Biol (Clifton NJ) 323:225–229
5. Joh LD et al (2005) High-level transient expression of recombinant protein in lettuce. Biotechnol Bioeng 91:861–871
6. Orzaez D et al (2006) Agroinjection of tomato fruits. A tool for rapid functional analysis of transgenes directly in fruit. Plant Physiol 140:3–11
7. Kumagai H, Kouchi H (2003) Gene silencing by expression of hairpin RNA in Lotus japonicus roots and root nodules. Mol Plant Microbe Interact 16:663–668
8. Shang Y et al (2007) Methods for transient assay of gene function in floral tissues. Plant Methods 3:1
9. Li JF et al (2009) The FAST technique: a simplified Agrobacterium-based transformation method for transient gene expression analysis in seedlings of Arabidopsis and other plant species. Plant Methods 5:6
10. Marion J et al (2008) Systematic analysis of protein subcellular localization and interaction using high-throughput transient transformation of Arabidopsis seedlings. Plant J 56:169–179
11. Boyko A, Matsuoka A, Kovalchuk I (2011) Potassium chloride and rare earth elements improve plant growth and increase the frequency of the Agrobacterium tumefaciens-mediated plant transformation. Plant Cell Rep 30:505–518
12. Tsuda K et al (2012) An efficient Agrobacterium-mediated transient transformation of Arabidopsis. Plant J 69:713–719
13. Rakousky S et al (1997) Transient β-glucuronidase activity after infiltration of Arabidopsis thaliana by Agrobacterium tumefaciens. Biol Plant 40:33–41
14. McIntosh KB et al (2004) A rapid Agrobacterium-mediated Arabidopsis thaliana transient assay system. Plant Mol Biol Rep 22:53–61
15. Campanoni P et al (2007) A generalized method for transfecting root epidermis uncovers endosomal dynamics in Arabidopsis root hairs. Plant J 51:322–330
16. Koncz C, Schell J (1986) The promoter of TL-DNA gene 5 controls the tissue-specific expression of chimaeric genes carried by a novel type of Agrobacterium binary vector. Mol Gen Genet 204:383–396
17. Nakagawa T et al (2007) Improved gateway binary vectors: high-performance vectors for creation of fusion constructs in transgenic analysis of plants. Biosci Biotechnol Biochem 71:2095–2100
18. Sargent JW, Seffl RJ (1970) Properties of perfluorinated liquids. Fed Proc 29:1699–1703
Chapter 9
iTILLING: Personalized Mutation Screening
Susan M. Bush and Patrick J. Krysan
Abstract
One powerful approach to studying gene function is to analyze the phenotype of an organism carrying a
mutant allele of a gene of interest. In order to use this experimental approach, one must have the ability to
easily isolate individual organisms carrying desired mutations. A widely used method for accomplishing
this task in plants and other organisms is a procedure called TILLING. A traditional TILLING project has
at its foundation an ordered mutant population produced by treating seeds with a chemical mutagen.
From this mutagenized seed, thousands of individual mutant lines are produced, and corresponding DNA
samples are collected. For several plant species, publicly accessible screening facilities have been established
that perform mutant screens on a gene-by-gene basis in response to customer requests using PCR and
heteroduplex detection methods.
The iTILLING method described in this chapter represents an individualized version of the TILLING
process. Performing a traditional TILLING experiment requires a large investment in time and resources
to establish the well-ordered mutant population. By contrast, iTILLING is a low-investment alternative
that provides the individual research lab with a practical solution to mutation screening. The main difference between the two approaches is that iTILLING is not based on the establishment of a durable, organized mutant population. Instead, a system for growing Arabidopsis seedlings in 96-well plates is used to
produce an ephemeral mutant population for screening. Because the intention is not to develop a long-term resource, a considerable savings in time and money is realized when using iTILLING as compared to
traditional TILLING. iTILLING is not intended to serve as a replacement for traditional TILLING.
Rather, iTILLING provides a strategy by which custom mutagenesis screens can be performed by individual labs using unique genetic backgrounds that are of specific interest to that research group.
Key words TILLING, Mutagenesis, Mutation detection, Mutation screening, Reverse genetics,
iTILLING
1 Introduction
Reverse genetics is a well-established method for analyzing gene
function in plants. The reverse genetic process begins with the
scientist isolating plants that carry a mutation within a gene of
interest. These mutant individuals are then analyzed to determine
if any abnormal phenotypes can be attributed to the mutations.
TILLING (Targeting Induced Local Lesions IN Genomes) is a
commonly used reverse genetic strategy that was originally
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_9, © Springer Science+Business Media New York 2014
developed using Arabidopsis thaliana, and it constitutes a method
for screening an ordered population of randomly mutagenized
plant lines for the presence of mutations in any gene of interest [1].
The most widely used mutagen for TILLING experiments is the
chemical ethyl methanesulfonate (EMS), which produces mainly
single-base change mutations that convert G/C base pairs to A/T
[2]. The first step in a traditional TILLING project is to produce
an ordered population of several thousand independent mutagenized lines. These mutant lines are maintained individually or in
small pools composed of a few lines each. DNA samples are then
prepared from the ordered population, again as individuals or small
pools. In order to find mutations within this population, PCR
amplification is performed using primers that amplify sequences
from a gene of interest. These PCR reactions are performed using
DNA samples from small pools of mutant lines, typically four to
eight mutant lines per pool. Mutation detection is accomplished
using any of a variety of methods that allow one to detect the presence of heteroduplexes within a population of gene-specific PCR
products [1, 3–5]. For example, if all of the plants present in a
given pool of mutagenized lines carry the wild-type allele of the
gene targeted by PCR, all of the PCR products will be homoduplexes. However, if one of the plants in the pool carries a mutation
in the target gene, then some of the PCR products produced in
that pool will form heteroduplexes when sequences amplified from
the mutant gene anneal with wild-type copies of the same amplicon
produced from the other lines present in that pool. The traditional
method for heteroduplex detection with TILLING has been the
use of an endonuclease treatment to cleave heteroduplexes, followed by gel electrophoresis to visualize cleavage products and
identify associated plants or pools carrying a mutation [3]. More
recently, high-resolution melting analysis of PCR amplicons has been
shown to be an effective strategy for identifying heteroduplexes in
the context of TILLING screens [4, 6].
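The effect of pooling on heteroduplex formation described above can be estimated with a simple calculation. The Python sketch below assumes that every line in a pool contributes the same amount of template, that both alleles amplify equally, and that strands reanneal at random after denaturation. Under those assumptions, a mutant allele present at fraction f among the amplicons yields a heteroduplex fraction of 2f(1 − f). These are simplifying assumptions for illustration and are not taken from the chapter.

```python
def mutant_allele_fraction(pool_size, mutant_copies=1, ploidy=2):
    """Fraction of mutant alleles in a pool of diploid lines.

    mutant_copies=1 models a single heterozygous plant in the pool;
    mutant_copies=2 models a single homozygous mutant plant.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    return mutant_copies / (pool_size * ploidy)


def heteroduplex_fraction(f):
    """Expected fraction of heteroduplex molecules after random reannealing."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("allele fraction must be between 0 and 1")
    return 2.0 * f * (1.0 - f)


if __name__ == "__main__":
    for pool_size in (1, 4, 8):
        f = mutant_allele_fraction(pool_size)
        print(pool_size, f, heteroduplex_fraction(f))
```

Under these assumptions, a single heterozygous plant screened on its own gives the largest heteroduplex signal, and the signal falls as more wild-type lines are added to the pool.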
Because establishing a traditional TILLING project involves a
substantial investment of time and resources, it is not a practical
solution for the typical lab that wishes to perform their own reverse
genetic screen using a genetic background of their choice. By contrast, the iTILLING procedure described in this chapter has been
specifically developed to meet the needs of the individual laboratory that wishes to screen for mutations in a species or a genetic
background for which a traditional TILLING population is not
available. iTILLING accomplishes this goal by removing the need
to invest large amounts of time and money creating a durable,
ordered mutant population. Because it is based on the establishment of an ephemeral mutant population, iTILLING provides
users with the ability to quickly screen for mutations within a
handful of genes (Fig. 1). The first step in the iTILLING process
is to treat seeds with the chemical mutagen EMS, followed by the
Fig. 1 Work flow of iTILLING. Seeds are treated with EMS to produce the M1
population from which M2 seeds are collected in bulk. M2 seedlings are grown
on 96-well Ice-Cap plates, and tissue samples are collected 96 at a time. PCR
and melt-curve analysis are done to identify heteroduplex products indicative of
a mutation. Plants carrying a desired mutation are transplanted from the 96-well
plate to soil. The time required to go from initial mutagenesis to the identification
of mutations of interest is less than 4 months. Figure from ref. [7]
production of the ephemeral screening population. From this
population, genomic DNA samples are collected from individuals
grown in a 96-well format. Finally, using high-resolution melting,
the population of plants is screened for EMS-induced mutations
within genes of interest (Fig. 2). Recent advances in DNA
sequencing technology have raised the possibility that, instead of
Fig. 2 iTILLING mutation detection using high-resolution melt-curve analysis.
Characteristic melt curves of PCR products amplified from wild-type (wt)
Arabidopsis DNA and from DNA containing a heterozygous SNP are shown.
The thick line represents the melt curve of the heteroduplex product. The thin
line represents the melt peak of the wild-type homoduplex. −dRFU/dT, negative
change in relative fluorescence units over the change in temperature. Figure
adapted from ref. [7]
high-resolution melting, direct sequencing of PCR products
could be used to screen for mutations in an iTILLING ephemeral
population.
The following protocol describes the iTILLING process,
including the high-throughput seedling growth and tissue sample
collection process called Ice-Cap, as it can be used for the detection of mutations in genes of interest in Arabidopsis and other
plants using PCR-based screening and high-resolution melt-curve
analysis.
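Melt-curve analysis is usually displayed as the negative derivative of fluorescence with respect to temperature (−dRFU/dT), as in Fig. 2. The Python sketch below shows how such a curve and its peak can be computed from raw readings by central differences. The data are simulated with a simple sigmoid, and the function names are illustrative assumptions. Instrument software typically also applies normalization and smoothing, which are omitted here.

```python
import math


def simulated_melt_curve(tm=80.0, start=70.0, stop=90.0, step=0.5, width=1.0):
    """Generate a synthetic melt curve (temperatures, RFU) with midpoint tm."""
    n = int(round((stop - start) / step)) + 1
    temps = [start + step * i for i in range(n)]
    rfu = [1000.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]
    return temps, rfu


def negative_derivative(temps, rfu):
    """Return (temperature, -dRFU/dT) pairs using central differences."""
    if len(temps) != len(rfu) or len(temps) < 3:
        raise ValueError("need at least three matched readings")
    points = []
    for i in range(1, len(temps) - 1):
        slope = (rfu[i + 1] - rfu[i - 1]) / (temps[i + 1] - temps[i - 1])
        points.append((temps[i], -slope))
    return points


def melt_peak(temps, rfu):
    """Temperature at which -dRFU/dT is largest."""
    return max(negative_derivative(temps, rfu), key=lambda p: p[1])[0]


if __name__ == "__main__":
    temps, rfu = simulated_melt_curve(tm=80.0)
    print("melt peak at", melt_peak(temps, rfu), "°C")
```

A heteroduplex-containing sample would show a shifted or broadened curve relative to the wild-type homoduplex; comparing such curves is left to the analysis software of the instrument.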
2 Materials
2.1 EMS Mutagenesis and Collection of M2-Mutagenized Seed
1. Arabidopsis seeds with a genetic background of choice.
2. 0.2 % (v/v) ethyl methanesulfonate (EMS) (see Note 1).
3. 1 L flask.
4. NaOH for cleanup.
5. Squirt bottle.
6. 0.01 % agar.
7. Flats of moistened soil.
2.2 Seedling Growth Using Ice-Cap
1. M2 seeds.
2. 95 % ethanol.
3. Whatman filter paper.
4. Growth media: 0.5× Murashige and Skoog (MS) basal salt
mixture, 2 mM morpholinoethanesulfonic acid (MES), 0.6 % (w/v)
agar, pH 5.7. Autoclave to sterilize.
5. Seedling plates: 96-well deep well plates, e.g., Fisher Scientific
Nunc brand, 1-mL filter plates without frit (Fisher catalog no.
278012).
6. Adhesive sealing film.
7. Multichannel pipette, 20–200 μL volume.
8. Plastic reagent troughs for use with multichannel pipettors.
9. Clear plastic lid for each seedling plate.
10. Micropore tape.
11. Root plates: 96-well PCR plates with raised well rims.
12. Stainless steel ball bearings (3/32″ diameter).
13. Elastic bands to hold seedling plates and root plates together.
14. Shallow metal baking pan, e.g., jelly roll pan cookie sheet
(ca. 17″ × 12″).
15. Small submersible water pump.
16. Plastic tubing of appropriate size for the pump.
17. Plastic storage bin (ca. 26 L), longer than the metal baking pan and deeper
than the pump.
18. Metal rack with adjustable screws for leveling (Fig. 3).
19. Plastic clamps to secure the tubing to the metal baking pan.
2.3 DNA Collection Using Ice-Cap
1. Wooden skewers.
2. Dry ice.
3. 95 % (v/v) ethanol.
4. Freezing tolerant glass dish, e.g., Pyrex baking dish.
5. 96-Well metal thermal block for freezing root tissue.
6. Tris-EDTA solution (500 mM Tris, pH 8; 50 mM EDTA,
pH 8).
7. Thermal adhesion foil to seal root plates for DNA extraction.
8. Heat sealing machine.
9. GenoGrinder or other agitator equipped for 96-well plates.
10. Centrifuge equipped for 96-well plates.
2.4 Mutation Screening Using DNA Amplification and High-Resolution Melt-Curve Analysis
1. DNA collected from seedlings grown in Ice-Cap.
2. Gene-specific PCR primers, each at a concentration of 10 μM.
3. Deoxynucleotide triphosphates (dNTPs) at a total concentration of 100 μM (25 μM for each dNTP).
4. Taq polymerase, stable at room temperature.
5. 10× Taq buffer containing 750 mM Tris pH 9,
200 mM (NH4)2SO4, 30 mM MgCl2, 0.1 % (v/v) Tween 20.
6. SYTO13 double-stranded DNA-binding dye from Invitrogen.
Fig. 3 The Ice-Cap fountain is used to maintain a constant water level at the precise height of the tops of the
wells of the root plates. This continuous watering system ensures that the water in the wells of the root plates
does not become depleted due to evaporation or transpiration. (a) A homemade rack that supports the cookie
sheet on which the stacked Ice-Cap plates sit. (b) A closeup view of one of the 1″ nuts that provides a means
for precisely adjusting the level of the cookie sheet so that a uniform water depth is achieved across the surface
of the fountain. (c) A display of all the parts needed to construct the homemade rack for an Ice-Cap fountain.
(d) The assembled Ice-Cap fountain. A submersible fountain pump constantly moves water from the lower
reservoir to the cookie sheet, which rests on top of the homemade rack. A spring-loaded clamp is used to attach
the hose to the edge of the cookie sheet. Figure adapted from ref. [10]
7. Instrument capable of performing high-resolution DNA melting
analysis, such as the Bio-Rad CFX96 thermal cycler, equipped
with a camera to visualize changes in DNA-associated fluorescence with increasing temperatures.
8. Rubber tubing for attachment to pressurized air line.
9. Soil and pots for transplanting seedlings of interest.
3 Methods
3.1 EMS Mutagenesis and Collection of M2-Mutagenized Seed
1. Imbibe and stratify seeds of the genotype of interest in dH2O
at 4 °C for 2 days (see Note 2).
2. Treat M1 seeds with 0.2 % (v/v) EMS for 16 h at room
temperature in a 1 L flask shaking at 100 rpm under low light
(see Note 1).
3. Rinse the seeds eight to ten times in water by allowing the
seeds to settle and pouring off the water. For the final rinse,
allow the seeds to soak in the water for 1 h (see Notes 3 and 4).
4. Suspend the M1 seeds in 0.01 % agar in a squirt bottle. Using
the squirt bottle, plant the seeds evenly across the moistened
soil (see Note 4).
5. Allow the M1 plants to grow to maturity. Collect the M2 seeds
from all plants in bulk (see Note 5).
3.2 Seedling Growth Using Ice-Cap
1. Autoclave the 96-well seedling plates to sterilize them. Allow
the plates to dry in a laminar flow hood (see Note 6).
2. Seal the base of seedling plates using adhesive sealing film
(see Note 6).
3. Add 450 μL of growth media, still molten after autoclaving, to
the seedling plates. Use plastic reagent troughs and a multichannel pipettor to aliquot the media into the seedling plate
while the media is still liquefied (see Notes 7 and 8).
4. Allow the agar in the seedling plates to solidify in the flow
hood. If seeds are not going to be added to the plates immediately, cover each plate with a clear plastic lid, seal with micropore tape, and store at 4 °C.
5. Sprinkle dry M2 seeds onto dry Whatman filter paper. Dispense
95 % ethanol onto the seeds to surface-sterilize them. Allow
the seeds on the filter paper to air dry.
6. Plate the M2 seeds 1 per well onto the solidified agar surface of the seedling plates. Make sure to label the plates
(see Notes 9 and 10).
7. Cover each seedling plate with a clear plastic lid and seal using
micropore tape; wrap the plates in foil and store them in the
dark at 4 °C for 3 days to stratify the seeds.
8. After 3 days, remove the seedling plates from foil and place
them under fluorescent lights with the clear plastic lids still in
place for 4–7 days at 18–20 °C to germinate and grow. Remove
the clear lids after several days, especially if condensation occurs
(see Note 11).
9. After 4–7 days in the light, the seedlings will be ready to be
transferred to the Ice-Cap fountain. To begin this process, prepare one root plate for each seedling plate by placing a 3/32″
stainless steel bead in each well of the root plate, and then fill
the root plate with dH2O to the point that water is spilling out
of the wells (approximately 340 μL per well). Make sure to
label each root plate (see Note 12).
10. Assemble each seedling plate with its corresponding root plate
by first removing the sealing film from the base of each seedling
plate, and then inserting the base of each of the wells of the
seedling plate into the corresponding wells of the root plate.
Secure the upper and lower plates together using two or three
elastic bands.
11. Assemble the Ice-Cap fountain. Place the metal rack into the
storage bin. Place the metal pan atop the rack, and adjust the
screws to level the pan. Fill the storage bin two-thirds of the
way with a mixture of 3 parts distilled water to 1 part tap water
and place the submersible pump, attached to the plastic tubing, in the water. Affix the tubing to the metal pan using the
clamp, allowing the water to fill the pan and overflow into the
bin. Adjust the leveling screws once again if necessary to
achieve a uniform water level across the pan. Components and
assembly of the Ice-Cap fountain are described in Fig. 3.
12. Place the seedling/root plate assemblies in the Ice-Cap fountain. Allow the plants to grow in the Ice-Cap fountain until the
seedling roots have penetrated the agar and grown down to
the bottoms of the root plates. The ideal temperature for
growing Arabidopsis seedlings in an Ice-Cap fountain is 18 °C.
This stage of the process may take from 10 days to 3 weeks
depending on the specific growth conditions and genotype
(see Notes 13 and 14).
13. When the seedling roots have reached the bottom of the root
plate in the majority of the wells, remove the seedling/root
plate assemblies from the fountain. Insert 2–3 wooden skewers
between each seedling plate and its corresponding root plate
to slightly separate the assembled plates. The elastic bands
should remain on the assembled plates at this stage. Allow the
seedling/root plate assemblies containing the wooden skewers to stand under light for one day outside of the Ice-Cap
fountain to allow the water level to drop in the wells of the
root plates (see Note 15).
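The water volumes in steps 9 and 11 scale with the number of plates and the size of the storage bin. The short sketch below works through that arithmetic. The function names and the 60 L bin capacity in the usage lines are illustrative assumptions, not values from this protocol.

```python
# Water volumes for root plates (step 9) and the fountain reservoir (step 11).
WELLS_PER_PLATE = 96
WATER_PER_WELL_UL = 340  # approximate per-well fill volume stated in step 9


def root_plate_water_ml(n_plates):
    """Total dH2O (mL) needed to fill the wells of n_plates root plates."""
    return n_plates * WELLS_PER_PLATE * WATER_PER_WELL_UL / 1000.0


def reservoir_mix_l(bin_capacity_l, fill_fraction=2 / 3,
                    distilled_parts=3, tap_parts=1):
    """Litres of distilled and tap water for a bin filled to fill_fraction.

    The 3:1 distilled:tap ratio and two-thirds fill come from step 11;
    bin_capacity_l is whatever storage bin is actually used (assumption).
    """
    total = bin_capacity_l * fill_fraction
    distilled = total * distilled_parts / (distilled_parts + tap_parts)
    tap = total - distilled
    return distilled, tap


if __name__ == "__main__":
    print(f"Water for 10 root plates: {root_plate_water_ml(10):.1f} mL")
    distilled, tap = reservoir_mix_l(60)  # hypothetical 60 L bin
    print(f"Reservoir: {distilled:.1f} L distilled + {tap:.1f} L tap")
```

In practice the wells are filled to overflowing, so the per-plate figure is a lower bound on the water to have on hand.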
3.3 DNA Collection Using Ice-Cap
1. On the day of root tissue collection, prepare a freezing bath in
a Pyrex dish using 95 % ethanol and dry ice. Place a 96-well
thermal block in this freezing bath and allow it to equilibrate
for 20–30 min (see Note 16).
2. Place the seedling plate/root plate assembly, still held together
by elastic bands and still containing the wooden skewers, into
the frozen thermal block. Freeze the root plate for 5 min. After
freezing, remove the assembled plates from the thermal block
and place them on the lab bench at room temperature. Remove
the elastic bands and the wooden skewers from the stacked
plates. Firmly press down on the top of the seedling plate to
“crack” the plates, and then carefully peel the root plate and
the seedling plate apart.
3. Seal the base of the seedling plates with film. Wrap the seedling
plates in foil and transfer them to 4 °C in the dark for storage
(see Note 17).
4. Allow the water in the root plates to thaw completely at room
temperature. Inspect the plates to determine if any wells have
substantially less water than average. Hand pipette distilled
water into wells that require additional water (see Note 18).
5. Add 25 μL of a Tris-EDTA solution (500 mM Tris, pH 8;
50 mM EDTA, pH 8) to each well of the root plate.
6. Seal the root plates with thermal adhesion foil using a
heat-sealing machine.
7. Agitate the sealed root plates using a GenoGrinder machine
for 4 min at 1,350 strokes per minute in order to pulverize the
root tissue with the steel ball that is present in each well. Next,
centrifuge the root plates for 10 min at 2,100 × g at 4 °C to
pellet the cellular debris.
8. The supernatant liquid from the root plate will contain genomic
DNA. Dilute this supernatant in dH2O at a ratio of 1:5. Use
2 μL of this diluted extract as the template in a 20 μL PCR
reaction (see Note 19).
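Steps 5 and 8 involve two simple dilutions: the Tris-EDTA stock is diluted by the water already in each well, and the extract is diluted again before and during PCR setup. The sketch below shows the arithmetic under two stated assumptions. First, the well volume at the time of buffer addition is not given in the protocol (it starts near 340 μL and drops while the plates stand), so the 300 μL used here is a placeholder. Second, the 1:5 ratio is read as one part extract in five parts total. The function names are illustrative.

```python
def final_concentration(stock_conc, stock_vol, existing_vol):
    """Concentration after adding stock_vol of stock to existing_vol of water.

    Uses C2 = C1 * V1 / (V1 + V2); units of the result follow stock_conc.
    """
    return stock_conc * stock_vol / (stock_vol + existing_vol)


def overall_template_dilution(extract_parts=1, total_parts=5,
                              template_ul=2, reaction_ul=20):
    """Fold dilution of the crude extract in the final PCR reaction.

    Assumes '1:5' means extract_parts in total_parts (an interpretation,
    not something the protocol states explicitly).
    """
    pre_dilution = total_parts / extract_parts
    in_reaction = reaction_ul / template_ul
    return pre_dilution * in_reaction


if __name__ == "__main__":
    well_ul = 300  # assumed water remaining in the well; measure if it matters
    tris = final_concentration(500, 25, well_ul)  # mM
    edta = final_concentration(50, 25, well_ul)   # mM
    print(f"Tris ~{tris:.1f} mM, EDTA ~{edta:.1f} mM in the well")
    print(f"Extract is ~{overall_template_dilution():.0f}-fold diluted in PCR")
```

If the 1:5 ratio is instead taken as one part extract plus five parts water, pass total_parts=6.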
3.4 Mutation Screening Using DNA Amplification and High-Resolution Melt-Curve Analysis
1. In advance, design PCR primers specific for your gene or genes
of interest. In most cases, these PCR amplicons should target
regions of the gene that encode highly conserved domains of
the encoded protein or regions that maximize the probability
of identifying nonsense mutations (see Note 20).
2. Use the DNA collected using Ice-Cap as the template for PCR
reactions that amplify targeted regions of your gene of interest.
The double-stranded DNA-binding dye SYTO13 should be
included in the PCR reaction mix. In a 20 μL reaction, use
2 μL DNA, 0.2 μM of each PCR primer, 2.5 μM SYTO13
nucleotide-binding dye, 0.2 mM each dNTP, 2 μL 10× PCR
buffer, and Taq polymerase. This PCR amplification step can
be done on any thermal cycler, without the requirement of a
fluorescence detection camera (see Notes 21–23).
3. After amplification, transfer the PCR plate to an instrument
that can perform high-resolution melting analysis. Melt the
PCR products using a protocol such as 96 °C for 30 s, 40 °C
for 15 s, ramp from 72 °C to 83 °C at 0.1 °C per s, capturing
fluorescence images at each temperature. The SYBR/FAM
emission/detection channel (450–530 nm) can be used to
detect fluorescence of SYTO13 bound to double-stranded
PCR amplicons (see Notes 23–26).
4. To identify the presence of a mutation in a given PCR amplicon,
melt-curve analysis must be performed. The presence of a
heterozygous SNP in the template DNA will result in a substantial change in the shape of the melt curve when compared
to the wild-type control. Specifically, the d(RFU)/dT melt
peak will display a distinctive shoulder on the low temperature
side of the curve as the result of heteroduplex products present
in the mixture (Fig. 2). A heteroduplex product, composed of
one wild-type and one mutant DNA strand, will initiate melting at a slightly lower temperature than the corresponding
homoduplex due to the single-base mismatch present in a
heteroduplex (see Notes 25, 27, and 28).
5. In a typical experiment, one will process dozens of Ice-Cap
plates, thereby extracting DNA from thousands of individual
seedlings from the M2 population. These individuals will usually be screened for mutations using a number of different
PCR primer pairs. Once an individual seedling has been identified as potentially carrying a mutation of interest, it should be
extracted from the seedling plate that has been stored at
4 °C. Remove the seedling from the well by the application
of a low velocity stream of air from a pressurized air source
using rubber tubing to direct the air stream to the opening in
the bottom of the seedling plate. The pressurized air will cause
the agar plug to pop out of the well, with the seedling included
(see Notes 29 and 30).
6. Transplant the seedling of interest to soil, retaining the agar
plug surrounding the root tissue in order to increase seedling
viability.
7. Once the transplanted seedling has adapted to growth in soil,
collect a leaf sample and prepare a traditional DNA extraction
from the plant of interest.
8. Confirm any mutations by repeated PCR amplification and melt-curve analyses using the freshly isolated DNA template, and
then by Sanger sequencing to determine the precise mutation
(see Note 29).
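The melt-peak comparison in steps 3 and 4 can be sketched numerically. The example below computes the negative derivative of fluorescence with respect to temperature and reports how far a sample's melt peak rises above a wild-type reference on the low-temperature side. It is a toy illustration only: the sigmoid curves, the melting temperatures, the 50:50 homoduplex/heteroduplex mixture, and all function names are assumptions, and instrument software will use its own normalization and calling rules.

```python
import math


def normalize(values):
    """Rescale a fluorescence trace to the range 0..1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def melt_peak(temps, rfu):
    """Return (temperatures, -dRFU/dT) at interior points (central differences)."""
    mids, neg_slopes = [], []
    for i in range(1, len(temps) - 1):
        slope = (rfu[i + 1] - rfu[i - 1]) / (temps[i + 1] - temps[i - 1])
        mids.append(temps[i])
        neg_slopes.append(-slope)
    return mids, neg_slopes


def peak_temperature(temps, rfu):
    """Temperature at which -dRFU/dT is largest."""
    mids, neg_slopes = melt_peak(temps, rfu)
    best = max(range(len(mids)), key=lambda i: neg_slopes[i])
    return mids[best]


def low_side_excess(temps, sample, reference):
    """Largest amount by which the sample melt peak exceeds the reference
    at temperatures below the reference peak (a crude 'shoulder' score)."""
    ref_tm = peak_temperature(temps, reference)
    mids, d_sample = melt_peak(temps, normalize(sample))
    _, d_ref = melt_peak(temps, normalize(reference))
    return max(s - r for t, s, r in zip(mids, d_sample, d_ref) if t < ref_tm)


def duplex_signal(t, tm, width=0.4):
    """Toy sigmoid for the fraction of duplex remaining at temperature t."""
    return 1.0 / (1.0 + math.exp((t - tm) / width))


# Simulated 72-83 degC ramp in 0.1 degC steps, as in step 3.
temps = [72.0 + 0.1 * i for i in range(111)]
wild_type = [duplex_signal(t, 78.0) for t in temps]
# Hypothetical heterozygote: half homoduplex, half lower-melting heteroduplex.
heterozygote = [0.5 * duplex_signal(t, 78.0) + 0.5 * duplex_signal(t, 76.5)
                for t in temps]

if __name__ == "__main__":
    print("Wild-type peak near", round(peak_temperature(temps, wild_type), 1))
    print("Shoulder score (het vs WT):",
          round(low_side_excess(temps, heterozygote, wild_type), 3))
```

A score near zero indicates a trace that matches the reference; any threshold for calling a candidate mutation would need to be set empirically for each amplicon.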
4
Notes
1. EMS is a mutagen, not only for plants but also for humans.
Do all EMS work in a fume hood, and wear a lab coat and gloves
at all times. EMS may have a variable rate of mutagenicity, based
on the age of the solution and the quality of the seeds to which
it is applied. Using 0.2 % (w/v) EMS, one may expect about
50 % mortality of seeds planted. A small-scale test mutagenesis
may, however, be useful in determining the actual rate of mortality with the EMS solution and seeds intended for experimental use. For the typical iTILLING experiment, one should plan
to produce an M1 population of 10,000–20,000 individuals in
order to have a large sample of mutations from which to screen.
The total number of seeds to mutagenize will therefore depend
on the size of final mutant population that is desired and the
mortality rate achieved by the specific EMS treatment used.
2. The genetic background chosen for this mutagenesis will
depend on the specific experimental goals of the scientist performing the experiment. For example, one may wish to isolate
mutations in closely linked members of a tandemly duplicated
gene family. In this case, one could choose as the starting material a plant that is homozygous for a T-DNA insertion within
one member of this tandem gene family [7]. By mutagenizing
seed from a plant that is homozygous for the T-DNA insertion
allele, one would be able to screen for EMS-induced point
mutations in the linked gene family members.
3. Clean the EMS waste using NaOH. EMS-contaminated rinse
water should be brought to a concentration of 2 N NaOH and
glassware can be soaked in 2 N NaOH for 2 days as well. Solid
EMS waste, such as gloves and tips, should be kept separately
from other chemical waste [8].
4. After the final seed rinse, it may be useful to aliquot the treated,
rinsed seeds into 1.5 mL tubes. When planting using a squirt
bottle, dividing the seeds in advance will ensure even planting
over a large soil area. We have found that M1 plants can be
grown to maturity in soil at a density of up to 1.3 plants per
square centimeter. Planting density should take into account the
expected seedling mortality caused by the EMS treatment.
5. In this protocol, collection of the entire population of M2
seeds occurs in bulk. This is in contrast to traditional TILLING,
where M2 seeds are collected separately for each M1 individual.
Because iTILLING is designed to screen each seedling individually at a given set of genetic loci, no cataloging or storage
of seed from individual lines is required.
6. After autoclaving the seedling plates, be sure to dry the plates
thoroughly in the flow hood before applying the sealing film to
prevent poor adhesion and consequent leakage of agar. In
place of sealing film, clear plastic packing tape can be used
as a more economical way to seal the bottoms of
the 96-well seedling plates. To firmly and evenly affix the sealing film or tape to the seedling plate, a handheld microseal
plate roller can be used.
7. One liter of media can be used to fill approximately 18 Ice-Cap
seedling plates.
8. If adding the molten growth media by hand using a multichannel
pipettor, it is wise to add one-third of the volume of media to
all wells first, allowing it to solidify in the well, before adding
the remaining volume to each well. This will prevent or reduce
the likelihood that molten media will leak through the bottom of plates near the sealing film. The media can also be
added to the sealed plates using an automated microplate liquid
dispenser.
Fig. 4 The steel beads dispenser used in assembling Ice-Cap plates. Above, a
photograph of the homemade 96-well steel bead dispenser without metal balls,
made using several sheets of aluminum foil wrapped around the lid of a pipette
tip box. Below, the same device is shown with 96 steel beads of 3/32″ diameter
loaded on top of it. Figure adapted from ref. [11]
9. In addition to Arabidopsis, both tomato and rice seedlings
have been grown successfully using Ice-Cap [9, 10]. The Ice-Cap
strategy should also be useful in growing and collecting tissue
from additional species of plants as long as their seeds are small
enough to fit in the wells of a 96-well plate.
10. To plate seeds into the seedling block, use a 200 μL pipette tip
or a Pasteur pipette that has been heated to melt and seal the
opening at the tip. Moisten the tip of this modified pipette on
the agar surface, use it to pick up a single seed from the filter
paper, and then place the seed gently onto the agar. Alternatively,
seeds can be dropped into the wells of the seedling block one
at a time by carefully tapping seeds from a piece of creased
paper. Working with batches of 6–10 seeds on the sheet of
paper is most effective.
11. M2-mutagenized seedlings may have a higher rate of mortality
than wild-type seedlings. To maximize the number of seedlings
screened per plate, additional seedlings may be germinated on
agar plates and transplanted into wells in the Ice-Cap block
that contain seeds that did not germinate. It is important to
transfer the seedling plates to the Ice-Cap fountain before the
roots reach the bottoms of the wells and contact the sealing
film on the bottom of the seedling plates.
12. The stainless steel balls can be efficiently added to the root
plates using a custom-made ball-dispensing device (Fig. 4) [11].
This device is made by placing a single sheet of aluminum foil
over the surface of a 96-well PCR plate and using a marker to
note the locations of the centers of each well on the surface of
the foil. This marked sheet of foil is then placed on top of an
additional 6 sheets of aluminum foil and wrapped around the
smooth surface of a lid from a used pipette tip box. A sharp tool
such as a wooden skewer can then be used to create a divot large
enough to hold a single steel bead at each of the 96 positions.
To fill this dispensing device with balls, place the device in a Pyrex
dish and pour an excess of steel beads over the device and shake
it horizontally to remove excess beads. The root plate is then
placed over the top of the dispensing device, which is flipped over
to drop one bead into each well of the root plate.
13. The water in the fountain will need to be maintained at a level
that is sufficient to keep the pump submerged throughout the
period of plant growth. To accomplish this task, a mixture of
3 parts distilled water to 1 part tap water should be added to
the fountain every few days to replenish water lost due to
evaporation.
14. Growth of the seedling roots to the bottom of the root plate
may take anywhere from 10 days to 3 weeks. This growth rate
is based on seedlings grown in continuous light at 18–20 °C.
Wild-type seedlings will grow more quickly, on average, than a
mutagenized population of seedlings.
15. Seedling roots can be collected the day of removal from the
fountain; however, a high volume of water in the root plate can
make separation of the upper and lower plates more challenging
after freezing the root plate for tissue capture.
16. To prepare the freezing bath, place the 96-well metal thermal
block(s) into the glass dish. Cover each block with a clear
plastic lid to avoid filling the wells with ethanol or dry ice.
Pour about ½ in. 95 % ethanol in the glass dish first, and then
add the dry ice. Add more ethanol or dry ice as necessary. After
equilibrating the thermal blocks, remove the clear lids before
attempting to freeze the root plates. Place an autoclave glove
or other insulating material under the freezing bath to protect
the bench top.
17. 96-well seedling plates containing Arabidopsis seedlings can be
wrapped in foil and stored in the dark at 4 °C for at least 1
month without loss of seedling viability, thereby allowing the
researcher sufficient time to screen for mutations in a number
of different loci while the ephemeral mutant population lies
effectively dormant in the refrigerator.
18. Thawing of the liquid in the root plate can be expedited by
incubating the root plates in a thermal heat block set at
25 °C.
19. Different dilution rates can be empirically tested to determine
if better PCR performance is achieved with an alternative dilution ratio. This protocol produces a crude extract of soluble
cellular components as well as genomic DNA; therefore,
higher dilution levels may have the potential to produce better
PCR results in some situations. Both higher and lower dilution rates should therefore be tested when troubleshooting
the procedure.
20. EMS induces primarily G/C → A/T mutations [2], and only
4 codons can be altered in this way to produce stop codons:
CAA(Gln), CAG(Gln), CGA(Arg), and TGG(Trp). To maximize the chance of finding nonsense mutations with iTILLING, PCR amplicons should therefore be chosen that target
regions of the gene that are enriched for the four codons listed
above.
21. A liquid-handling robot or multichannel pipettes can be used
to streamline the liquid-handling steps needed to process the
PCR reactions.
22. A hot-start version of the Taq DNA polymerase should be
used when setting up the PCR reactions to allow reaction
setup at room temperature, such as a previously described
mutant form of the enzyme that has reduced activity at room
temperature [12].
23. We found that a saturating dye, rather than a non-saturating dye, works better for high-resolution
melt-curve analysis. SybrGreen (a non-saturating dye) and
EvaGreen dyes did not perform well in our hands when
screening for the presence of heteroduplexes. We found that
SYTO13 dye (Invitrogen), a saturating DNA-binding dye typically used for cell staining with flow cytometry, works well for
heteroduplex detection in PCR amplicons.
24. For high-resolution melt-curve generation, the initial melting
and reannealing steps are critical to ensure dissociation of
PCR homoduplexes and allow creation of heteroduplexes
wherever a single-base mismatch may be present in the amplicon. The optimal range of melting temperatures will vary with
PCR amplicon length and sequence composition and should
be empirically determined for each amplicon.
25. We have used the Bio-Rad CFX96 PCR Detection System to
visualize heteroduplexes in PCR products ranging from 100 to
120 bp in size. Single-base mismatches can be detected in
much longer amplicons when using a higher-resolution
melting system, such as the LightScanner System from Idaho
Technology [4, 13].
26. As an alternative to high-resolution melting, direct sequencing of PCR products utilizing next-generation sequencing
technologies could be used to identify mutations in amplicons of
interest. Methods have been developed that allow the addition
of DNA barcodes to samples during processing for DNA
sequencing, and these methods could be used to multiplex
PCR products from a number of individual lines prior to
sequencing [14, 15]. A recent implementation of this strategy
allowed for identification of SNPs from a pool of 768 individuals using a multidimensional pooling strategy and the Illumina
sequencing platform [5]. Because DNA samples prepared for
iTILLING are already in 96-well format, it would be straightforward to design pooling strategies that optimize mutation
detection and minimize cost, depending on the particular
DNA sequencing platform available. The advantage of using
DNA sequencing to screen for mutations is that one would be
able to directly identify the precise mutation present in a given
line. For an iTILLING screen based on direct sequencing, it
would not be necessary to narrow down the mutation of interest to a single plant based only on the DNA sequence data;
consequently, it would not be necessary to barcode individual
seedlings separately. For example, one could design a pooling
strategy in which the DNA sequencing data revealed the
96-well plate in which the mutation of interest was present.
Follow-up screening by targeted PCR and melt-curve analysis
could then be used to quickly identify the individual seedling
carrying the mutation of interest. Because the precise sequence
of the mutation would be known and only one 96-well plate
would need to be screened, this step in the procedure would
be cheap and efficient.
27. Traditional TILLING has intrinsically high throughput as a
result of the pooling strategies it uses. The rapid timeline of
iTILLING means that the identification of mutations does not
require pooling of DNA samples, though pooling could be
applied. Multiple plants could be grown and sampled together
on the 96-well plates, such as in the two-per-well growth strategy discussed in Note 28 [7]. Alternatively, plant tissue could
be harvested and DNA samples prepared individually and then
individual DNA extracts combined to form a pool, as in traditional TILLING. Sensitivity of the high-resolution mutation
detection platform is the main factor limiting the extent to
which DNA extracts can be pooled, and use of higher resolving
power will allow detection of single-base-change mutations in
more highly pooled samples, as well as in amplicons of greater
length [4, 13, 16].
28. The iTILLING protocol described here involves growing one
seedling per well. In an M2 population of plants, a nonlethal
induced mutation is expected to segregate in the standard
Mendelian fashion of 1:2:1. A given induced mutation is
therefore expected to be present in both homozygous and
heterozygous forms in the screening population. By using DNA
extracted from seedlings grown one per well, mutations that are
homozygous will not be detected since no heteroduplexes will
be present in the corresponding PCR reactions. We have found
that it is possible to identify both homozygous and heterozygous mutations in DNA samples collected from seedlings
grown 2 per well in Ice-Cap. Using our Bio-Rad CFX96
high-resolution melt system, the rate of mutation detection in
seedlings grown two per well was similar to the rate of detection of mutations in seedlings grown one per well [4, 13].
29. When a plant carrying a mutation in a gene of interest is identified using iTILLING, that plant can be transplanted to soil and
M3 seeds can be directly collected from the M2 parent. This is
in contrast to traditional TILLING, where identification of a
mutation of interest in a pooled sample would require further
screening to find individuals of interest [17].
30. When extracting a seedling of interest from the seedling plate,
use care to prevent the seedling from being destroyed in the well
or on the benchtop by the application of air of excessively high
pressure. The only seedlings that are transferred to soil are those
carrying mutations of interest, which means that very little
growth chamber space is needed to produce and screen the
entire mutant population. Most seedlings never leave the 96-well
Ice-Cap plates and are discarded at the end of the experiment.
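The four codons listed in Note 20 can be recovered by enumeration. The sketch below tries every single G to A or C to T change (the EMS-type transitions as read on the coding strand, where a C to T change on the opposite strand appears as G to A) in each sense codon and keeps those that yield a stop codon. The helper for counting such codons in a coding sequence, and the example sequence, are illustrative assumptions and not part of the published protocol.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}
# EMS-type transitions as they appear on the coding strand.
EMS_TRANSITIONS = {"G": "A", "C": "T"}


def ems_stop_precursors():
    """Map each sense codon to the stop codons reachable by one transition."""
    hits = {}
    for codon in ("".join(p) for p in product("ACGT", repeat=3)):
        if codon in STOP_CODONS:
            continue
        for pos, base in enumerate(codon):
            if base in EMS_TRANSITIONS:
                mutant = codon[:pos] + EMS_TRANSITIONS[base] + codon[pos + 1:]
                if mutant in STOP_CODONS:
                    hits.setdefault(codon, set()).add(mutant)
    return hits


def count_target_codons(cds):
    """Count in-frame codons in cds that one EMS-type change can turn into a stop."""
    targets = set(ems_stop_precursors())
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3].upper() for i in range(0, usable, 3)]
    return sum(1 for c in codons if c in targets)


if __name__ == "__main__":
    for codon, stops in sorted(ems_stop_precursors().items()):
        print(codon, "->", ", ".join(sorted(stops)))
    # Hypothetical coding fragment, reading frame starting at position 0.
    print(count_target_codons("ATGCAATGGAAA"))
```

Running the counter over candidate amplicons of a gene is one way to rank regions by their density of convertible codons before designing primers.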
References
1. McCallum CM et al (2000) Targeted screening
for induced mutations. Nat Biotechnol
18:455–457
2. Greene EA et al (2003) Spectrum of chemically induced mutations from a large-scale
reverse-genetic screen in Arabidopsis. Genetics
164:731–740
3. Colbert T et al (2001) High-throughput
screening for induced point mutations. Plant
Physiol 126:480–484
4. Gady ALF et al (2009) Implementation of two
high through-put techniques in a novel application: detecting point mutations in large EMS
mutated plant populations. Plant Methods 5:13
5. Tsai H et al (2011) Discovery of rare mutations
in populations: TILLING by sequencing. Plant
Physiol 156:1257–1268
6. Botticella E et al (2011) High resolution melting analysis for the detection of EMS induced
mutations in wheat SbeIIa genes. BMC Plant
Biol 11:156
7. Bush SM, Krysan PJ (2010) iTILLING: a
personalized approach to the identification of
mutations in specialized genetic backgrounds.
Plant Physiol 154:25–35
8. Weigel D, Glazebrook J (2006) Protocol:
EMS mutagenesis of Arabidopsis seed. Cold
Spring Harb Protoc. doi:10.1101/pdb.prot4621
9. Krysan PJ (2004) Ice-cap: a high-throughput
method for capturing plant tissue samples
for genotype analysis. Plant Physiol
135:1162–1169
10. Su S et al (2011) Ice-Cap: a method for growing
Arabidopsis and tomato plants in 96-well plates
for high-throughput genotyping. J Vis Exp
57:e3280. doi:10.3791/3280
11. Clark KA, Krysan PJ (2007) Protocol: an
improved high-throughput method for
generating tissue samples in 96-well format
for plant genotyping (Ice-Cap 2.0). Plant
Methods 3:8
12. Kermekchiev MB, Tzekov A, Barnes WM
(2003) Cold-sensitive mutants of Taq DNA
polymerase provide a hot start for PCR.
Nucleic Acids Res 31:6139–6147
13. Montgomery J et al (2007) Simultaneous mutation scanning and genotyping by high-resolution
DNA melting analysis. Nat Protoc 2:59–66
14. Meyer M et al (2008) From micrograms to
picograms: quantitative PCR reduces the material demands of high-throughput sequencing.
Nucleic Acids Res 36:e5
15. Parameswaran P et al (2007) A pyrosequencing-tailored nucleotide barcode design unveils
opportunities for large-scale sample multiplexing.
Nucleic Acids Res 35:e130
16. Reed GH, Wittwer CT (2004) Sensitivity and
specificity of single-nucleotide polymorphism
scanning by high-resolution melting analysis.
Clin Chem 50:1748–1754
17. Comai L, Henikoff S (2006) TILLING: practical single-nucleotide mutation discovery. Plant
J 45:684–694
Chapter 10
Tailor-Made Mutations in Arabidopsis Using Zinc Finger
Nucleases
Yiping Qi, Colby G. Starker, Feng Zhang, Nicholas J. Baltes,
and Daniel F. Voytas
Abstract
Zinc finger nucleases (ZFNs) are proteins engineered to make site-specific double-strand breaks (DSBs) in
a DNA sequence of interest. Imprecise repair of the ZFN-induced DSBs by the nonhomologous end-joining (NHEJ) pathway results in a spectrum of mutations, such as nucleotide substitutions, insertions,
and deletions. Here we describe a method for targeted mutagenesis in Arabidopsis with ZFNs, which are
engineered by context-dependent assembly (CoDA). This ZFN-induced mutagenesis method is an alternative to other currently available gene knockout or knockdown technologies and is useful for reverse
genetic studies.
Key words Arabidopsis, ZFN, NHEJ, CoDA, Mutagenesis
1
Introduction
Over the past few decades, forward genetic approaches—such as
map-based cloning—have been used to isolate numerous
Arabidopsis genes. Arabidopsis mutants cloned by these approaches
were generated through the use of ethyl methanesulfonate (EMS),
which introduces point mutations, or fast neutrons, which often
create large deletions [1]. More recently, the analysis of gene function has shifted towards using reverse genetic approaches, which
use RNAi to knock down gene expression or take advantage of
publicly available T-DNA insertion mutant lines to analyze mutant
phenotypes [2–4]. Despite a rich collection of T-DNA insertions
across the genome of Arabidopsis, there is still a need for alternative technologies that can make mutations in genes for which no
mutants are currently available. One such technology is called
TILLING (Targeting Induced Local Lesions IN Genomes), which
introduces G/C to A/T transitions through the use of the mutagen EMS [5, 6]. Although TILLING is clearly a powerful approach,
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_10, © Springer Science+Business Media New York 2014
[Figure 1, panels a–d: a ZFN-L/ZFN-R pair, each with zinc fingers F1–F3 fused to FokI; target site drawn as top strand ATCTTCGGCCATGAAGCTGGAGGG over bottom strand TAGAAGCCGGTACTTCGACCTCCC; panel c shows the site cleaved within the spacer; panel d is labeled "Mutagenesis by imprecise NHEJ"]
Fig. 1 The ZFN-induced mutagenesis system (a) A pair of ZFNs is expressed in
plant cells. (b) Driven by a nuclear localization signal (NLS), the ZFNs move to the
nucleus and recognize the target DNA sequence. (c) FokI dimerization produces
a double-strand break (DSB) in the “spacer” of the target site. (d) Error-prone
NHEJ repair of the DSB leads to mutagenesis at the target site
it suffers from the narrow spectrum of mutations that can be recovered at a given locus in Arabidopsis.
Zinc finger nucleases (ZFNs) are hybrid proteins, each of
which contains a zinc finger DNA-binding domain at the
N-terminus and a nonspecific cleavage domain of the FokI restriction enzyme at the C-terminus [7, 8] (Fig. 1a). Zinc finger DNA-binding domains often consist of three to six zinc fingers, and each
finger recognizes a triplet of nucleotides (Fig. 1b). Importantly,
zinc finger domains can be engineered to specifically recognize
novel DNA sequences. ZFNs work in pairs because the FokI nuclease domains function as dimers [9]. Recognition of the target
DNA by a ZFN pair brings two FokI nuclease domains together to
make a DNA double-strand break (DSB) in the sequence between
the two zinc finger binding sites. This “spacer” is usually 5–7 bp in length
(Fig. 1c). For three-finger ZFNs, the DNA sequence recognized
by both the left and right binding domains defines an 18 bp target
site, which is often unique in a given genome like Arabidopsis.
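The layout of a three-finger ZFN target site, and the reason an 18 bp recognition sequence is often unique, can be illustrated with a little arithmetic. The sketch below splits a site into left half-site, spacer, and right half-site, and estimates how many times a given 18 bp sequence would be expected by chance in a random genome. Several things here are assumptions rather than statements from this chapter: the 24 bp sequence is the example drawn in Fig. 1 and is treated as a 9 + 6 + 9 bp site, the genome size of roughly 135 Mb is an approximate outside figure, and the random-sequence model ignores real base composition and repeats.

```python
def split_zfn_site(site, fingers=3, spacer_len=6):
    """Split a ZFN target into (left half-site, spacer, right half-site).

    Each finger is taken to recognize 3 bp, so each half-site is 3 * fingers bp.
    The right half-site is returned as written on the top strand; the ZFN
    that binds it reads the opposite strand.
    """
    half = 3 * fingers
    expected = 2 * half + spacer_len
    if len(site) != expected:
        raise ValueError(f"expected {expected} bp, got {len(site)} bp")
    return site[:half], site[half:half + spacer_len], site[half + spacer_len:]


def expected_chance_matches(recognized_bp, genome_bp, strands=2):
    """Expected occurrences of one specific sequence in a random genome."""
    return strands * genome_bp / 4 ** recognized_bp


if __name__ == "__main__":
    left, spacer, right = split_zfn_site("ATCTTCGGCCATGAAGCTGGAGGG")
    print(left, spacer, right)
    approx_genome_bp = 135_000_000  # approximate Arabidopsis genome size (assumption)
    print(f"4**18 = {4 ** 18}")
    print(f"Expected chance matches: "
          f"{expected_chance_matches(18, approx_genome_bp):.4f}")
```

Because the expected number of chance matches is far below one, an 18 bp site is usually unique, although real genomes contain duplicated genes and repeats that this simple model does not capture.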
As cytotoxic lesions, the DSBs created by ZFNs need to be
repaired through either the nonhomologous end-joining (NHEJ)
or homologous recombination (HR) pathways. When NHEJ is
used to repair the break, the resulting mutations are typically short
deletions, but they can also be insertions or nucleotide substitutions (Fig. 1d). Thus, a variety of targeted mutations can be
introduced into a locus through ZFN-induced DSBs. ZFNs are
rapidly becoming powerful tools for making targeted mutations in
many higher organisms including plants such as tobacco,
Arabidopsis, and maize [10–14].
One bottleneck for implementing ZFN technology is engineering site-specific zinc finger arrays (ZFAs). Over the years,
genome engineers from both industrial and academic labs have
endeavored to overcome this bottleneck. Currently, ZFNs can be
engineered with multiple platforms. ZFNs are commercially available through Sigma-Aldrich® with the brand name CompoZr®,
which is based on a proprietary platform developed by Sangamo
BioSciences [15]. ZFNs can also be engineered using publicly
available platforms developed mainly by the Zinc Finger Consortium
(http://www.zincfingers.org). The first platform made available
by the Consortium is modular assembly, which is simple but may
have a relatively low rate of success [16, 17]. The second platform
is Oligomerized Pool ENgineering (OPEN), which has a success
rate close to 80 % and has yielded many ZFNs that target genes in
diverse organisms such as Arabidopsis, tobacco, zebrafish, and
human [10, 12, 18]. However, it takes time and effort for a molecular biology lab to adopt the OPEN method, because it is technically demanding. More recently, a third platform, context-dependent
assembly (CoDA), was developed to generate ZFNs that target
multiple genes in Arabidopsis, soybean, and zebrafish [19, 20].
Although the success rate of ZFNs designed with CoDA is not as
high as with OPEN [18, 19], CoDA is much easier to implement
because it only requires standard molecular cloning techniques.
Detailed protocols describing the CoDA method have been
previously published [19, 21, 22]. These protocols can be used to
obtain the ZFN of interest. In this chapter, we describe an adaptation of the CoDA protocol in which the ZFNs are made by a simple, PCR-based approach. Importantly, we also describe methods
for introducing ZFNs in plants, inducing their expression and
recovering mutations at the target locus.
2
Materials
2.1 Engineering ZFAs with the CoDA Method
1. Plasmid 28086 (encodes a zinc finger array targeting NRF2b
[18], available from Addgene.org) (see Note 1).
2. Plasmids pCP3 and pCP4 [12].
3. pCR®8⁄GW⁄TOPO® TA Cloning Kit (Invitrogen, # K250020).
4. Cloned Pfu polymerase (Stratagene, # 600153).
5. Deoxynucleotide Solution Mix (NEB, # N0447).
6. Restriction enzyme DpnI (NEB, # R0176).
7. NEB Taq DNA Polymerase with Standard Taq Buffer (NEB,
# M0273).
8. A thermocycler.
9. 37 °C incubator with shaking.
10. 42 °C water bath.
11. Chemically competent E. coli DH5α cells.
12. QIAquick® Gel Extraction kit (Qiagen, # 28706).
13. QIAprep® Miniprep kit (Qiagen, # 27106).
14. LB medium (1 % tryptone, 0.5 % yeast extract, 1 % sodium
chloride, 1.5 % agar for solid medium).
15. S.O.C liquid medium (2 % tryptone, 0.5 % yeast extract,
10 mM sodium chloride, 2.5 mM potassium chloride, 10 mM
magnesium chloride, 10 mM magnesium sulfate, 20 mM
glucose).
16. Antibiotics: spectinomycin, kanamycin, carbenicillin, and
gentamicin.
17. Primers (see Table 1).
2.2 Construction of ZFN Expression Constructs
1. Restriction enzymes XbaI (# R0145), BamHI (# R0136), NheI
(# R0131), BglII (# R0144), and EcoRV (# R0195) (NEB).
2. Plasmid pFZ87 [12] (see Note 2).
3. Plasmid pMDC7 [23] (see Note 3).
4. T4 DNA ligase (NEB, # M0202).
5. Gateway® LR Clonase® II enzyme mix (Invitrogen, #11791).
6. Primers (see Table 1).
2.3 Screen for Arabidopsis Mutants Induced by ZFNs
1. Competent cells for electroporation of the Agrobacterium
tumefaciens strain GV3101/pMP90 [24] (see Note 4).
2. Plant growth chamber.
3. Floral dip transformation solution: 5 % (w/v) sucrose, 10 mM
MgCl2, 0.03 % (v/v) VAC-IN-STUFF (Silwet L-77) (LEHLE
SEEDS, # VIS-02).
4. Glassine Envelopes.
5. Bleach (Sun Brite®, 5.25 % sodium hypochlorite as active
ingredient).
6. Falcon® 150 × 15 mm sterile disposable polystyrene petri dish
(Becton Dickinson labware, # 1058).
7. Transgenic plant selection medium: 0.8 % agar plate containing 0.5× Murashige and Skoog with vitamins (Caisson Labs, #
MSP09), 25 μg/ml hygromycin B (Roche, # 10843555001),
50 μg/ml timentin (plantMedia, # 42010012), and 20 μM
β-estradiol (Sigma, # E2758) (see Note 5).
8. Micropore surgical tape (3M, # 1535-1).
9. 2.0-ml sterile conical screw cap tubes (Fisher, # 02-707-355).
Table 1
Primers used in this protocol
Primer name | Primer sequence | Purpose
LF2-1 | 5′-CATACCCGTACTCATACCG-3′ | Amplify “long F2”
LF2-2 | 5′-ACTGAAGTTGCGCATGCATATTCG-3′ | Amplify “long F2”
F2RH+ | 5′-NNNNNNNNNNNNNNNNNNNNNCATCTACGTACGCACACCGGC-3′ | Amplify plasmid containing new long F2
F2RH- | 5′-NNNNNNNNNNNNNNNNNNNNNGGAGAAATTTCGCATACAGATCCG-3′ | Amplify plasmid containing new long F2
F1RH | 5′-TTGCATGCGGAACTTTTCGNNNNNNNNNNNNNNNNNNNNNCATACCCGTACTCATACCGG-3′ | Contains F1RH, amplifies most of the ZFA
F3RH | 5′-TCAGGTGGGTTTTTAGGTGNNNNNNNNNNNNNNNNNNNNNACTGAAGTTGCGCATGCATA-3′ | Contains F3RH, amplifies most of the ZFA
ZFA-Fusion-1 | 5′-AGTGGTTGGTCTAGACCCGGGGAGCGCCCCTTCCAGTGTCGCATTTGCATGCGGAACTTT-3′ | Amplifies complete ZFA; ends homologous to pCP3 and pCP4
ZFA-Fusion-2 | 5′-TTCAGATTTCACTAGCTGGGATCCCCTCAGGTGGGTTTTTAGGTG-3′ | Amplifies complete ZFA; ends homologous to pCP3 and pCP4
M13F | 5′-GTTTTCCCAGTCACGACGTTGTA-3′ | Colony PCR
pDW1789-TEF | 5′-GGTCTTCAATTTCTCAAGTTTC-3′ | Colony PCR
T2A-R | 5′-GATTCTCCTCCACGTCACCGCA-3′ | Colony PCR
T2A-F | 5′-TGCGGTGACGTGGAGGAGA-3′ | Colony PCR
ZFP-R | 5′-CTATTAAAAGTTTATCTCGCCGTT-3′ | Colony PCR
N indicates any nucleotide
10. 0.05 % sterile agar medium.
11. 1/8 in. eclipse steel balls (Abbott Ball Co.).
12. Liquid nitrogen.
13. Paint shaker.
14. Plant DNA extraction buffer or CTAB buffer (2 % hexadecyltrimethyl-ammonium bromide, 100 mM Tris, 20 mM EDTA,
and 1.4 M sodium chloride).
15. Chloroform.
16. TOPO® TA Cloning® kit for subcloning with TOP10 E. coli
(Invitrogen, # K4500).
17. Primers to amplify the genomic DNA region that spans the
ZFN target site.
18. Pots and soil.
Yiping Qi et al.
3 Methods
3.1 Assembly of CoDA ZFAs Using a Long Oligo-Based Approach (See Note 6)
In CoDA, N-terminal-end fingers (F1 units) and C-terminal-end
fingers (F3 units) of three-finger arrays have been identified that
work well with common middle fingers (F2 units) [19]. A large
archive of 319 F1 units and 344 F3 units has been engineered to
work well with one of 18 fixed F2 units. Both amino acid and
nucleotide sequences for these units are publicly available. Thus,
using this information, one can make ZFNs through multiple
approaches, such as the modular assembly [25] or direct DNA synthesis. Recently, an oligonucleotide-based overlapping PCR
approach has been described for rapid assembly of CoDA ZFAs
[22]. In this approach, each three-finger ZFA can be made by
extension PCR using 8 different oligonucleotides.
Here we describe another PCR-based approach to assemble
CoDA ZFAs. This approach uses an existing OPEN-derived ZFA
that is used to clone a fragment encoding an F2 unit and partial
sequences for F1 and F3 units (this fragment is called a long F2
sequence) (Fig. 2a). The cloned long F2 sequence is the starting
material for assembling different ZFAs, which can be created through
three rounds of PCR. The first round of PCR creates the desired F2
unit through targeted mutagenesis (Fig. 2b). The second round of
PCR adds recognition helices to the F1 and F3 units, thereby making a nearly complete three-finger array (Fig. 2c). By using fixed
primers, the third round of PCR produces a full-length ZFA, which
is ready to be cloned into an expression vector through recombination in E. coli (Fig. 2d). Since CoDA only provides 18 different F2
units for three-finger ZFAs, it is practical to create an archive of 18
long F2 sequences. Once these 18 long F2 sequences are generated,
many custom ZFAs can be made quickly and cost-effectively, because only two rounds of PCR and
only two new oligonucleotides are needed for a given ZFA.
Before starting, one should make sure that CoDA-enabled
ZFN sites are present in the gene of interest. This is accomplished
using the ZiFiT Web server (http://zifit.partners.org) [26].
Otherwise, other platforms for obtaining ZFAs should be
employed, which are out of the scope of this protocol.
3.1.1 Clone a Partial ZFA Sequence with the Desired F2 Finger
1. PCR-amplify the long F2 sequence with primers LF2-1 and
LF2-2 (see Table 1), using plasmid 28086 as a template. The
resulting PCR product is 147 bp (Fig. 2a).
2. Run a 2 % agarose gel and gel-purify the PCR product with
QIAquick® Gel Extraction kit; elute the DNA with 30 μl sterile water.
3. Clone the purified PCR product into the pCR8® vector
according to the manufacturer’s instructions (see Note 7).
Perform DNA sequencing to confirm the resulting “pCR8-long F2” construct (Fig. 2b).
Fig. 2 Assembly of three-finger CoDA ZFAs using a long oligo-based approach. (a) A middle portion of the ZFA is first amplified from the plasmid 28086 and (b) cloned into the pCR8 vector. Then, mutagenic PCR is conducted with primers F2RH+ and F2RH−. (c) The mutagenized PCR fragment is introduced into a plasmid by recombination in E. coli, leading to a clone with the desired F2. Final assembly of a full-length ZFA requires two rounds of PCR, first with F1RH and F3RH and then with ZFA-Fusion-1 and ZFA-Fusion-2. This results in a DNA fragment encoding the entire ZFA with homology at both ends (depicted as a filled grid) to pCP3 and pCP4. (d) The homology allows insertion of the ZFA into both vectors through recombination in E. coli
4. Design primers (designated as F2RH+ and F2RH-, see Table 1)
that encode the F2 recognition helix of choice on the 5′ end,
such that they are complementary to the sequence that will
specify the new F2RH. Amplify the entire plasmid containing
the F2 recognition helix sequence using 1 ng plasmid “pCR8-long F2” as template with cloned Pfu polymerase (see Table 2a
for conditions) (Fig. 2b).
5. Digest 5 μl of the PCR reaction with 0.5 μl restriction enzyme
DpnI in a 25 μl reaction at 37 °C overnight (see Note 8).
6. Transform 50 μl of E. coli DH5α chemically competent
cells with 5 μl of digestion product using a heat shock at 42 °C
for 45 s.
7. Recover transformed E. coli cells with 200 μl S.O.C liquid
medium and agitate at 37 °C for 1 h.
Table 2
PCR conditions for assembly of ZFAs
PCR regime | Initial denaturing | Cycling: denature | Cycling: anneal | Cycling: extend | Cycles | Final extension
a | 1 min at 94 °C | 0.5 min at 94 °C | 0.5 min at 55 °C | 8 min at 72 °C | 30 | 10 min at 72 °C
b | 5 min at 94 °C | 0.5 min at 94 °C | 0.5 min at 50 °C | 1.5 min at 72 °C | 10 | 7 min at 72 °C
c | 5 min at 94 °C | 0.5 min at 94 °C | 0.5 min at 56 °C | 1.5 min at 72 °C | 10 |
c (continued) | | 0.5 min at 94 °C | 0.5 min at 64 °C | 1.5 min at 72 °C | 20 | 7 min at 72 °C
8. Spread 100 μl of transformed cells onto LB plates with
100 μg/ml spectinomycin. Incubate at 37 °C overnight.
9. Miniprep 2 or 3 clones using the QIAprep® Miniprep kit and
confirm the construct “pCR8-long F2-modified” through
DNA sequencing (Fig. 2c).
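The primer design in step 4 can be illustrated with a short script that fills the 21-nt N placeholder of the F2RH+ and F2RH- templates from Table 1. This is only a minimal sketch: the function name fill_placeholder and the example 21-nt sequence are hypothetical, and the script does not decide which strand or orientation of the recognition helix coding sequence belongs in each primer. That must be worked out by the user from the desired F2 recognition helix.

```python
import re

# Primer templates copied from Table 1 (N = any nucleotide).
F2RH_PLUS = "N" * 21 + "CATCTACGTACGCACACCGGC"
F2RH_MINUS = "N" * 21 + "GGAGAAATTTCGCATACAGATCCG"


def fill_placeholder(template: str, helix_nt: str) -> str:
    """Replace the single run of N in a primer template with helix_nt."""
    helix_nt = helix_nt.upper()
    runs = re.findall(r"N+", template)
    if len(runs) != 1:
        raise ValueError("template must contain exactly one run of N")
    if len(helix_nt) != len(runs[0]):
        raise ValueError(f"expected {len(runs[0])} nt, got {len(helix_nt)}")
    if set(helix_nt) - set("ACGT"):
        raise ValueError("helix_nt must contain only A, C, G, T")
    return template.replace(runs[0], helix_nt, 1)


# Hypothetical 21-nt sequence, for illustration only.
example = "ACGTACGTACGTACGTACGTA"
primer = fill_placeholder(F2RH_PLUS, example)
print(primer, len(primer))
```

The length checks guard against the most likely ordering mistake, a helix sequence that is not exactly 21 nt (seven codons).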
3.1.2 Assembly and Cloning of the Entire 3-Finger ZFA
1. Amplify the nearly full-length ZFA using the plasmid “pCR8-long F2-modified” as a template with primers F1RH and
F3RH (see Table 1) in a 50 μl PCR reaction (Fig. 2c) (see
Table 2b for conditions).
2. After the amplification, add 0.625 μl each of 10 μM ZFA-Fusion-1 and ZFA-Fusion-2 primers (see Table 1) and continue the PCR reaction (see Table 2c for conditions). The resulting
PCR product will have terminal sequences identical to the
pCP3 and pCP4 yeast expression vectors (see Note 9).
3. Digest 1 μg pCP3 and pCP4 plasmids with BamHI and XbaI
in a 50 μl reaction at 37 °C for 4 h.
4. Run the digestion product on a 1 % agarose gel. Gel-purify the
linearized plasmids and elute the DNA with 30 μl sterile water.
5. Co-transform 75 ng of linearized pCP3 or pCP4 plasmid
backbone with 2 μl of the PCR product into 50 μl E. coli
DH5α chemically competent cells. Transformation is carried
out by a heat shock at 42 °C for 45 s (see Note 10).
6. Recover transformed E. coli cells with 200 μl S.O.C liquid
medium and agitate at 37 °C for 1 h. Then spread 100 μl of
transformed cells onto LB plates with 50 μg/ml carbenicillin.
Incubate at 37 °C overnight.
7. Screen for correct clones by colony PCR using primers M13F
and pDW1789-TEF [10] (see Table 1). Correct clones will
give PCR products of ~1.5 kb.
8. Miniprep plasmids and confirm that the correct clones, pCP3-left-ZFA and pCP4-right ZFA, have been obtained by DNA
sequencing.
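When planning the day around these reactions, it helps to know how long each regime in Table 2 holds at temperature. The sketch below encodes the three regimes and sums the programmed hold times. The dictionary layout and the function name hold_time_min are illustrative choices, and thermocycler ramp times are ignored, so real run times will be longer.

```python
# Table 2 regimes, times in minutes. Each cycling block is
# (denature, anneal, extend, number of cycles).
REGIMES = {
    "a": {"initial": 1.0, "blocks": [(0.5, 0.5, 8.0, 30)], "final": 10.0},
    "b": {"initial": 5.0, "blocks": [(0.5, 0.5, 1.5, 10)], "final": 7.0},
    "c": {
        "initial": 5.0,
        "blocks": [(0.5, 0.5, 1.5, 10), (0.5, 0.5, 1.5, 20)],
        "final": 7.0,
    },
}


def hold_time_min(regime):
    """Sum programmed hold times in minutes (ramping not included)."""
    cycling = sum((d + a + e) * n for d, a, e, n in regime["blocks"])
    return regime["initial"] + cycling + regime["final"]


for name, regime in REGIMES.items():
    print(name, hold_time_min(regime), "min")
```

Regime a is by far the longest because its 8 min extension, used to amplify the whole plasmid, is repeated 30 times.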
3.2 Construction of ZFN Expression Constructs
1. Digest 1 μg each of pCP3-left-ZFA, pCP4-right ZFA, and
pFZ87 with 1 μl each of restriction enzymes XbaI and BamHI
in a 50 μl volume at 37 °C for 4 h.
2. Run digestion products on a 1 % agarose gel. The digested
ZFAs from pCP3-left-ZFA and pCP4-right ZFA are 270 bp;
the digested pFZ87 vector is 4101 bp.
3. Gel-purify both digested ZFAs and the digested pFZ87 vector, and elute each DNA with
30 μl sterile water.
4. Perform a 10 μl ligation reaction to insert the left ZFA (~20 ng)
into pFZ87 (~20 ng) using NEB quick ligase at room temperature for 1 h.
5. Transform 50 μl of chemically competent E. coli DH5α cells
with 5 μl of the ligation reaction using a heat shock at 42 °C
for 45 s.
6. Add 200 μl S.O.C liquid medium to the transformed E. coli
cells and incubate with agitation at 37 °C for 1 h. Spread
150 μl of the recovered cells onto LB plates with 50 μg/ml
kanamycin and incubate at 37 °C overnight.
7. Identify correct clones by performing colony PCR with primers M13F and T2A-R (see Table 1).
8. Culture two PCR-confirmed clones with the left ZFA insertion overnight and miniprep the plasmids (named as pFZ87_L) the following day.
9. Digest 1 μg of pFZ87_L plasmid with 1 μl each of NheI and
BglII restriction enzymes in a 50 μl reaction volume at 37 °C
for 4 h.
10. Run the digested product on a 1 % agarose gel and gel-purify
the linearized vector; elute with 30 μl sterile water.
11. Perform a ligation in 10 μl to insert the previously purified
right ZFA (~ 20 ng) into the pFZ87_L vector (~20 ng) using
quick T4 ligase at room temperature for 1 h.
12. Transform 5 μl of the ligation reaction into E. coli as described
above (steps 5 and 6).
13. Perform colony PCR using primers T2A-F and ZFP-R to confirm the insertion of the right ZFA sequence.
14. Culture two PCR-confirmed clones for miniprep of the plasmids, namely, pFZ87_L + R. Sequence pFZ87_L + R plasmid
to confirm the whole ZFN-left-T2A-ZFN-right sequence.
15. Linearize the sequence-confirmed pFZ87_L + R plasmid by
digesting 2 μg of plasmid DNA in a 50 μl reaction with EcoRV
at 37 °C for 2 h (see Note 11).
16. Run the digested product on a 1 % agarose gel and gel-purify
the linearized entry vector (4606 bp).
Fig. 3 Screen for germline-transmitted mutations. (a) T1 transgenic seedlings are selected on medium containing hygromycin. Estradiol is included in the medium to induce ZFN expression. (b) Some transgenic seedlings are used for testing ZFN activity in somatic cells whereas (c) the remaining transgenic seedlings are transferred to soil to obtain the T2 generation. (d) Individual T2 plants are then screened for germline-transmitted mutations, and plants are genotyped as being homozygous (HM), heterozygous (HT), or wild type (WT). (e) The mutations are ultimately characterized by DNA sequencing. Note here that underlined nucleotides represent the target sequence for both ZFN monomers
Sequences shown in panel (e):
GTATCTTCGGCCATGAAGCTGGAGGGTA (wild type)
GTATCTTCGGCCAaaGAAGCTGGAGGGTA (adh1-4)
GTATCTTCGGCCA:::AGCTGGAGGGTA (adh1-8)
GTATCTTCGGCCATatGAAGCTGGAGGGTA (adh1-16)
17. Conduct an LR reaction to move left-ZFN-T2A-right-ZFN
from the pFZ87_L+R entry clone to the pMDC7 destination
vector. Use Gateway® LR Clonase® II enzyme mix according
to the manufacturer’s instructions.
18. Confirm the correct pMDC7_L + R constructs by restriction
digestion and/or DNA sequencing.
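For the ligation in step 4, equal masses of insert and vector correspond to very different molar amounts, because the ZFA fragment (270 bp) is much shorter than the digested pFZ87 vector (4101 bp). The sketch below estimates the insert:vector molar ratio. The average mass of about 650 g/mol per base pair is a commonly used approximation, not a value from this protocol, and the function name pmol_dsdna is hypothetical.

```python
AVG_BP_MASS = 650.0  # approximate g/mol per base pair of dsDNA (assumption)


def pmol_dsdna(ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) to picomoles."""
    return ng * 1000.0 / (length_bp * AVG_BP_MASS)


# Amounts and fragment sizes from steps 2 and 4.
insert_pmol = pmol_dsdna(20, 270)    # left ZFA fragment
vector_pmol = pmol_dsdna(20, 4101)   # digested pFZ87
ratio = insert_pmol / vector_pmol

print(f"insert: {insert_pmol:.3f} pmol")
print(f"vector: {vector_pmol:.4f} pmol")
print(f"insert:vector molar ratio is about {ratio:.1f}:1")
```

Because the same per-base-pair mass is applied to both fragments, the ratio reduces to 4101/270, roughly a 15-fold molar excess of insert.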
3.3 Screen for Arabidopsis Mutants Induced by ZFNs
The major steps of the procedure are shown in Fig. 3, where the
well-characterized ZFNs that target the Arabidopsis ADH1 gene
are used as an example (see Note 12).
3.3.1 Arabidopsis Transformation
1. Transform 50 μl of competent Agrobacterium tumefaciens
cells (strain GV3101/pMP90) with 0.5 ng pMDC7_L + R
vector by electroporation with the E. coli Pulser cell-porator.
2. Add to the transformed Agrobacterium cells 200 μl LB liquid
medium and agitate at 28 °C for 1 h.
3. Spread 150 μl of the transformed Agrobacterium cells onto LB
plates with 50 μg/ml kanamycin and 50 μg/ml gentamicin.
Incubate the plates at 28 °C for 2 days.
4. Pick a single colony of transformed Agrobacterium and culture in
5 ml LB liquid medium with 50 μg/ml kanamycin and 50 μg/ml
gentamicin at 28 °C with shaking at 220 rpm overnight.
5. Pour the overnight Agrobacterium culture into 200 ml LB liquid
medium with 50 μg/ml kanamycin and 50 μg/ml gentamicin.
Shake at 28 °C and 220 rpm overnight.
6. Collect cultured Agrobacterium cells by centrifugation at
6,000 × g for 10 min at 28 °C.
7. Discard the supernatant and resuspend the bacteria pellet with
400 ml Arabidopsis transformation buffer.
8. Transform Arabidopsis plants with the transformed A. tumefaciens strain using the floral dip method [27]. Briefly, immerse
flowers of Arabidopsis plants in the Arabidopsis transformation
buffer and place in a dark and humid environment overnight.
9. Keep watering plants for 3 weeks after transformation. Then
stop watering and let seeds mature and dry (see Note 13).
10. Collect seeds and dry them for at least 2 weeks before screening
for transgenic plants.
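The 400 ml of transformation buffer used in step 7 follows the floral dip recipe in Subheading 2.3 (5 % w/v sucrose, 10 mM MgCl2, 0.03 % v/v Silwet L-77). The sketch below scales that recipe to any volume. It assumes MgCl2 is added from a 1 M stock, which is not specified in this protocol, and the function name floral_dip_recipe is hypothetical.

```python
def floral_dip_recipe(volume_ml, mgcl2_stock_mM=1000.0):
    """Return component amounts for the floral dip solution.

    Recipe from Subheading 2.3: 5 % (w/v) sucrose, 10 mM MgCl2,
    0.03 % (v/v) Silwet L-77. The 1 M MgCl2 stock is an assumption.
    """
    return {
        "sucrose_g": 5.0 / 100.0 * volume_ml,
        "mgcl2_stock_ml": 10.0 * volume_ml / mgcl2_stock_mM,
        "silwet_ul": 0.03 / 100.0 * volume_ml * 1000.0,
    }


recipe = floral_dip_recipe(400)
for component, amount in recipe.items():
    print(component, round(amount, 3))
```

For 400 ml this works out to 20 g sucrose, 4 ml of 1 M MgCl2, and 120 μl Silwet L-77, made up to volume with water.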
3.3.2 Screen for T1 Transgenic Plants and Induce ZFN Expression
1. Sterilize 0.2 g Arabidopsis seeds with 30 ml 50 % bleach in a
50-ml conical centrifuge tube by mixing for 10 min.
2. Wash sterilized seeds four times with 40 ml sterile water each
time. To wash, centrifuge the tube at 500 × g for
1 min to pellet the seeds. Then, resuspend the seeds in water
by mixing.
3. Resuspend the seeds with 20 ml 0.05 % sterile agar medium.
Keep the suspended seeds at 4 °C in the dark for 4 days.
4. Spread 5 ml of the seed suspension onto 150 mm × 15 mm
petri dishes containing transgenic plant selection medium.
Seal the petri dishes with surgical tape.
5. Place the seed-containing petri dishes in the growth chamber
at 22 °C with 24 h light.
6. After 1 week, collect six transgenic seedlings into a 2-ml screw
cap tube with a metal bead inside. Prepare two samples with 12
plants total. Do the same with the wild-type control as needed.
Keep the remainder of the transgenic plants in the chamber.
7. Freeze the tubes with liquid nitrogen and pulverize the samples by shaking in a paint shaker for 2 min (see Note 14).
8. Add 500 μl plant DNA extraction buffer. Mix well and incubate in a 65 °C water bath for 15 min.
9. Add 500 μl chloroform and mix well; centrifuge at 15,000 × g
for 1 min.
10. Transfer 500 μl supernatant to clean 1.7 ml microfuge tubes
and add 1 ml ethanol. Mix well and centrifuge at 15,000 × g
for 1 min.
11. Remove the supernatant and add 1 ml 75 % ethanol to wash
the pellet. Mix well and centrifuge at 15,000 × g for 1 min.
12. Remove the supernatant and dry the pellet for about 10 min.
Then dissolve the DNA with 100 μl sterile water. Store plant
genomic DNA at −20 °C.
3.3.3 Testing ZFN Activity in Somatic Cells of T1 Seedlings
1. Design and synthesize PCR primers that amplify the genomic
DNA region spanning the ZFN target site. Also choose
a restriction enzyme whose recognition sequence lies within, or very close
to, the spacer sequence of the ZFN target site.
Make sure the PCR product contains no, or very few, additional sites for the chosen
restriction enzyme (see Note 15).
2. Perform PCR in a 25 μl reaction volume with the designed
primers and the DNA from the bulked T1 seedlings as a template. Include a wild-type DNA sample as a control.
3. Digest 10 μl of each PCR product in a 40 μl reaction volume
with the chosen restriction enzyme overnight.
4. Run a 2 % agarose gel and check for restriction enzyme
resistant bands, which indicate the presence of ZFN-induced
mutations. Such digestion-resistant bands indicate that the
ZFNs not only are active but also have a high in vivo activity
(see Note 16).
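The logic of this assay can be checked in silico before ordering primers: an allele gives a digestion-resistant band only if the mutation removes the recognition site. The sketch below applies this to the sequences shown in Fig. 3e. The site CATG is used purely for illustration because it occurs once in the wild-type sequence shown. The enzyme actually used for ADH1 is not stated in this protocol, and the function names are hypothetical.

```python
def site_positions(seq, site):
    """Return 0-based start positions of site in seq (case-insensitive)."""
    seq, site = seq.upper(), site.upper()
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


def digestion_resistant(seq, site):
    """True if the allele has no copy of the site (':' marks deleted bases)."""
    cleaned = seq.replace(":", "")
    return len(site_positions(cleaned, site)) == 0


# Sequences copied from Fig. 3e.
ALLELES = {
    "wild type": "GTATCTTCGGCCATGAAGCTGGAGGGTA",
    "adh1-4": "GTATCTTCGGCCAaaGAAGCTGGAGGGTA",
    "adh1-8": "GTATCTTCGGCCA:::AGCTGGAGGGTA",
    "adh1-16": "GTATCTTCGGCCATatGAAGCTGGAGGGTA",
}

# Illustrative site only. It is palindromic, so scanning one strand is
# enough; a non-palindromic site would also need its reverse complement.
SITE = "CATG"

for name, seq in ALLELES.items():
    status = "resistant" if digestion_resistant(seq, SITE) else "cut"
    print(name, status)
```

The same scan run over the full PCR amplicon also shows whether additional copies of the site are present, which step 1 advises avoiding.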
3.3.4 Screen for ZFN-Induced Mutants
1. Transfer 7- to 10-day-old T1 seedlings from transgenic plant
selection medium to soil (see Note 17).
2. Maintain the plants until mature (see Note 18). Collect T2
seeds from individual T1 plants and dry seeds in Glassine
Envelopes.
3. Plant seeds from ten T2 populations derived from ten individual T1 parents in potting mix. Use ~100 seeds for each T2
population.
4. After 3 weeks, collect one leaf from each plant and extract
genomic DNA as described in Subheading 3.3.2.
5. Perform the screen involving PCR and digestion as described
in Subheading 3.3.3, to identify mutant plants (see Note 19).
4 Notes
1. Any plasmid containing ZFAs generated by the OPEN method
should work for this protocol. The OPEN-derived ZFA merely
serves as a template for PCR. Here we suggest plasmid #28086
because it has worked well in our hands.
2. Plasmid pFZ87 is a derivative of the Gateway® entry vector
pENTR/D-TOPO® that contains a FokI-T2A-FokI coding
sequence where both FokI nuclease domains are obligate heterodimers [28]. Plasmid pFZ87 is available from the Voytas
lab upon request. T2A is the insect virus Thosea asigna “self-cleaving” 2A peptide which allows production of two proteins
from one mRNA through a translational skipping mechanism
[29]. Thus, inclusion of the T2A sequence allows efficient
expression of the left and right ZFNs from one transcript.
3. Plasmid pMDC7 is a Gateway® destination T-DNA binary
vector for estrogen-inducible expression in plants which can be
ordered from ABRC Stock Center (http://www.arabidopsis.
org). The use of the estrogen-inducible promoter minimizes
potential cytotoxicity of ZFNs when compared to constitutive
expression. Also note that pMDC7 should be propagated in E.
coli DB3.1 because it contains the CcdB gene which is a negative selectable marker. The toxin encoded by the CcdB gene
inhibits growth of normal E. coli strains such as DH5α, but not
DB3.1 due to presence of the antitoxin in this strain.
4. Agrobacterium tumefaciens strain GV3101/pMP90 is a common
lab strain which is available from the Voytas lab upon request.
5. Hygromycin B is the marker in the T-DNA vector pMDC7 for
transgenic plant selection. Timentin is an antibiotic for killing
Agrobacterium, which is often difficult to remove completely
from seeds by standard surface sterilization procedures. Having β-estradiol in the medium allows for transcriptional induction of the ZFNs.
6. A three-finger ZFA coding sequence is about 270 bp, so it is
always an option to obtain a ZFA through direct DNA synthesis. This approach is recommended by the authors of the original CoDA protocol. We recommend choosing more than one
ZFN site for a given gene, if possible, because in our experience, only about 50 % of the CoDA-derived ZFNs are functional. The method described here will be much more
cost-effective compared to synthesizing ZFAs if several ZFAs
need to be assembled for testing.
7. Any cloning vector of relatively small size (<4.5 kb) will work,
as long as the entire vector can easily be PCR-amplified.
8. DpnI recognizes and digests only methylated DNA, but not
unmethylated PCR products. In this way, contamination with
plasmid “pCR8-long F2” in the cloning steps can be avoided.
9. In this protocol, ZFAs are first cloned into yeast expression
vectors pCP3 and pCP4, because a yeast assay is often performed in our lab to test ZFN activity before using them in
higher eukaryotes. Both pCP3 and pCP4 vectors are available
from the Voytas lab upon request. If a yeast assay is not performed, one can skip all of the remaining steps and go directly
to Subheading 3.2, where the PCR products (instead of pCP3
or pCP4) that contain the left and right ZFAs will be digested
with XbaI and BamHI.
10. This step uses homologous recombination in E. coli to incorporate both the left ZFA and right ZFA into pCP3 and pCP4
vectors. We find this approach simpler than performing regular ligation reactions.
11. The reason for using a linearized entry clone is that both the
pFZ87 entry clone and pMDC7 destination clone use a kanamycin selectable marker for E. coli. An alternative strategy
would be to clone the left-ZFN-T2A-right-ZFN sequence
into an entry clone that uses other antibiotic markers. In this
way, linearization of the entry clone becomes unnecessary.
12. The ZFNs that target the Arabidopsis ADH1 gene were made
with the OPEN platform [12]. However, the plant mutant
screening procedure described here is independent of the
ZFN engineering platform.
13. The timeline for growing plants to seed may vary in different
labs depending on the growth conditions.
14. We use a paint shaker to pulverize plant tissue and CTAB buffer for plant genomic DNA extraction. The advantage of using
a paint shaker to pulverize samples all at once becomes obvious
if there are multiple samples to be handled, especially when
performing screens for germline-transmitted mutations. Other
alternative plant DNA extraction methods should also work.
15. It is important to have a relatively unique restriction enzyme
recognition site right in the middle of the spacer where DSBs
are induced by ZFNs and mutations are made. Such mutations
are mainly point mutations, small deletions, and insertions, which
will destroy the restriction enzyme recognition site. However,
if there is no suitable restriction enzyme to use, one should use
other detection methods such as the Surveyor assay [30] or
high-throughput DNA sequencing.
16. It is possible that no clear evidence for mutagenesis will be
obtained for a given pair of ZFNs due to their low activity. If
this is the case, an enrichment PCR procedure can be used to
detect ZFN-induced mutations as illustrated in Fig. 4.
Alternatively, ZFN-induced mutations can be detected and
quantified by deep sequencing [31].
Fig. 4 Enrichment PCR to detect ZFN-induced mutations. (a) Both ZFN-treated (left) and untreated (right) cells are depicted in parallel. NHEJ-mediated mutations are only present in the ZFN-treated cells, and mutations are denoted by a black dot. (b) Digestion of genomic DNA by a restriction enzyme whose site is in close proximity to or overlaps the ZFN cut site. Some ZFN-induced mutations will destroy such a restriction enzyme site, preventing the DNA from being cleaved (see the black dot). (c) PCR amplification with primers flanking the ZFN site will enrich for ZFN-mutated DNA. In ZFN-untreated cells, there will be some or no PCR product depending on the completeness of the restriction digestion reaction. (d) The ZFN-induced mutations can be easily detected as an uncut band by restriction digestion of the PCR product with the same restriction enzyme
17. In general, the more T1 transgenic founder lines that are followed, the better are the chances to recover heritable mutants
in T2 progeny. Forty T1 transgenic lines, recommended in
this study, have been used as the population size in our laboratory for the initial screen of germline-transmitted mutants.
For ZFNs with strong cleavage activities, as revealed by
somatic mutagenesis assays, several independent heritable
mutants have been successfully identified from the progeny of
those 40 T1 plants. If the activity of the ZFNs is weak, more
T1 plants are recommended to be screened.
18. We have tried spraying 20 μM estradiol onto the transgenic
plants after they were transferred from MS plates to soil. We
think such an additional estradiol treatment may help enhance
mutagenesis frequency.
19. It is not unusual to recover biallelic mutations when the ZFNs
are highly active [12]. If the ZFNs are not very active, as
revealed by their somatic mutagenesis frequencies in
Subheading 3.3.3, we recommend a strategy in which multiple plants (such as 20) are pooled in a single sample. Mutations
are then detected using the enrichment PCR procedure illustrated in Fig. 4.
Acknowledgments
This work is supported by grants from the National Science
Foundation to D.F.V. (DBI 0923827 and MCB 0209818).
References
1. Alonso JM, Ecker JR (2006) Moving forward
in reverse: genetic technologies to enable
genome-wide phenomic screens in Arabidopsis.
Nat Rev Genet 7:524–536
2. Alonso JM et al (2003) Genome-wide insertional mutagenesis of Arabidopsis thaliana.
Science 301:653–657
3. Sessions A et al (2002) A high-throughput
Arabidopsis reverse genetics system. Plant Cell
14:2985–2994
4. Woody ST et al (2007) The WiscDsLox
T-DNA collection: an Arabidopsis community
resource generated by using an improved
high-throughput T-DNA sequencing pipeline.
J Plant Res 120:157–165
5. McCallum CM et al (2000) Targeted screening for induced mutations. Nat Biotechnol
18:455–457
6. Bush SM, Krysan PJ (2011) iTILLING: a personalized approach to the identification of
induced mutations in Arabidopsis. Plant
Physiol 154:25–35
7. Kim YG, Cha J, Chandrasegaran S (1996)
Hybrid restriction enzymes: zinc finger fusions
to Fok I cleavage domain. Proc Natl Acad Sci
U S A 93:1156–1160
8. Bibikova M et al (2001) Stimulation of homologous recombination through targeted cleavage by chimeric nucleases. Mol Cell Biol
21:289–297
9. Bitinaite J et al (1998) FokI dimerization is
required for DNA cleavage. Proc Natl Acad Sci
U S A 95:10570–10575
10. Townsend JA et al (2009) High-frequency
modification of plant genes using engineered
zinc-finger nucleases. Nature 459:442–445
11. Carroll D (2011) Genome engineering with
zinc-finger nucleases. Genetics 188:773–782
12. Zhang F et al (2010) High frequency targeted
mutagenesis in Arabidopsis thaliana using zinc
finger nucleases. Proc Natl Acad Sci U S A
107:12028–12033
13. Shukla VK et al (2009) Precise genome modification in the crop species Zea mays using
zinc-finger nucleases. Nature 459:437–441
14. Osakabe K, Osakabe Y, Toki S (2010) Site-directed mutagenesis in Arabidopsis using
custom-designed zinc finger nucleases. Proc
Natl Acad Sci U S A 107:12034–12039
15. Doyon Y et al (2008) Heritable targeted gene
disruption in zebrafish using designed zinc-finger nucleases. Nat Biotechnol 26:702–708
16. Kim S et al (2011) Preassembled zinc-finger
arrays for rapid construction of ZFNs. Nat
Methods 8:7
17. Ramirez CL et al (2008) Unexpected failure
rates for modular assembly of engineered zinc
fingers. Nat Methods 5:374–375
18. Maeder ML et al (2008) Rapid “open-source”
engineering of customized zinc-finger nucleases for highly efficient gene modification. Mol
Cell 31:294–301
19. Sander JD et al (2011) Selection-free zinc-finger-nuclease engineering by context-dependent assembly (CoDA). Nat Methods
8:67–69
20. Curtin SJ et al (2011) Targeted mutagenesis of
duplicated genes in soybean with zinc-finger
nucleases. Plant Physiol 156:466–473
21. Sander JD, Maeder ML, Joung JK (2011)
Engineering designer nucleases with customized cleavage specificities. Curr Protoc Mol
Biol Chapter 12, Unit 12.13. doi:10.1002/0471142727.mb1213s96
22. Osborn MJ et al (2011) Synthetic zinc finger
nuclease design and rapid assembly. Hum
Gene Ther 22:1155–1165
23. Curtis MD, Grossniklaus U (2003) A gateway
cloning vector set for high-throughput functional analysis of genes in planta. Plant Physiol
133:462–469
24. Koncz C et al (1989) High-frequency T-DNA-mediated gene tagging in plants. Proc Natl
Acad Sci U S A 86:8467–8471
25. Wright DA et al (2006) Standardized reagents
and protocols for engineering zinc finger
nucleases by modular assembly. Nat Protoc
1:1637–1652
26. Sander JD et al (2010) ZiFiT (zinc finger targeter): an updated zinc finger engineering
tool. Nucleic Acids Res 38:W462–W468
27. Clough SJ, Bent AF (1998) Floral dip: a simplified method for Agrobacterium-mediated
transformation of Arabidopsis thaliana. Plant J
16:735–743
28. Miller JC et al (2007) An improved zinc-finger nuclease architecture for highly specific
genome editing. Nat Biotechnol 25:778–785
29. Szymczak AL et al (2004) Correction of multigene deficiency in vivo using a single “self-cleaving” 2A peptide-based retroviral vector.
Nat Biotechnol 22:589–594
30. Guschin DY et al (2010) A rapid and general
assay for monitoring endogenous gene modification. Methods Mol Biol 649:247–256
31. Herrmann F et al (2011) p53 Gene repair with
zinc finger nucleases optimised by yeast
1-hybrid and validated by Solexa sequencing.
PLoS One 6:e20913
Chapter 11
The Use of Artificial MicroRNA Technology to Control Gene
Expression in Arabidopsis thaliana
Andrew L. Eamens, Marcus McHale, and Peter M. Waterhouse
Abstract
In plants, double-stranded RNA (dsRNA) is an effective trigger of RNA silencing, and several classes of
endogenous small RNA (sRNA), processed from dsRNA substrates by DICER-like (DCL) endonucleases,
are essential in controlling gene expression. One such sRNA class, the microRNAs (miRNAs), control the
expression of closely related genes to regulate all aspects of plant development, including the determination of leaf shape, leaf polarity, flowering time, and floral identity. A single miRNA sRNA silencing signal
is processed from a long precursor transcript of nonprotein-coding RNA, termed the primary miRNA
(pri-miRNA). A region of the pri-miRNA is partially self-complementary allowing the transcript to fold
back onto itself to form a stem–loop structure of imperfectly dsRNA. Artificial miRNA (amiRNA) technology
uses endogenous pri-miRNAs, in which the miRNA and miRNA* (passenger strand of the miRNA duplex)
sequences have been replaced with corresponding amiRNA/amiRNA* sequences that direct highly
efficient RNA silencing of the targeted gene. Here, we describe the rules for amiRNA design, as well as
outline the PCR and bacterial cloning procedures involved in the construction of an amiRNA plant expression
vector to control target gene expression in Arabidopsis thaliana.
Key words miRNA, amiRNA, RNA silencing, Plant expression vector, Target gene expression,
Arabidopsis
1 Introduction
The genome of the model dicotyledonous plant species Arabidopsis
thaliana (Arabidopsis) encodes several classes of highly abundant
small RNA (sRNA), 20–30 nucleotides (nt) in length [1–4]. These
small, single-stranded RNAs, through various protein-mediated
RNA–RNA and RNA–DNA interactions, regulate gene expression
in a highly sequence-specific manner in a diverse array of biological
processes, including all aspects of plant development, adaptation to
stress, and defense against transposon replication and invading
pathogens [5–8].
Small RNAs can direct their mechanism of RNA silencing at
either the transcriptional or posttranscriptional level of gene expression, and sRNAs functioning posttranscriptionally are divided into
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_11, © Springer Science+Business Media New York 2014
two distinct classes: small interfering RNAs (siRNAs) and microRNAs
(miRNAs), depending on their mode of biogenesis [9]. Both
classes of sRNA are processed from double-stranded RNA (dsRNA),
an effective trigger of RNA silencing, by members of the DICER-like (DCL) family of endonucleases [10–12]. DCL cleavage of perfectly dsRNA derived from the transcription of template RNAs,
repetitive DNA elements, transposons, natural antisense gene pairs,
invading viruses, or introduced transgene-encoded hairpin RNAs
(hpRNAs) generates various species of siRNA [3, 12–14]. miRNAs, on the other hand, result from DCL cleavage of dsRNA
stem–loop structures of self-complementary regions of nonproteincoding RNAs transcribed from MIR loci [1, 7, 10, 11].
Transformation of Arabidopsis and other plant species with
hpRNA transgenes, consisting of an inverted-repeat portion of the
target gene sequence separated by a fragment of spacer material
(usually an intron), has been demonstrated to be a highly efficient
trigger of siRNA-directed RNA silencing [15, 16]. However, because
any sequence along the length of the resulting
dsRNA molecule can be processed into an sRNA silencing signal,
questions of silencing specificity have arisen. Off-target silencing,
the silencing of transcripts additional to those of the targeted gene,
has been reported in both plant and animal systems with the
use of hpRNA-directed RNA silencing [17–19]. We [20–22], and
others [23–27], have therefore developed an alternate approach to
control target gene expression in Arabidopsis, termed artificial
miRNA (amiRNA) technology.
Artificial miRNA technology exploits the intrinsic nature of the
processing stages of the miRNA biogenesis pathway. Through
modification of both the miRNA and miRNA* sequences while
maintaining dsRNA structural features of a miRNA precursor transcript, such as bulges and mismatches, a single, specific, highly accumulating sRNA silencing signal can be generated. To date, amiRNA
technology has been demonstrated to direct highly efficient and specific RNA silencing of reporter genes [20, 23], endogenous genes
[24, 26], nonprotein-coding RNA [21], and viruses [25, 27], both
tissue specifically and in whole plants. Here we describe the design
rules for amiRNA selection and outline the PCR and bacterial cloning procedures involved in the construction of an
amiRNA plant expression vector using the plasmid pBlueGreen.
2 Materials
2.1 Selection of the Artificial MicroRNA Target Sequence
1. Personal computer with Internet access.
2. Template for amiRNA forward and reverse primers (see Fig. 1a).
Controlling Gene Expression with Artificial MicroRNAs
Fig. 1 Template for amiRNA forward and reverse primer design. (a) Artificial miRNA forward and reverse primer
template sequences to maintain the endogenous dsRNA structural features of the Arabidopsis MIR159B
miR159b/miR159b* duplex in the modified PRI-MIR159B amiRNA precursor fragment. (b) Example of the
sequence composition of an amiRNA target sequence using the pBlueGreen plant expression vector system.
(c) The exact 21-nt amiRNA target sequence is entered into the amiRNA reverse primer template. This
sequence is also entered into the amiRNA forward primer template; however, the dsRNA mismatches of the
endogenous miR159b/miR159b* duplex are accounted for by introducing mismatched base pairings at positions 12, 13, and 21, respectively (represented by grey-colored lowercase template sequences)
2.2 PCR Amplification of the Artificial MicroRNA Precursor Fragment
1. Template plasmid pAth-miR159b (see Note 1).
2. AmiRNA forward and reverse primers (10 μM; see Note 2).
3. dNTPs (5 mM each of dATP, dCTP, dGTP, and dTTP).
4. Expand Long Template Enzyme mix (5 U/μL; Roche Applied
Science; see Note 3).
5. 10× Expand Long Template PCR System buffer 1 (Roche
Applied Science).
6. DNase-free dH2O.
7. 1× Tris/Borate/EDTA (TBE) buffer (for agarose gel analysis).
8. 6× Loading dye (LD; MBI Fermentas).
9. 100 Base pair (bp) DNA ladder (MBI Fermentas).
10. 1.2 % w/v agarose gel (stained with ethidium bromide; EtBr).
11. Ice.
12. 0.2 mL PCR tubes.
13. Pipette tips (2, 20, and 200 μL).
14. 1.5 mL Microfuge tubes.
15. QIAquick® PCR Purification kit (Qiagen).
16. Benchtop thermocycler.
17. Benchtop microfuge (at room temperature; RT).
2.3 Cloning of the PCR-Generated Artificial MicroRNA Precursor Fragment into the pGEM-T® Easy Cloning Vector
1. Column-purified amiRNA precursor fragment (from Subheading 3.2).
2. pGEM-T® Easy Cloning vector (50 ng/μL; Promega).
3. 2× Rapid ligation buffer (Promega).
4. T4 DNA ligase (3 U/μL; Promega).
5. 20 % w/v 5-Bromo-4-chloro-3-indolyl-β-D-galactopyranoside
(X-gal; Sigma-Aldrich).
6. 0.1 M Isopropyl-β-D-1-thiogalactopyranoside (IPTG; Sigma-Aldrich).
7. Luria–Bertani (LB) liquid media.
8. LB liquid media containing 50 μg/mL ampicillin (LB-Amp50).
9. LB-Amp50 agar plates.
10. Escherichia coli DH5α electro-competent cells.
11. Ice.
12. Pipette tips (2, 20, 200, and 1,000 μL).
13. 1.5 mL Microfuge tubes.
14. 15 mL Capped centrifuge tubes.
15. Drawn-out glass pipette (with bulb).
16. Bacterial cell plate spreader (sterilized).
17. Bacterial cell loop (sterilized).
18. QIAprep® Spin Miniprep kit (Qiagen).
19. Electroporator and cuvettes.
20. Benchtop microfuge (at RT).
21. Incubated shaker (at 37 °C).
22. Incubator (at 37 °C).
23. Water bath (at 37 and 65 °C).
24. Laminar flow cabinet.
2.4 Cloning of the Artificial MicroRNA Precursor Fragment into the Plant Expression Vector pBlueGreen
1. pGEM-T: precursor-amiRNA vector (from Subheading 3.3).
2. pBlueGreen plant expression vector (see Note 1).
3. pSoup helper plasmid (see Note 1).
4. LguI (5 U/μL; MBI Fermentas).
5. BamHI (10 U/μL; MBI Fermentas).
6. T4 DNA ligase (5 U/μL; MBI Fermentas).
7. 10× Buffer Tango™ (MBI Fermentas).
8. 10× Buffer BamHI (MBI Fermentas).
9. 10× T4 DNA ligase buffer (MBI Fermentas).
10. 1× TBE buffer.
11. 6× LD.
12. DNase-free dH2O.
13. Ice.
14. 20 % w/v X-gal.
15. 0.1 M IPTG.
16. 1.0 % w/v agarose gel (stained with EtBr).
17. 1.2 % w/v agarose gel (stained with EtBr).
18. 100 bp DNA ladder.
19. LB liquid media.
20. LB agar plates containing 50 μg/mL kanamycin (LB-Kan50).
21. E. coli DH5α electro-competent cells.
22. Agrobacterium tumefaciens GV3101 electro-competent cells.
23. Bacterial cell plate spreader (sterilized).
24. Bacterial cell loop (sterilized).
25. QIAquick® PCR Purification kit.
26. QIAprep® Spin Miniprep kit.
27. Pipette tips (2, 20, 200, and 1,000 μL).
28. 1.5 mL Microfuge tubes.
29. Electroporator and cuvettes.
30. Benchtop microfuge (at RT).
31. Benchtop shaker (at 37 °C).
32. Incubator (at 37 °C).
33. Water bath (at 37 °C and 65 °C).
34. Laminar flow cabinet.
3 Methods
3.1 Selection of the Artificial MicroRNA Target Sequence
1. To silence the expression of an Arabidopsis gene of interest
using amiRNA technology, download the cDNA sequence of
the target gene from the TAIR website (http://www.arabidopsis.org/), using the Gene Search function
(http://www.arabidopsis.org/servlets/Search?action=new_search&type=gene), to your personal computer.
2. Working in a 5′–3′ direction, identify a 19-nucleotide (nt)
sequence within the target gene cDNA sequence containing
either a cytosine (C) or guanine (G) residue at position 1, a
thymine (T) residue at position 10, and an adenine (A) residue
at position 19 (see Note 4), as outlined in Fig. 1b.
3. Once a putative amiRNA target sequence matching these
parameters is identified, add two additional 5′ nucleotides
upstream of position 1 to obtain a 21-nt putative amiRNA
target sequence.
4. Using the BLAST search function on the TAIR website
(http://www.arabidopsis.org/Blast/index.jsp), determine if
the reverse complement of the selected 21-nt putative amiRNA
target sequence (this corresponds to the sequence of your
putative mature amiRNA guide strand) is specific to your gene
of interest (see Note 5).
5. If the putative amiRNA sequence is complementary to transcripts additional to the target gene, which are not to be targeted for amiRNA-directed RNA silencing, discard the
selected sequence and repeat above steps 2–4 until a 21-nt
putative amiRNA sequence unique to the target gene is identified (see Note 5).
6. Once a putative amiRNA sequence complementary only to
the target gene is identified, enter the exact 21-nt amiRNA
target sequence (the selected cDNA target sequence in the
5′–3′ direction) into the amiRNA reverse primer template
(Fig. 1a), as outlined in Fig. 1c.
7. Also enter the selected 21-nt target sequence (again enter the
cDNA target sequence in the 5′–3′ direction) into the amiRNA
forward primer template (Fig. 1a). However, and as outlined
in Fig. 1c, introduce three mismatched nucleotides at positions 12, 13, and 21 of the amiRNA target sequence, respectively (see Note 6).
8. Order the 65-nt amiRNA forward and 61-nt amiRNA reverse
primers from your usual supplier of high-quality DNA oligonucleotides (see Note 7).
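The positional rules in steps 2–4 can be applied to a downloaded cDNA sequence programmatically. The following is a minimal sketch, not the authors' tool: the function names are hypothetical, and it only lists candidate 21-nt target sequences together with the corresponding guide strand (the reverse complement). Specificity must still be checked by BLAST as described in step 4.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# 21-nt window: 2 upstream nt, then the 19-nt core with C or G at core
# position 1, T at core position 10, and A at core position 19.
# The lookahead allows overlapping windows to be reported.
_TARGET = re.compile(r"(?=([ACGT]{2}[CG][ACGT]{8}T[ACGT]{8}A))")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_amirna_targets(cdna: str) -> list[tuple[int, str, str]]:
    """List (1-based start, 21-nt target, 21-nt guide as RNA) candidates."""
    cdna = cdna.upper().replace("U", "T")
    hits = []
    for match in _TARGET.finditer(cdna):
        target = match.group(1)
        # The guide strand is the reverse complement of the target site.
        guide = reverse_complement(target).replace("T", "U")
        hits.append((match.start() + 1, target, guide))
    return hits


if __name__ == "__main__":
    toy_cdna = "AAGACGTACGTTCCCCGGGGA"  # made-up sequence, not a real gene
    for start, target, guide in find_amirna_targets(toy_cdna):
        print(start, target, guide)
```

Because the guide is the reverse complement of the target, each reported guide begins with U, carries A at position 10, and has G or C at position 19, in line with the reasoning given in Note 4.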
3.2 PCR Amplification of the Artificial MicroRNA Precursor Fragment
1. On ice, add reaction components to a chilled 0.2 mL PCR
tube in the following order: 38.0 μL of DNase-free dH2O,
5.0 μL of 10× Expand Long Template PCR System buffer 1,
1.0 μL of 5 mM dNTP mix, 2.5 μL each of 10 μM forward
and reverse amiRNA primer, 0.5 μL of the pAth-miR159b
template plasmid (~50 pg/μL), and 0.5 μL of Expand Long
Template Enzyme (see Note 3).
2. Mix the reaction components by pipetting, cap the PCR tube,
and immediately transfer to a benchtop thermocycler that has
been pre-warmed to 95 °C.
3. Amplify the precursor-amiRNA fragment from the pAth-miR159b template plasmid using the following PCR program:
1 cycle of 95 °C for 3 min; 28 cycles of 94 °C for 20 s, 56 °C for 30 s, and
72 °C for 45 s; then 72 °C for 7 min and 16 °C for 10 min.
4. Transfer a 10 μL aliquot of the above reaction to a labelled
1.5 mL microfuge tube containing 2.0 μL of 6× LD, mix by
pipetting, and visualize on an EtBr-stained 1.2 % w/v agarose
gel in 1× TBE buffer. On the same gel, run a 10 μL aliquot of
100 bp DNA ladder (or similar size marker) to check that the
amplified amiRNA precursor fragment is the correct size at
224 bp.
5. If the PCR product is the expected size, purify the remaining
40 μL using the QIAquick® PCR Purification kit according to
the manufacturer’s instructions and resuspend in 20 μL of
DNase-free dH2O.
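As a quick arithmetic check of the reaction assembled in step 1, the sketch below sums the listed volumes and derives the final concentration of each component, along with the programmed hold time of the cycling profile in step 3. The dictionary keys and function name are illustrative only, and the time estimate ignores temperature ramping, which varies between thermocyclers.

```python
# Volumes (in microlitres) as listed in step 1 of Subheading 3.2.
VOLUMES_UL = {
    "dH2O": 38.0,
    "10x buffer": 5.0,
    "dNTP mix": 1.0,
    "forward primer": 2.5,
    "reverse primer": 2.5,
    "template": 0.5,
    "enzyme mix": 0.5,
}
TOTAL_UL = sum(VOLUMES_UL.values())


def final_concentration(stock: float, volume_ul: float) -> float:
    """Dilute a stock concentration into the total reaction volume."""
    return stock * volume_ul / TOTAL_UL


buffer_x = final_concentration(10.0, VOLUMES_UL["10x buffer"])       # fold (x)
dntp_mm = final_concentration(5.0, VOLUMES_UL["dNTP mix"])           # mM each
primer_um = final_concentration(10.0, VOLUMES_UL["forward primer"])  # uM each
template_pg = 50.0 * VOLUMES_UL["template"]                          # pg total
enzyme_units = 5.0 * VOLUMES_UL["enzyme mix"]                        # U total

# Programmed hold time only: 3 min, 28 x (20 s + 30 s + 45 s), 7 min, 10 min.
program_seconds = 3 * 60 + 28 * (20 + 30 + 45) + 7 * 60 + 10 * 60

if __name__ == "__main__":
    print(f"total volume: {TOTAL_UL} uL")
    print(f"buffer {buffer_x}x, dNTPs {dntp_mm} mM each, primers {primer_um} uM each")
    print(f"template {template_pg} pg, enzyme {enzyme_units} U")
    print(f"programmed time: {program_seconds / 60:.1f} min")
```

The reaction therefore totals 50 μL, which is consistent with the 10 μL gel aliquot and the remaining 40 μL referred to in steps 4 and 5.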
3.3 Cloning of the PCR-Generated Artificial MicroRNA Precursor Fragment into the pGEM®-T Easy Cloning Vector
1. On ice, add to a chilled, labelled 1.5 mL microfuge tube,
4.0 μL of the column-purified amiRNA precursor fragment
and 0.5 μL of the pGEM-T® Easy Cloning vector, mix by
pipetting, cap the tube, and incubate at 65 °C for 5 min., then
immediately transfer the reaction tube to ice and incubate for
an additional 5 min. (see Note 8).
2. Add 5.0 μL of 2× Rapid ligation buffer and 0.5 μL of T4 DNA
ligase, mix by pipetting, cap the tube, and incubate at 37 °C
for 60 min., in a water bath.
3. On ice, thaw an aliquot of E. coli DH5α electro-competent
cells. When the cells have completely thawed, add a 2.0 μL
aliquot of the above ligation reaction, gently mix by pipetting,
and immediately transfer the cellular mixture to a chilled
cuvette (on ice).
4. Transfer the cuvette to an electroporator, electroporate, and
immediately add 450 μL of ice-cold LB liquid media. Using a
drawn-out glass pipette, transfer the cellular mixture to a new
labelled 1.5 mL microfuge tube and incubate at 37 °C for
60 min, in a benchtop shaker (at 200 rpm).
5. In a laminar flow cabinet, and using a sterilized bacterial cell
plate spreader, evenly spread 10 μL of 0.1 M IPTG and 20 μL
of 20 % w/v X-gal over the entire surface of an LB-Amp50 agar
plate (see Note 9). Transfer a 50 μL aliquot of the above bacterial suspension onto the same plate, spread the suspension
evenly over the entire surface of the plate with a sterilized bacterial cell plate spreader and dry the plate in the laminar flow
cabinet for 10 min. Transfer the agar plate to a 37 °C incubator and incubate for 16–24 h.
6. Using a sterilized bacterial cell loop, select a single white-colored colony (see Note 9) and inoculate a 5 mL LB-Amp50
liquid culture. Cap the 15 mL centrifuge tube and incubate
the bacterial culture at 37 °C for 16–24 h in a benchtop shaker
(at 200 rpm).
7. Isolate plasmid DNA from the overnight culture using the
QIAprep® Spin Miniprep kit according to the manufacturer’s
instructions (or by your usual plasmid DNA isolation protocol).
The plasmid preparation contains the modified
precursor transcript of the Arabidopsis MIR159B gene, where
the endogenous miR159b/miR159b* sequences have been
replaced with amiRNA/amiRNA* sequences, in the pGEM-T®
Easy Cloning vector (see Note 10).
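The ligation in steps 1 and 2 uses fixed volumes rather than a stated insert:vector ratio. For readers who wish to relate the amount of purified PCR product to the vector, the sketch below applies the standard mass conversion for a chosen molar ratio. Two values are assumptions that do not come from this protocol: a vector length of about 3,015 bp for pGEM-T Easy and a 3:1 insert:vector molar ratio. The function name is hypothetical.

```python
def insert_mass_ng(vector_ng: float, insert_bp: int, vector_bp: int,
                   molar_ratio: float) -> float:
    """Mass of insert giving the requested insert:vector molar ratio."""
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


# From the protocol: 0.5 uL of vector at 50 ng/uL and a 224 bp PCR product.
VECTOR_NG = 0.5 * 50.0
INSERT_BP = 224

# Assumptions, not taken from the protocol text.
VECTOR_BP = 3015   # assumed approximate length of pGEM-T Easy
MOLAR_RATIO = 3.0  # assumed insert:vector molar ratio

needed_ng = insert_mass_ng(VECTOR_NG, INSERT_BP, VECTOR_BP, MOLAR_RATIO)

# Volumes from steps 1 and 2: insert, vector, 2x buffer, ligase (uL).
ligation_volume_ul = 4.0 + 0.5 + 5.0 + 0.5
buffer_final_x = 2.0 * 5.0 / ligation_volume_ul

if __name__ == "__main__":
    print(f"insert needed for {MOLAR_RATIO:.0f}:1 ratio: {needed_ng:.2f} ng")
    print(f"ligation volume: {ligation_volume_ul} uL, buffer at {buffer_final_x}x")
```

Under these assumptions only a few nanograms of the 224 bp fragment are needed, so the 4.0 μL of column-purified product specified in step 1 would normally supply an excess of insert.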
3.4 Cloning of the Artificial MicroRNA Precursor Fragment into the Plant Expression Vector pBlueGreen
1. On ice, and in two labelled 1.5 mL microfuge tubes, add reaction components in the following order: 7.0 μL of DNase-free
dH2O, 2.0 μL of 10× Buffer Tango™, and 1.0 μL of 5 U/μL
LguI. Add 10.0 μL each of the pGEM-T:precursor-amiRNA
and pBlueGreen plasmid preparations to the appropriately
labelled 1.5 mL microfuge tube, mix by gently pipetting, and
incubate for 4 h in a 37 °C water bath.
2. Transfer a 5.0 μL aliquot of each digestion product to a new
labelled 1.5 mL microfuge tube containing 1.0 μL of 6× LD.
Mix by pipetting and visualize on an EtBr-stained 1.0 % w/v
agarose gel in 1× TBE buffer. On the same gel, run a 10 μL
aliquot of 1 kb DNA ladder (or similar size marker) to check
that (a) the plasmid preparations are completely digested and
(b) the restriction fragments are the correct size (see Note 11).
3. Purify the LguI digestion products with the QIAquick® PCR
Purification kit according to the manufacturer’s instructions.
Resuspend each restriction product in 20 μL of DNase-free
dH2O.
4. To an appropriately labelled 1.5 mL microfuge tube, add
10.0 μL of the LguI-digested pGEM-T:precursor-amiRNA
vector, 0.5 μL of the LguI-digested pBlueGreen vector, and
7.0 μL of DNase-free dH2O. Mix by pipetting and incubate at
65 °C in a water bath for 5 min., then immediately transfer to
ice and incubate for an additional 5 min (see Note 8).
5. On ice, add 2.0 μL of 10× T4 DNA Ligase buffer and 0.5 μL
of T4 DNA ligase, mix by pipetting, and incubate overnight at
RT (or alternatively, incubate the ligation at 37 °C for 4 h in a
water bath).
6. Thaw an aliquot of E. coli DH5α electro-competent cells on
ice, and once thawed, add a 2.0 μL aliquot of the above ligation reaction, gently mix by pipetting, and immediately transfer the mixture to a chilled cuvette (on ice).
7. Transfer the cuvette to an electroporator, electroporate, and
immediately add 450 μL of ice-cold LB liquid media. Using a
drawn-out glass pipette, transfer the cellular mixture to a new
labelled 1.5 mL microfuge tube and incubate at 37 °C for
60 min., in a benchtop shaker (at 200 rpm).
8. In a laminar flow cabinet, and using a sterilized bacterial cell
plate spreader, evenly spread 10 μL of 0.1 M IPTG and 20 μL
of 20 % w/v X-gal over the entire surface of an LB-Kan50 agar
plate (see Note 9). Transfer a 50 μL aliquot of the above bacterial suspension onto the same plate, spread the suspension
evenly over the entire surface of the plate with a sterilized bacterial cell plate spreader and dry the plate in the laminar flow
cabinet for 10 min. Transfer the agar plate to an incubator and
incubate at 37 °C for 16–24 h.
9. Using a sterilized bacterial cell loop, select a single white-colored colony (see Note 9) and inoculate a 5 mL LB-Kan50
liquid culture. Cap the 15 mL centrifuge tube and incubate
the bacterial culture at 37 °C for 16–24 h in a benchtop shaker
(at 200 rpm).
10. Isolate plasmid DNA from the overnight culture using the
QIAprep® Spin Miniprep kit according to the manufacturer’s
instructions. The plasmid preparation contains the modified
precursor transcript of Arabidopsis MIR159B, where the
endogenous miR159b/miR159b* sequences have been replaced
with amiRNA/amiRNA* sequences targeting your gene of
interest for amiRNA-directed RNA silencing, in the
pBlueGreen plant expression vector.
11. To determine the orientation of the amiRNA precursor fragment in the pBlueGreen plant expression vector, set up a
20 μL BamHI digestion in a labelled 1.5 mL microfuge tube
as follows: 12.0 μL of DNase-free dH2O, 5.0 μL of plasmid
preparation, 2.0 μL of 10× Buffer BamHI, and 1.0 μL of
BamHI. Mix by pipetting and incubate at 37 °C for 2 h in a
water bath.
12. Add 4.0 μL of 6× LD to each plasmid preparation selected for
BamHI digestion, mix by pipetting, and run the digestion
product(s) on an EtBr-stained 1.2 % w/v agarose gel in 1×
TBE buffer along with 10 μL of 100 bp DNA ladder (or similar
size marker). pBlueGreen plasmid preparations containing the
modified amiRNA precursor fragment in the desired sense
(5′–3′) orientation will return a 440 bp BamHI restriction
fragment (see Notes 12 and 13).
13. For transformation of Arabidopsis plants, mix 1.0 μL of the
selected pBlueGreen amiRNA plant expression vector with
1.0 μL of the helper plasmid pSoup and use this mixture to
transform A. tumefaciens GV3101 electro-competent cells via
electroporation (see Notes 14 and 15).
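The gel checks in steps 2 and 12 compare observed band sizes with the expected fragments given in Notes 11 and 12. The sketch below encodes those expected sizes as a simple lookup. The function names and the tolerances (10 % for the LguI bands, 20 bp for the BamHI orientation band) are assumptions chosen for illustration; band sizes estimated from an agarose gel are approximate, so the tolerances should be set to suit the ladder and gel used.

```python
# Expected LguI fragments (bp) for pGEM-T:precursor-amiRNA, from Note 11.
PGEM_LGUI_BP = [250, 378, 3000]
# pBlueGreen gives one ~700 bp fragment and one fragment larger than 10 kb.
PBLUEGREEN_SMALL_BP = 700
PBLUEGREEN_LARGE_MIN_BP = 10_000
# BamHI orientation fragments (bp), from step 12 and Note 12.
SENSE_BP = 440
ANTISENSE_BP = 376


def bands_match(observed: list[float], expected: list[float],
                tolerance: float = 0.10) -> bool:
    """True if band number matches and each band is within the tolerance."""
    if len(observed) != len(expected):
        return False
    pairs = zip(sorted(observed), sorted(expected))
    return all(abs(obs - exp) <= tolerance * exp for obs, exp in pairs)


def pbluegreen_digest_ok(observed: list[float], tolerance: float = 0.10) -> bool:
    """Check for one ~700 bp band plus one band above 10 kb."""
    if len(observed) != 2:
        return False
    small, large = sorted(observed)
    small_ok = abs(small - PBLUEGREEN_SMALL_BP) <= tolerance * PBLUEGREEN_SMALL_BP
    return small_ok and large > PBLUEGREEN_LARGE_MIN_BP


def call_orientation(fragment_bp: float, tolerance_bp: float = 20.0) -> str:
    """Classify the diagnostic BamHI fragment as sense, antisense, or unexpected."""
    if abs(fragment_bp - SENSE_BP) <= tolerance_bp:
        return "sense"
    if abs(fragment_bp - ANTISENSE_BP) <= tolerance_bp:
        return "antisense"
    return "unexpected"


if __name__ == "__main__":
    print(bands_match([3100, 240, 390], PGEM_LGUI_BP))
    print(pbluegreen_digest_ok([690, 11500]))
    print(call_orientation(445), call_orientation(370))
```

Because the sense and antisense fragments differ by only 64 bp, a 1.2 % gel and the 100 bp ladder specified in step 12 are needed to resolve them; a call of "unexpected" indicates that the preparation should not be used.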
4 Notes
1. The plasmid pAth-miR159b contains the pri-miRNA sequence
of the Arabidopsis MIR159B (AT1G18075) locus, PRI-MIR159B, in the pGEM-T® Easy Cloning vector. This plasmid is available from the authors upon request and should be
diluted to a concentration of ~50 pg/μL prior to use as a template for PCR amplification of the amiRNA precursor fragment. The amiRNA plant expression vector pBlueGreen and
the helper plasmid pSoup are available from the authors upon
request.
2. The amiRNA forward and reverse primers span the miRNA
and miRNA* sequences of PRI-MIR159B. Such a design
allows for the simple exchange of these two endogenous sRNA
sequences with corresponding amiRNA guide and amiRNA*
passenger strand sequences in a single PCR reaction. These
long primers also encode LguI restriction sites required for
cloning of the modified amiRNA precursor fragment into the
pBlueGreen plant expression vector.
3. The Expand Long Template Enzyme mix (Roche Applied
Science) is used to amplify the amiRNA precursor fragment as
this system contains a mixture of two thermostable DNA polymerases
(Taq and a proofreading polymerase) to allow for (a) proofreading of the amplified product and (b)
A-tailing of the amplified product (for subsequent cloning
into the pGEM-T® Easy Cloning vector).
4. The putative amiRNA target sequence is identified by this
method because target sequence positions 1, 10, and
19 correspond to mature amiRNA guide strand sequence positions 19, 10, and 1, respectively. The majority of endogenous
plant miRNAs, including Arabidopsis miR159b, carry a uracil
(U) residue at the 5′ terminal base, and sRNAs with U at this
position preferentially associate with AGO1 [28]. Similarly, the
endonucleolytic activity of cleavage-competent AGO1 appears
to preferentially cleave mRNA substrates after an adenine (A)
residue [26]. Furthermore, we have demonstrated that in plants
[20], more stable dsRNA base pairing is preferred at amiRNA
position 19 to ensure preferential loading of the amiRNA guide
strand, rather than the other strand of the amiRNA duplex (the
amiRNA* passenger strand), onto the AGO1-catalyzed RNA-induced silencing complex (RISC) to direct highly efficient
RNA silencing. Please note that if a putative amiRNA target
sequence with G/C, T, and A residues at positions 1, 10, and
19, respectively, cannot be identified within your target transcript cDNA sequence, select an alternate target sequence with
other residues at either position 10 or 19 while maintaining the
G/C requirement at position 1. This approach will still ensure preferential loading of the amiRNA guide strand, rather than the amiRNA*
passenger strand, into AGO1-catalyzed RISC [20].
5. In practice, we have found that for Arabidopsis genes, 21-nt
amiRNA guide strand sequences (the reverse complement of
the selected 21-nt amiRNA target sequence) specific to the
transcript to be targeted for amiRNA-directed RNA silencing
are readily identifiable. Similarly, if a small group of closely
related genes are to be targeted for amiRNA-directed RNA
silencing, we recommend selection of the “shared” 21-nt
amiRNA target sequence returning the lowest number of possible “off targets” following BLAST searches with the corresponding putative amiRNA sequences.
6. Three mismatched dsRNA base pairings are entered into the
design of the amiRNA forward primer (forward primer corresponds to the mature amiRNA* strand) to retain the endogenous dsRNA structure of the miR159b/miR159b* duplex in
the modified amiRNA precursor fragment. This ensures that
the modified amiRNA precursor fragment is still recognized
and subsequently processed by the endogenous protein
machinery of the Arabidopsis miRNA biogenesis pathway.
7. Due to the significantly greater length of the amiRNA forward
and reverse primers (65-nt and 61-nt, respectively) compared
to standard DNA oligonucleotides, we recommend ordering them from a supplier of high-quality DNA oligonucleotides to
minimize synthesis errors.
8. Following the linearization of DNA fragments, incubation of
the purified fragment at 65 °C for 5 min removes any secondary structure that may inhibit subsequent molecular manipulations (such as ligations). Immediately transferring the reaction
mixture to ice for an additional 5 min incubation period
ensures that nucleic acids stay in a denatured state.
9. Both the pGEM-T® Easy Cloning vector and the pBlueGreen
plant expression vector contain the LacZ gene where the PCR-amplified, or LguI-digested, amiRNA precursor fragment is
inserted following ligation. This allows for the rapid visual
selection of bacterial colonies (white-colored colonies) harboring the inserted fragment for subsequent plasmid preparations.
10. We strongly recommend that all insert-positive plasmid preparations (amiRNA precursor fragment containing preparations)
are sequenced prior to proceeding past this stage of amiRNA
plant expression vector construction. Sequence alterations
could result in dsRNA structural changes that may in turn lead
to inefficient processing of the modified amiRNA precursor
transcript by the protein machinery of the Arabidopsis miRNA
biogenesis pathway. The pGEM-T® Easy Cloning vector
encodes both the M13 forward and reverse primer recognition sequences, allowing inserted fragments to be sequenced
in either or both directions.
11. Once completely digested, the pGEM-T:precursor-amiRNA
vector will return three LguI restriction fragments of 250 bp,
378 bp, and 3,000 bp. LguI digestion of
pBlueGreen yields two restriction fragments of 700 bp and
>10 kb. Do not proceed with any plasmid preparation that
produces LguI restriction fragments differing in size or number
from those listed here.
12. The LguI-digested amiRNA precursor fragment can insert
into the similarly digested pBlueGreen plant expression vector
in either the sense (5′–3′) or antisense (3′–5′) orientation.
BamHI can be used to determine the orientation of the amiRNA precursor fragment insert. Plasmid preparations containing the amiRNA
precursor fragment in the antisense orientation will return a
smaller 376 bp restriction fragment compared to those harboring the amiRNA precursor fragment in the desired sense
orientation (these return a 440 bp BamHI restriction fragment).
Discard all plasmid preparations with the amiRNA precursor
transcript in the antisense orientation and continue screening
additional white-colored kanamycin-resistant colonies until a
plasmid preparation containing the insert in the desired sense
orientation is isolated.
13. At this stage (once a plasmid preparation of the pBlueGreen
plant expression vector containing a modified amiRNA precursor fragment in the sense orientation has been identified), the
modified PRI-MIR159B transcript or the entire promoter–
amiRNA precursor fragment–terminator cassette can be PCR-amplified for transfer to a new plant expression vector of the
researcher’s choice for (a) tissue-specific expression, (b) stacking
of multiple modified amiRNA precursor fragments (to direct
RNA silencing of multiple unrelated target genes), or (c) the
use of a different in planta selectable marker (selection of
Arabidopsis lines expressing the pBlueGreen plant expression
vector is outlined in [21]). For PCR amplification of the modified
PRI-MIR159B transcript, use primers pAMIR159B-F [5′-TCA
(N)X ACTAGTGATTTCACTTTTGTT-3′] and pAMIR159B-R
[5′-TCA (N)X TTCGAACCCAGACACTTAAAC-3′]. For
PCR amplification of the entire promoter–amiRNA precursor
fragment–terminator cassette (Fig. 1c), use primers p35SP-F
[5′-TCA (N)X CTCGACGAATTAATTCCAATC-3′] and
pOCST-R [5′-TCA (N)X CTGCAGGTCCTGCTGAGCCTC-3′].
For all four listed primers, “X” is the number of nucleotides (N) in the recognition sequence of the selected restriction
endonuclease(s).
14. In our experience, in addition to Arabidopsis, the
pBlueGreen plant expression vector also directs highly efficient amiRNA-mediated RNA silencing in rice (Oryza sativa),
tobacco (Nicotiana tabacum), N. benthamiana, and tomato
(Solanum lycopersicum).
15. In our experience, the severity of the phenotype displayed by
putative amiRNA transformant lines will range from mild to
severe. It is therefore important to perform additional molecular analyses on a number of independent transformant lines
(we suggest screening of at least ten putative transformant
plant lines) to identify transformants where the integrated
amiRNA plant expression vector is directing highly efficient
RNA silencing of the targeted gene. We suggest the following
analyses: (a) Southern blot, to identify single-copy lines; (b)
sRNA-specific Northern blot, to assess amiRNA accumulation
in single-copy lines; and (c) RT-PCR, qRT-PCR, or high
molecular weight Northern blot of amiRNA target gene
expression. Such an approach is especially important if
silencing of the targeted gene is not
expected to produce a readily observable
developmental phenotype.
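The primer templates in Note 13 leave the restriction site, (N)X, to the researcher. The sketch below shows how a full primer sequence could be assembled from the 5′ TCA clamp, a chosen recognition sequence, and the primer-specific 3′ portion listed in Note 13. The function name is hypothetical, and the EcoRI site (GAATTC) is used purely as an illustration; the protocol does not prescribe any particular enzyme.

```python
# 3' portions of the four primers listed in Note 13.
PRIMER_3PRIME = {
    "pAMIR159B-F": "ACTAGTGATTTCACTTTTGTT",
    "pAMIR159B-R": "TTCGAACCCAGACACTTAAAC",
    "p35SP-F": "CTCGACGAATTAATTCCAATC",
    "pOCST-R": "CTGCAGGTCCTGCTGAGCCTC",
}
CLAMP_5PRIME = "TCA"


def build_primer(name: str, recognition_site: str) -> str:
    """Return the clamp + recognition site + primer-specific sequence, 5' to 3'."""
    site = recognition_site.upper()
    if not site or set(site) - set("ACGT"):
        raise ValueError("recognition site must contain only A, C, G, T")
    return CLAMP_5PRIME + site + PRIMER_3PRIME[name]


if __name__ == "__main__":
    ECORI = "GAATTC"  # illustrative choice only, not specified by the protocol
    for primer_name in PRIMER_3PRIME:
        print(primer_name, build_primer(primer_name, ECORI))
```

Before ordering, it is worth confirming that the chosen recognition sequence does not also occur within the fragment to be amplified or in an inconvenient position in the destination vector.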
References
1. Reinhart BJ et al (2002) MicroRNAs in plants.
Genes Dev 16:1616–1626
2. Adenot X et al (2006) DRB4-dependent TAS3
trans-acting siRNAs control leaf morphology
through AGO7. Curr Biol 16:927–932
3. Borsani O et al (2005) Endogenous siRNAs
derived from a pair of natural cis-antisense
transcripts regulate salt tolerance in
Arabidopsis. Cell 123:1279–1291
4. Onodera Y et al (2005) Plant nuclear RNA
polymerase IV mediates siRNA and DNA
methylation-dependent heterochromatin formation. Cell 120:613–622
5. Pontes O et al (2006) The Arabidopsis
chromatin-modifying nuclear siRNA pathway
involves a nucleolar RNA processing center.
Cell 126:79–92
6. Boutet S et al (2003) Arabidopsis HEN1: a
genetic link between endogenous miRNA
controlling development and siRNA controlling transgene silencing and virus resistance.
Curr Biol 13:843–848
7. Dunoyer P et al (2004) Probing the microRNA
and small interfering RNA pathways with
virus-encoded suppressors of RNA silencing.
Plant Cell 16:1235–1250
8. Sunkar R, Zhu JK (2004) Novel and stress-regulated microRNAs and other small RNAs
from Arabidopsis. Plant Cell 16:2001–2019
9. Mallory AC, Vaucheret H (2006) Functions of
microRNAs and related small RNAs in plants.
Nat Genet 38:S31–S36
10. Park W et al (2002) CARPEL FACTORY, a
Dicer homolog, and HEN1, a novel protein,
act in microRNA metabolism in Arabidopsis
thaliana. Curr Biol 12:1484–1495
11. Golden TA et al (2002) SHORT
INTEGUMENTS1/SUSPENSOR1/CARPEL
FACTORY, a Dicer homolog, is a maternal
effect gene required for embryo development in Arabidopsis. Plant Physiol 130:808–822
12. Gasciolli V et al (2005) Partially redundant
functions of Arabidopsis DICER-like enzymes
and a role for DCL4 in producing trans-acting
siRNAs. Curr Biol 15:1494–1500
13. Xie Z et al (2005) DICER-LIKE4 functions in
trans-acting small interfering RNA biogenesis
and vegetative phase change in Arabidopsis
thaliana. Proc Natl Acad Sci U S A
102:12984–12989
14. Xie Z et al (2004) Genetic and functional
diversification of small RNA pathways in
plants. PLoS Biol 2:e104
15. Smith NA et al (2000) Total silencing by
intron-spliced hairpin RNAs. Nature 407:319–320
16. Stoutjesdijk PA et al (2004) hpRNA-mediated
targeting of the Arabidopsis FAD2 gene gives
highly efficient and stable silencing. Plant
Physiol 129:1723–1731
17. Jackson AL, Linsley PS (2004) Noise amidst
the silence: off-target effects of siRNAs?
Trends Genet 20:521–524
18. Xu P et al (2006) Computational estimation
and experimental verification of off-target
silencing during posttranscriptional gene
silencing in plants. Plant Physiol 142:429–440
19. Senthil-Kumar M, Mysore KS (2011) Caveat
of RNAi in plants: the off-target effect.
Methods Mol Biol 744:13–25
20. Eamens AL et al (2009) The Arabidopsis thaliana double-stranded RNA binding protein
DRB1 directs guide strand selection from
microRNA duplexes. RNA 15:2219–2235
21. Eamens AL et al (2011) Efficient silencing of
endogenous microRNAs using artificial
microRNAs in Arabidopsis thaliana. Mol Plant
4:157–170
22. Eamens AL, Waterhouse PM (2011) Vectors
and methods for hairpin RNA and artificial
microRNA-mediated gene silencing in plants.
Methods Mol Biol 701:179–197
23. Parizotto EA et al (2004) In vivo investigation
of the transcription, processing, endonucleolytic activity, and functional relevance of the
spatial distribution of a plant miRNA. Genes
Dev 18:2237–2242
24. Alvarez JP et al (2006) Endogenous and synthetic microRNAs stimulate simultaneous, efficient, and localized regulation of multiple targets
in diverse species. Plant Cell 18:1134–1151
25. Niu QW et al (2006) Expression of artificial
microRNAs in transgenic Arabidopsis thaliana
confers virus resistance. Nat Biotechnol 24:
1420–1428
26. Schwab R et al (2006) Highly specific gene
silencing by artificial microRNAs in
Arabidopsis. Plant Cell 18:1121–1133
27. Qu J, Ye J, Fang R (2007) Artificial miRNA-mediated virus resistance in plants. J Virol
81:6690–6699
28. Mi S et al (2008) Sorting of small RNAs into
Arabidopsis Argonaute complexes is directed
by the 5′ terminal nucleotide. Cell 133:1–12
Chapter 12
Generation and Identification of Arabidopsis EMS Mutants
Li-Jia Qu and Genji Qin
Abstract
EMS mutant analysis is a routine experiment to identify new players in a specific biological process or
signaling pathway using forward genetics. It begins with the generation of mutants by treating Arabidopsis
seeds with EMS. A mutant with a phenotype of interest (mpi) is obtained by screening plants of the M2
generation under a specific condition. Once the phenotype of the mpi is confirmed in the next generation,
map-based cloning is performed to locate the mpi mutation. During the map-based cloning, mpi plants
(Arabidopsis Columbia-0 (Col-0) ecotype background) are first crossed with Arabidopsis Landsberg erecta
(Ler) ecotype, and the presence or absence of the phenotype in the F1 hybrids indicates whether the mpi
is recessive or dominant. F2 plants with phenotypes similar to the mpi, if the mpi is recessive, or those
without the phenotype, if the mpi is dominant, are used as the mapping population. As few as 24 such
plants are selected for rough mapping. After finding one marker (MA) linked to the mpi locus or mutant
phenotype, more markers near MA are tested to identify recombinants. The recombinants indicate the
interval in which the mpi is located. Additional recombinants and molecular markers are then required to
narrow down the interval. This is an iterative process of narrowing down the mapping interval until no
further recombinants or molecular markers are available. The genes in the mapping interval are then
sequenced to look for the mutation. In the last step, the wild-type or mutated gene is cloned to generate
binary constructs. Complementation or recapitulation provides the most convincing evidence in determining the mutation that causes the phenotype of the mpi. Here, we describe the procedures for generating
mutants with EMS and analyzing EMS mutations by map-based cloning.
Key words Arabidopsis, EMS mutagenesis, Forward genetics, Map-based cloning, F2 mapping
population, Molecular marker
1 Introduction
Forward genetics has proven to be a powerful tool for identifying
the components of a specific biological process or a signal transduction pathway [1]. One of the big advantages of forward genetics is
that we do not need prior assumptions and no bias is introduced.
Forward genetics starts with a mutant with a phenotype of interest
(mpi) [1]. By identifying mutants, we may find new components in
the biological process we are interested in. T-DNA insertion mutants
and mutants induced by chemical mutagens such as ethyl methanesulfonate (EMS) are the most widely used in forward genetics [2].
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_12, © Springer Science+Business Media New York 2014
Compared with T-DNA insertion mutants, EMS mutants have
certain advantages. First, EMS mutants are easier to generate than
T-DNA mutants. Second, large amounts of EMS mutant seeds are
available for screening under a specific condition. Third, EMS may
produce a missense mutation resulting in a weak allele for an essential gene [3]. By analyzing EMS mutants, we can not only identify
gene functions but also understand the role of a specific amino acid
in protein function.
EMS may induce biased alkylation of guanine (G) to form
O6-ethylguanine, which pairs with thymine (T) but not with
cytosine (C). During the subsequent DNA repair, the original
G/C pair is thus replaced by A (adenine)/T. Thus, 99 % of EMS
mutations are C-to-T changes causing C/G to T/A substitutions
[3, 4]. To saturate the Arabidopsis genome with EMS mutations,
about 125,000 seeds of Arabidopsis (≈2.5 g) are required for
mutagenesis [5]. However, since EMS causes multiple point mutations in each plant, as few as 5,000 plants are enough to find a
mutation in a given gene [6].
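The consequence of this mutation spectrum for a coding sequence can be illustrated with a minimal Python sketch. It applies every possible single G-to-A or C-to-T change (as read on the coding strand) to a short open reading frame and classifies each change as silent, missense, or nonsense using the standard genetic code. The function name `ems_effects` and the example sequences are illustrative assumptions, not part of the protocol.

```python
# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# EMS-type transitions as read on the coding strand: G>A when the alkylated
# guanine is on this strand, C>T when it is on the complementary strand.
EMS_TRANSITIONS = {"G": "A", "C": "T"}


def ems_effects(cds):
    """Return every single EMS-type transition in a coding sequence.

    Each entry is (1-based position, base change, amino acid change, kind).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    effects = []
    for pos, base in enumerate(cds):
        alt = EMS_TRANSITIONS.get(base)
        if alt is None:
            continue  # A and T are not typical EMS targets
        start = pos - pos % 3
        offset = pos - start
        ref_codon = cds[start:start + 3]
        alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
        ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
        if alt_aa == ref_aa:
            kind = "silent"
        elif alt_aa == "*":
            kind = "nonsense"
        else:
            kind = "missense"
        effects.append((pos + 1, base + ">" + alt, ref_aa + ">" + alt_aa, kind))
    return effects


if __name__ == "__main__":
    for change in ems_effects("ATGTGG"):  # Met-Trp, a made-up example
        print(change)
```

In this toy example, the tryptophan codon TGG can only become a stop codon through EMS-type changes, whereas the G of ATG gives a missense change. This is one way to see why EMS yields both null and weak alleles.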
After obtaining sufficient M2 seeds, one can screen these seeds
under a specific condition to find a mutant involved in a specific
biological process [1, 7]. Once an mpi is obtained, the
phenotype should be confirmed in the next generation. If the phenotype is verified, the mpi is crossed with Arabidopsis Ler ecotype
and the F1 plants are grown to generate F2 seeds. At the same
time, the observation of the phenotype in the F1 generation indicates whether the mpi is dominant or recessive. This information is
important to determine what kind of plants to select in the F2
generation for mapping.
A high density of molecular markers is essential for high-resolution mapping [6]. Arabidopsis ecotypes including Col-0 and
Ler show abundantly divergent sequences that support the design
of highly dense molecular markers [6, 8]. The combination of Col-0
and Ler is the most widely used for mapping [6, 9]. The sequences
of these two ecotypes are available in public databases, which further
facilitates the design of molecular markers [9]. The most commonly
used molecular markers in Arabidopsis mapping are insertion/
deletion (InDel) markers based on simple sequence length polymorphisms (SSLP), cleaved amplified polymorphic sequences
(CAPS) markers, and derived CAPS (dCAPS) markers based on
single nucleotide polymorphisms (SNP) [10, 11]. These are all
PCR-based markers and thus easy to use and affordable. Many
InDel markers have been developed by different research groups, so
little effort is required to design molecular markers in the post-genome era [6, 9–11].
Mapping the mutation includes rough-mapping and fine-mapping stages [6, 9]. Both processes actually involve similar
procedures, including the following steps: (1) Growing F2 plants.
(2) Observing phenotypes, that is, finding plants with the phenotype if the mpi is recessive or plants without the phenotype if the
mpi is dominant. (3) Finding or designing molecular markers. (4)
Testing these molecular markers. (5) Finding recombinants for the
markers. (6) Determining the mpi mapping interval. When no further markers or recombinants are available in a mapping interval,
the major work is diverted to sequencing the genes within the
interval until the mutation is found. Complementation or recapitulation is then required to confirm that the identified mutation
indeed causes the phenotype of the mpi.
In this chapter, we describe in detail three procedures used in
our lab. The first is the generation of mutants with the EMS
mutagen. The second is how to map and isolate the mutation that
leads to the phenotype of interest in the mpi. The third is complementation and recapitulation. Some steps of these procedures are
fine-tuned and described in the Notes for this section.
2 Materials
2.1 EMS Mutagenesis of Arabidopsis Seeds
1. 2.5 g of Arabidopsis seeds (about 125,000 seeds) [3].
2. Freshly made 20 % bleach.
3. Ethyl methanesulfonate (EMS) stock solution (Sigma M0880).
4. 10 M NaOH.
5. Solid MS medium or 0.1 % agar.
6. Sterilized water.
7. Disposable 50 mL plastic tubes.
8. Micropipette.
9. Parafilm.
10. Rotator.
11. Fume hood.
2.2 Mapping of the mpi Locus
2.2.1 Preparation of the Mapping Population
1. MS medium.
2. Freshly made 20 % bleach.
3. Seeds of the mpi Arabidopsis in ecotype Col-0.
4. Seeds of Arabidopsis ecotype Ler.
5. Micropipette.
6. 1.5 mL sterilized microcentrifuge tubes and tips.
7. Forceps and scissors.
8. Dissecting microscope.
9. Labeling tape.
2.2.2 DNA Preparation Using CTAB
1. CTAB buffer: 2 % (w/v) cetyltrimethylammonium bromide
(CTAB), 100 mM Tris, 20 mM EDTA, and 1.4 M NaCl
(see Note 1) [12].
2. Absolute ethanol and 70 % ethanol (prechilled in a −20 °C
freezer).
3. Chloroform/isoamyl alcohol (24:1).
4. Sterilized ddH2O.
5. 1 % agarose gel, 6× Loading Buffer and 1× TAE buffer.
6. Liquid nitrogen.
7. Sterile 1.5 mL microcentrifuge tubes and tips.
8. Plastic tissue grinding pestles.
9. Micropipette.
10. 65 °C water bath.
11. Microcentrifuge.
12. Vortex mixer.
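The CTAB buffer in item 1 is defined by final concentrations, so the amounts to weigh or pipette scale with the volume prepared. The following is a minimal sketch of that arithmetic. It assumes 1 M Tris–HCl and 0.5 M EDTA stocks (as in Note 1) and a molar mass of 58.44 g/mol for NaCl; the function name is illustrative.

```python
NACL_G_PER_MOL = 58.44  # molar mass of NaCl (assumed value used here)


def ctab_buffer_recipe(final_ml=100.0, tris_stock_m=1.0, edta_stock_m=0.5):
    """Amounts for 2 % (w/v) CTAB, 100 mM Tris, 20 mM EDTA, 1.4 M NaCl."""
    if final_ml <= 0:
        raise ValueError("final volume must be positive")
    litres = final_ml / 1000.0
    return {
        "CTAB_g": 2.0 * final_ml / 100.0,                # 2 g per 100 mL
        "Tris_stock_mL": 0.100 * final_ml / tris_stock_m,  # C1*V1 = C2*V2
        "EDTA_stock_mL": 0.020 * final_ml / edta_stock_m,
        "NaCl_g": 1.4 * litres * NACL_G_PER_MOL,
    }


if __name__ == "__main__":
    for name, amount in ctab_buffer_recipe(100.0).items():
        print(f"{name}: {amount:.2f}")
```

For 100 mL this gives 2 g CTAB, 10 mL Tris stock, 4 mL EDTA stock, and about 8.18 g NaCl, which agrees with Note 1 to within rounding of the molar mass.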
2.2.3 Rough Mapping of the mpi Locus
1. Labels.
2. PCR machine (thermocycler).
3. Sterilized 1.5 mL microcentrifuge tubes.
4. Sterilized PCR plates.
5. PCR reagents including PCR buffer, 2.5 mM dNTPs mixture,
marker primers, Taq DNA polymerase, and sterilized ddH2O.
6. 4 % agarose gel.
7. Agarose gel electrophoresis system.
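The PCR reagents listed above are combined per marker as a master mixture (Subheading 3.2.3, step 4), of which 9 μL is dispensed per well before 1 μL of template is added (step 5). Scaling the mixture to the number of samples can be sketched as follows. The Taq stock concentration (5 U/μL) and the 10 % pipetting overage are assumptions to be adjusted to the enzyme and habits of the lab; the function name is illustrative.

```python
def master_mix(n_samples, taq_units_per_ul=5.0, overage=0.10):
    """Volumes (uL) of each reagent for one marker's PCR master mixture.

    Per reaction: 1 uL 10x buffer, 0.8 uL 2.5 mM dNTPs, 0.1 uL of each
    10 uM primer, 0.5 U Taq, and water to 9 uL (template is added later).
    """
    if n_samples < 1:
        raise ValueError("need at least one sample")
    per_reaction = {
        "10x PCR buffer with MgCl2": 1.0,
        "2.5 mM dNTPs": 0.8,
        "10 uM forward primer": 0.1,
        "10 uM reverse primer": 0.1,
        "Taq DNA polymerase": 0.5 / taq_units_per_ul,  # assumed stock conc.
    }
    water = 9.0 - sum(per_reaction.values())
    if water < 0:
        raise ValueError("reagents exceed 9 uL; check the Taq concentration")
    per_reaction["ddH2O"] = water
    scale = n_samples * (1.0 + overage)  # extra volume for pipetting loss
    return {name: round(vol * scale, 2) for name, vol in per_reaction.items()}


if __name__ == "__main__":
    # 24 selected F2 plants plus the F1 hybrid control
    for name, vol in master_mix(25).items():
        print(f"{name}: {vol} uL")
```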
2.2.4 Fine Mapping of the mpi Locus
1. Labels.
2. PCR machine (thermocycler).
3. Sterilized 1.5 mL microcentrifuge tubes.
4. Sterilized PCR plates.
5. PCR reagents including PCR buffer, 2.5 mM dNTPs mixture,
marker primers, Taq DNA polymerase, and sterilized ddH2O.
6. Specific restriction endonuclease for CAPS marker.
7. Microcentrifuge.
8. A computer connected to the internet.
9. Primer design software.
10. Incubator.
11. 4 % agarose gel.
12. Agarose gel electrophoresis system.
2.3 Complementation and Recapitulation Analysis
1. Plasmid DNA of a plant binary vector containing the CaMV
35S promoter.
2. Competent cells of Agrobacterium tumefaciens strain GV3101
(pMP90).
3. LB broth and agar plates with antibiotics.
4. MS medium, sucrose, Silwet L-77.
5. Selective antibiotics or herbicides, carbenicillin.
6. Sterilized 50 and 1,000 mL flasks.
7. Sterilized 500 mL centrifuge bottles.
8. 28 °C incubator and shaker.
9. MicroPulser™ Electroporation Apparatus (Bio-Rad) or other
electroporator.
10. Ice-cold water bath.
11. Micropipette, microcentrifuge tubes and tips.
12. Microcentrifuge.
13. Silica-gel desiccant.
3 Methods
3.1 EMS Mutagenesis of Arabidopsis Seeds
1. Weigh out 2.5 g of Arabidopsis Col-0 ecotype seeds and put
them into one of the 50 mL disposable plastic tubes (see Note 2).
2. Make 20 % bleach with the sterilized water and add 40 mL
into the tube. Seal the tube with parafilm and rotate for
10–15 min on the rotator. Spin the tube briefly and remove
the bleach solution.
3. Wash the seeds with sterilized water 3–4 times. Add 40 mL
sterilized water. Seal the tube with parafilm and place on the
rotator. Keep rotating overnight at room temperature.
4. Add 120 μL EMS stock solution to the tube to bring the
EMS to a final concentration of 0.3 % (v/v). Continue to rotate the
tube for about 12 h in a fume hood at room temperature (see
Note 3).
5. Remove the EMS solution to a container. Add 4 mL 10 M
NaOH and leave it at room temperature overnight (see Note 4).
6. Wash the seeds eight or more times with sterilized water. Spin
briefly each time to precipitate the seeds and dispose of the water.
7. Plate the seeds on MS medium or mix the seeds with 0.1 %
agar and pipette the mixture of plant seeds into soil. We grow
the plants in trays (see Note 5).
8. Harvest seeds and screen the M2 bulked seeds for the mpi
under a specific condition (see Note 6).
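The volumes in steps 4 and 5 follow from simple dilution arithmetic, which is useful when the EMS concentration is varied within the 0.1–0.3 % range mentioned in Note 3. A minimal sketch, treating the EMS stock as undiluted liquid and ignoring the small volume it adds; the function names are illustrative:

```python
def ems_volume_ul(water_ml=40.0, percent_v_v=0.3):
    """Volume of EMS stock (uL) giving the requested % (v/v) in water_ml."""
    if not 0.0 < percent_v_v < 100.0:
        raise ValueError("percentage must be between 0 and 100")
    return water_ml * 1000.0 * percent_v_v / 100.0


def naoh_final_molar(ems_solution_ml=40.0, naoh_ml=4.0, naoh_stock_m=10.0):
    """Final NaOH concentration (M) after adding stock NaOH to spent EMS."""
    return naoh_stock_m * naoh_ml / (ems_solution_ml + naoh_ml)


if __name__ == "__main__":
    for pct in (0.1, 0.2, 0.3):
        print(f"{pct} % EMS in 40 mL: {ems_volume_ul(40.0, pct):.0f} uL")
    print(f"NaOH during inactivation: {naoh_final_molar():.2f} M")
```

With the default values the sketch returns the 120 μL used in step 4 and a final NaOH concentration of roughly 0.9 M for the inactivation in step 5.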
3.2 Mapping of the mpi Locus
3.2.1 Preparation of the Mapping Population
1. Harvest the seeds from the mpi and put them into a container
with silica-gel desiccant.
2. To generate the mapping population, first allocate about 150
seeds of the mpi (Col-0 background) and Arabidopsis ecotype
Ler. Put the seeds separately into two 1.5 mL sterilized
microcentrifuge tubes. Add 1.4 mL freshly made 20 % bleach
to the tubes and mix for 10–15 min (see Note 7).
3. Wash the seeds 3–4 times with sterilized water and plate the
seeds onto MS medium. After being synchronized for 3 days
at 4 °C, keep the plates at 22 °C under long-day conditions
(16 h of light/8 h of darkness) for 7 days.
4. Transfer the mpi and Ler seedlings into soil and let them grow
at 22 °C under long-day conditions.
5. At the flowering stage, select a healthy inflorescence from the
mpi or Ler plants. Remove the siliques and the opened flowers
with scissors and get rid of the small buds with forceps. Just
keep 1–4 big buds on the inflorescence (see Note 8).
6. Remove the six anthers from the flower buds using the tips of
the forceps very carefully. Mark the inflorescence using colored labeling tape. Put the plant back into the normal growth
conditions (see Note 9).
7. Two days after emasculation, remove an opened flower from
the mpi or Ler plants with the forceps. Carefully rub the stigma
of the emasculated flower from the Ler or mpi plants against
the isolated flower in which mature pollen has been released
from the broken anthers. Label the time of pollination on the
tape.
8. Harvest the F1 seeds and dry them with silica-gel desiccant
(see Note 10).
9. Grow the F1 seeds as described above.
10. Observe the phenotype of the heterozygous mpi F1 plants
to determine whether the mutation is recessive or dominant
(see Note 11).
11. Harvest the leaves from the F1 plants and prepare DNA as
described below. Preserve the DNA at −20 °C for mapping as
described below.
12. To make sure no mistake was made during the cross and that
the F1 plants are in fact hybrids of Col-0 and Ler, perform
PCR using 1 μL DNA from the F1 plants as the template
and test it with two InDel markers as described below
(see Note 12).
13. Harvest the F2 seeds from the correct F1 plants individually.
Dry and preserve the seeds with silica-gel desiccant at 4 °C.
14. Grow the F2 seeds normally as described above. These F2
plants constitute the population to be used for mapping the
mpi locus (see Note 13).
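The trade-off described in Note 13 between the number of F2 plants grown and the achievable resolution can be estimated roughly. The sketch below assumes that one quarter of the F2 plants are selected for genotyping, that each selected plant contributes two informative chromosomes, and that 1 cM corresponds to 250 kb on average (Note 13). These are simplifying assumptions, and local recombination rates vary widely; the function name is illustrative.

```python
def expected_breakpoint_spacing_kb(f2_plants, selected_fraction=0.25,
                                   kb_per_cm=250.0):
    """Rough average distance (kb) between recombination breakpoints.

    1 cM is 1 recombinant per 100 chromosomes, so with n chromosomes the
    expected spacing between breakpoints is 100 * kb_per_cm / n.
    """
    chromosomes = 2.0 * f2_plants * selected_fraction
    if chromosomes <= 0:
        raise ValueError("no informative chromosomes")
    return 100.0 * kb_per_cm / chromosomes


if __name__ == "__main__":
    for n in (600, 2000, 4000):
        spacing = expected_breakpoint_spacing_kb(n)
        print(f"{n} F2 plants: about {spacing:.1f} kb between breakpoints")
```

Under these assumptions, 600 F2 plants give an expected spacing of roughly 80 kb, and 4,000 plants roughly 12 kb. Near the centromere, `kb_per_cm` would have to be set much higher.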
3.2.2 DNA Preparation Using CTAB
1. Observe the phenotypes of the plants in the F2 segregated
mapping population. For DNA preparation, select plants with
the phenotype of interest if the mpi is a recessive mutant or
those without the phenotype of interest if the mpi is a dominant mutant.
2. Harvest about 50–100 mg leaves (or one medium size leaf)
and place into a 1.5 mL microcentrifuge tube. Grind the tissue
to a fine powder in liquid nitrogen using a plastic tissue grinding pestle.
3. Add 400 μL 65 °C preheated 2 % CTAB extraction buffer and
mix well using the pestle.
4. Incubate the microcentrifuge tube in a 65 °C water bath for
10 min to 2 h. Mix every 10–30 min.
5. Add 400 μL of chloroform/isoamyl alcohol (24:1) and vortex
the solution vigorously.
6. Centrifuge at 11,340 × g for 10 min at room temperature.
7. Transfer about 300 μL of the upper aqueous phase carefully to
a new tube (see Note 14).
8. Add 600 μL −20 °C prechilled absolute ethanol to the tube
and mix well by inverting. Place the tube in a −20 °C icebox
for at least 30 min (see Note 15).
9. Centrifuge at 11,340 × g for 10 min. Discard the supernatant.
10. Add 500 μL −20 °C prechilled 70 % ethanol to wash the DNA
pellet for 5–10 min.
11. Centrifuge at 11,340 × g for 10 min. Discard the supernatant
carefully (see Note 16).
12. Dry the DNA pellet by inverting the tube on a paper towel
(see Note 17).
13. Add 100–200 μL sterilized ddH2O to dissolve the DNA.
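The centrifugation steps above are given as relative centrifugal force (11,340 × g), whereas many microcentrifuges are set in rpm. A minimal conversion sketch follows, using the standard relation RCF = 1.118 × 10^-5 × r × rpm^2 with the rotor radius r in cm. The radius in the example is a placeholder assumption and should be replaced by the value for the rotor in use.

```python
import math

RCF_CONSTANT = 1.118e-5  # for radius in cm and speed in rpm


def rcf_to_rpm(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    if rcf <= 0 or radius_cm <= 0:
        raise ValueError("rcf and radius must be positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def rpm_to_rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


if __name__ == "__main__":
    radius = 7.0  # cm; hypothetical rotor radius
    print(f"11,340 x g at r = {radius} cm: {rcf_to_rpm(11340, radius):.0f} rpm")
```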
3.2.3 Rough Mapping of the mpi Locus
1. Observe the phenotypes of the plants from the F2 mapping
population. Calculate the segregation ratio of the phenotype
of interest. For a recessive mpi, the ratio of plants without the phenotype to plants with the phenotype should be about 3:1, whereas for a
dominant mpi it should be about 1:3 (see Note 18).
2. Choose about 24 plants with the phenotype of interest if the
mpi is a recessive mutant or without the phenotype of interest
if the mpi is a dominant mutant for rough mapping. Number
the 24 plants and prepare DNA from these plants as described
above.
3. Select 10 InDel markers distributed on the ten Arabidopsis
chromosome arms for rough mapping (see Note 19).
4. Perform PCR in total reaction volumes of 10 μL. First, make
up the master mixture for each marker in a sterile
1.5 mL microcentrifuge tube and write the marker name on
the tube. Per reaction, the master mixture contains 1 μL of 10× PCR buffer with MgCl2, 0.8 μL
of 2.5 mM dNTPs mixture, 0.1 μL of each 10 μM marker
primer, 0.5 U of Taq DNA polymerase, and ddH2O to
make up to 9 μL; the remaining 1 μL is the template DNA added in the next step. Briefly mix and centrifuge.
5. Allocate 9 μL PCR master mixture to each well of the PCR
plate. Write the marker name on the plate. Add 1 μL
of DNA from the Col-0 and Ler F1 hybrid and from each of the
selected 24 plants to separate wells of the PCR plate. Record
the order of the samples in the wells. Put the PCR plate on ice.
6. Set up the thermocycling program for PCR as follows: 94 °C
for 2 min, 45 cycles of (94 °C for 10 s, 58 °C for 15 s, 72 °C
for 30 s), and 72 °C for 5 min (see Note 20).
7. Place the PCR plate into the block and run the program on
the thermocycler.
8. After it has finished, electrophorese the PCR products on the
4 % agarose gel or store them at −20 °C if not doing this
immediately (see Note 21).
9. Check the success of the PCR and gel electrophoresis by
observing the separated DNA bands from the control Col-0
and Ler F1 hybrid.
10. Calculate the segregation ratios of the markers among the 24
plants. If the segregation ratio of a particular marker is about
1:2:1 (plants from which only the Ler band is amplified/plants
from which both bands are amplified/plants from which only
the Col-0 band is amplified), we can conclude that the marker
is not linked to the mpi locus. If there is a distortion of the
1:2:1 segregation ratio, the marker is possibly linked to the
mpi locus. We name the linked marker “MA” (Fig. 1).
11. Choose another marker (named MB) near marker MA. The
distance between markers MA and MB is about the length of 3–5 BAC
clones (Fig. 1). Perform PCR as described above and calculate
the segregation ratio of marker MB. A distorted segregation
ratio of marker MB confirms that the mpi locus is linked to
markers MA and MB. Count the recombinants,
which include plants from which both bands are amplified and, depending on the background of the mpi, plants from which only the Col-0 band or only the Ler band is amplified. The
number of recombinants demonstrates how closely the marker
is linked to the mpi locus; the fewer the number, the closer the
marker (see Note 22).
12. Taking markers MA and MB as the center, select another 4–8
markers (e.g., MC, MD, ME, MF) distributed evenly on both
sides of the center. The number of markers is selected according
to how closely MA and MB are linked to the mpi locus. Perform
PCR again as described above. Count the recombinants for the additional markers MC, MD, ME, and MF, for
example (Fig. 1).
Fig. 1 Schematic representation of the molecular markers distributed in one chromosome region. Left panel,
distribution of markers used in rough mapping, with the recombinants listed beside the markers. The mapping
interval is the region between marker MA and MC (left). Middle panel, distribution of markers used in the first
round of fine mapping. The mapping interval is the region between marker MAx and MCx (middle). Right panel,
the distribution of markers used in the second round of fine mapping. The mapping interval is the region
between marker MAxx and MCxx (right)
13. Draw a chromosome fragment representation and arrange the
tested markers in the chromosome region. Write the numbers
of the recombinants alongside the marker names (Fig. 1).
Locate the mpi locus between two markers; for instance, MA
and MC, which are used as examples in the following section
(see Note 23).
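The segregation checks in steps 1 and 10 can be made less subjective with a chi-square goodness-of-fit test. The sketch below is one possible way to do this and is not part of the original protocol. Phenotype counts are given as (plants without the phenotype, plants with the phenotype) and marker counts as (Ler band only, both bands, Col-0 band only), following the order used in step 10. The critical values are the standard 5 % thresholds for 1 and 2 degrees of freedom, and all counts in the example are made up.

```python
# Chi-square critical values at p = 0.05 for 1 and 2 degrees of freedom.
CRITICAL_005 = {1: 3.841, 2: 5.991}


def chi_square(observed, ratio):
    """Chi-square statistic for observed counts against an expected ratio."""
    if len(observed) != len(ratio):
        raise ValueError("observed and ratio must have the same length")
    total = sum(observed)
    if total == 0:
        raise ValueError("no plants were scored")
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits_ratio(observed, ratio):
    """True if the counts do not differ significantly (p > 0.05) from ratio."""
    return chi_square(observed, ratio) <= CRITICAL_005[len(observed) - 1]


if __name__ == "__main__":
    # Phenotype in the F2 for a recessive mpi: expect 3:1.
    print("3:1 phenotype fit:", fits_ratio([152, 48], [3, 1]))
    # Marker genotypes among 24 selected plants: expect 1:2:1 if unlinked.
    print("unlinked marker:", fits_ratio([6, 12, 6], [1, 2, 1]))
    print("candidate MA:", not fits_ratio([0, 3, 21], [1, 2, 1]))
```

With only 24 plants the expected counts are small, so a significant distortion is best treated as a hint that is then confirmed with a neighboring marker, as in step 11.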
3.2.4 Fine Mapping of the mpi Locus
1. Keep the DNA of recombinants for markers MA and MC
(see Note 24).
2. Identify about 150 plants for further mapping that display the
phenotype of interest if the mpi is a recessive mutant or those
without the phenotype of interest if the mpi is a dominant
mutant. Number these plants and prepare DNA as described
above (see Note 25).
3. Use markers MA and MC to screen these plants for recombinants. Select recombinants for MA and MC for further analysis
(see Note 26).
4. Design markers between MA and MC. We name them MA1,
MA2, etc., up to MAx and MC1, MC2, etc., up to MCx. The
distribution of the markers should be dense in the middle and
sparse near either MA or MC (Fig. 1).
5. Use the DNA of the recombinants for MA to test MA1, MA2,
etc., until no recombinants are found when testing MAx. Use
the DNA of the recombinants for MC to test MC1, MC2,
etc., until no recombinants are found when testing MCx. The
mpi locus is now narrowed to between MAx and MCx.
6. If not many genes are located between MAx and MCx (Fig. 1),
perform bioinformatic analysis of these genes to find candidates for MPI. Sequence the candidate genes to look for mutations (see Note 27).
7. At the same time, continue to identify more plants displaying the
phenotype of interest if the mpi is a recessive mutant or those
without the phenotype of interest if the mpi is a dominant mutant
to further narrow down the interval in which the mpi locus is
located. Number these plants and prepare DNA from them.
8. Screen the recombinants using the markers MAx and MCx for
further analysis (see Note 28).
9. Design markers between MAx and MCx. We name them
MAx1, MAx2, etc., up to MAxx and MCx1, MCx2, etc., up
to MCxx (Fig. 1).
10. Use these markers to find recombinants until no recombinants
are found by MAxx and MCxx. The mpi locus interval is now
narrowed to between MAxx and MCxx (Fig. 1).
11. If only a few genes are located between MAxx and MCxx
(Fig. 1), the mpi locus interval may be hard to further narrow
down (see Note 29).
12. Sequence all genes between MAxx and MCxx to find the
mutation in the mpi mutant (see Note 30).
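The bookkeeping behind the repeated narrowing of the interval can be sketched in a few lines. The idea, following Note 23, is that recombinants for a distant marker become nonrecombinants at nearer markers on the same side of the locus, so the sets of recombinant plants are nested on each side and share no plants across the locus. The sketch assumes markers are supplied in chromosomal order, that there are no double crossovers or scoring errors, and it reports the innermost markers that still show at least one recombinant on each side. Marker names follow Fig. 1; the plant numbers are invented for illustration.

```python
def flanking_markers(ordered_markers):
    """Return the pair of markers that flank the locus, or None.

    ordered_markers: list of (marker_name, recombinant_plant_ids) tuples in
    chromosomal order. Markers without recombinants are uninformative for
    the boundary and are skipped.
    """
    informative = [(name, set(ids)) for name, ids in ordered_markers if ids]
    for (left, left_ids), (right, right_ids) in zip(informative,
                                                    informative[1:]):
        # Adjacent markers on the same side share recombinant plants;
        # the first pair that shares none must lie on opposite sides.
        if left_ids.isdisjoint(right_ids):
            return left, right
    return None  # all informative markers lie on one side of the locus


if __name__ == "__main__":
    rough = [
        ("MG", {2, 7, 11, 24}),
        ("MB", {2, 7, 24}),
        ("MA", {7}),
        ("MC", {15}),
        ("ME", {15, 19}),
    ]
    print(flanking_markers(rough))
```

When new markers between the current flanking markers are tested on the stored recombinant DNA, the same function can be called again on the extended list to obtain the narrowed interval.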
3.3 Complementation and Recapitulation Analysis
1. Clone the wild-type gene corresponding to the mutated gene
if the mpi is a recessive mutant or clone the mutated gene if
the mpi is a dominant mutant. Generate binary constructs in
which the wild-type or mutated gene is driven by the CaMV
35S promoter (see Note 31).
2. Prepare A. tumefaciens strain GV3101 harboring the plasmid
construct. Transform the wild-type gene into the mpi mutants
for complementation if the mpi is a recessive mutant or transform the mutated gene into wild-type plants for recapitulation if the mpi is a dominant mutant.
3. Harvest the T0 seeds. Screen for transformants on 1/2 MS
containing the appropriate selective antibiotic or herbicide.
4. Observe the phenotype of the transformants of the T1
generation. In complementation analysis, if the transformants
recover a phenotype similar to that of the wild type, the mutant
is complemented by the wild-type gene. In recapitulation
analysis, if the transformants display a phenotype similar to
that of the mutant, the dominant mutant phenotype is caused
by the mutation (see Note 32).
4 Notes
1. To prepare 100 mL 2 % CTAB extraction buffer, add 2 g
CTAB, 10 mL 1 M Tris–HCl pH 8.0, 4 mL 0.5 M EDTA pH
8.0, and 8.19 g NaCl and add water to a final volume of
100 mL. Sterilize and store at room temperature. Preheat at
65 °C and add 0.2–0.5 % β-mercaptoethanol before use.
2. As few as 5,000 seeds can be used for mutagenesis since EMS
solution causes multiple point mutations in each plant [9]. We
usually treat 125,000 seeds at a time.
3. The EMS concentration may be modified in the range of 0.1–
0.3 %. Higher EMS concentrations may lead to higher rates of
mutation, but at the same time cause more lethal mutations.
EMS is a hazardous chemical, so the EMS mutagenesis should
be carried out in a fume hood.
4. The EMS solution is hazardous. After mutagenesis, EMS solution must be deactivated with NaOH. The NaOH-treated
EMS solution should be disposed of down the fume hood sink
regularly.
5. The trays are placed at 22 °C under 16 h light and 8 h dark.
If fewer seeds were mutated, the seeds may be plated on MS
medium. The green seedlings are then transferred to soil.
6. About 3–5 g of M2 seeds may be harvested from one tray.
If fewer seeds were treated, the seeds may be harvested individually. The M2 seeds can be stored at room temperature for
up to 1 year with silica-gel desiccant. For long-term storage,
the seeds may be stored at 4 °C with silica-gel desiccant.
7. Different Arabidopsis ecotypes are crossed to make the F2
mapping population because their sequences are divergent in
nature [6]. Both Col-0 and Ler genomes have been sequenced
and are available on the internet. The Col-0 sequence is accessible on the NCBI website. The Ler sequence is accessible on
the TAIR website (http://www.arabidopsis.org/browse/Cereon/index.jsp).
Col-0 and Ler genomes have abundantly
divergent sequences that differ in about 4–11 positions every
1 kb [8, 9]. This abundance of sequence differences facilitates
the design of sufficiently dense molecular markers for mapping.
Thus, we usually cross Col-0 with Ler for the generation of
the mapping population.
8. Select flower buds as big as possible for crossing as long as the
anthers are not broken to release mature pollen. Flower buds
that are too small are hard to manipulate. Removing the
siliques and opened flowers prevents interference with the
crossed siliques and also lets more nutrients flow to the crossed
seeds.
9. If using Ler as the mother plant, remove the six anthers from
the flower bud directly under a dissecting microscope because,
in this ecotype, the anthers are not enclosed by petals and
sepals. If using the mpi (Col-0 background) as the mother
plant, because the petals and sepals enclose the anthers, first
press the tip of flower bud to open it and then get rid of the
six anthers. Be careful not to damage the gynoecium.
10. Usually, the seeds can be harvested 2 weeks after pollination.
11. Determining whether the mpi is recessive or dominant is critical to the next step of mapping. In the F1 generation, if the
heterozygous plants display phenotypes similar to the mpi, it is
dominant. If the heterozygous F1 plants display no phenotype, it is recessive.
12. DNA from F1 plants can also be used as a control in testing
the mapping markers with PCR. At least two bands are amplified from the DNA of F1 plants. When running the PCR
products, we just load the amplified products from the DNA
of F1 plants instead of DNA size markers during mapping.
13. The recombination rates vary in different regions of the
genome. In Arabidopsis, 1 cM equates to a physical distance of
100–400 kb, with an average of 250 kb [6]. However, in the
centromere region, 1 cM equates to about 1,000–2,500 kb
[13]. Therefore, it is hard to determine how many F2 plants
should be grown for mapping. It becomes a balance between
time and labor. If time is critical in determining the mpi
locus, we need to grow enough plants (2,000–4,000 plants)
for mapping to save time. If labor and space are a problem,
we may grow about 600 plants first to save labor and space.
14. Contamination by traces of chloroform affects the PCR. To
avoid contamination, dispose of about 100 μL of the aqueous
phase and transfer only 300 μL to the new tube.
15. It is not necessary to add sodium acetate to the aqueous phase
before using ethanol to precipitate the DNA because of the
NaCl in the CTAB extraction buffer.
16. Discard the supernatant gently to prevent losing the DNA
pellet.
17. Do not let the DNA pellet overdry. Overdried DNA pellets
are hard to dissolve.
18. Only if the segregation ratio meets Mendel’s principles can we
map the mpi locus as described below.
19. Molecular markers based on the sequence divergences between
the Arabidopsis Col-0 and Ler ecotypes are essential for
mapping. The most widely used molecular markers during
mapping are InDels, SSLPs, CAPS, and dCAPS [6, 9–11].
The advantages of these markers for mapping are as follows. First, they are all PCR-based markers and thus easy to
use and affordable. The most convenient markers are InDels
because they require only ordinary PCR and separation of
products on a high-concentration agarose gel. CAPS markers
need an additional enzyme digestion step between PCR and
running the gel. dCAPS markers are the same as CAPS markers except that sometimes the PCR products are not so specific
because of existing mismatches in the primers. Second, they
are codominant markers. Different products are amplified
from the chromosomes of Col-0 and Ler. These distinct products can be differentiated on an agarose gel. Many InDel
markers have been developed by different research groups, so
little effort is required to design primers for rough mapping.
For example, 25 InDel markers were recommended by our
group for rough mapping [10]. These InDel markers are easy
to use and distributed evenly across the five chromosomes of
Arabidopsis. As the mapping interval is narrowed, finding and
designing good markers becomes important for further
mapping. The “Monsanto Arabidopsis Polymorphism and Ler
Sequence Collections” on the TAIR website (http://www.arabidopsis.org/browse/Cereon/index.jsp)
support this process [6]. Sometimes we need to find sequence differences
between Col-0 and Ler manually using the BLAST software.
20. Usually 30 cycles is enough to amplify products, but some
primers have low amplification efficiency. Thus, we run the
PCR for 45 cycles when testing markers.
21. A 4 % agarose gel is a rather high-concentration gel. To prepare
the gel, weigh out 4 g agarose powder and put it into a 500 mL
flask containing 100 mL cold TAE buffer. Heat the flask and
agitate the solution until the powder is completely dissolved.
The gel can be reused. The 3 % high-resolution Metaphor gel
is also a good choice for separating PCR products.
22. When calculating the recombinants, the plants from which
both bands are amplified are definitely recombinants. However,
when calculating the one band recombinants, we need to factor in different conditions. If the mpi is a recessive mutant in
the Col-0 background or a dominant one in the Ler background, the plants from which only the Ler band is amplified
are recombinants. If it is a dominant mutant in the Col-0
background or a recessive one in the Ler background, then
the plants from which only the Col-0 band is amplified are
recombinants.
23. When the recombinants are obtained, we locate the mpi locus
between the two markers that have the fewest recombinants.
If the mapping is accurate, the number of recombinants for
the markers at either of the two ends of the mpi locus will
decrease. That is, considering a marker distributed at one end
of the mpi locus, the recombinants for a marker far from the
mpi locus may become nonrecombinants for the marker near
the mpi locus. For example, as shown in Fig. 1, from marker
MG to MA, the number of recombinants decreases; and from
ME to MC, the number of recombinants also decreases, so the
mpi locus is located between markers MA and MC. The No. 2
and No. 24 plants are recombinants for MB, but are not
recombinants for MA.
24. The recombinants for the farther markers are always useful for
narrowing down the mapping interval when testing the nearer
markers, until they become nonrecombinants when testing the
nearest marker.
25. When no recombinants are usable, we need to screen for more
recombinants to narrow down the mapping interval using the
nearest marker.
26. If the nearest marker is not easy to use, we can perhaps use a
more convenient marker neighboring the nearest marker for
the screening. We can then use the recombinants to test the
nearest marker.
27. Sometimes a known mutant displays a similar phenotype to
the mpi and the known gene is in the mapping interval, in
which case we first need to sequence the gene and determine
if the mpi is an allele of the known mutant. Sometimes
the domain and structure of proteins encoded by genes in the
mapping interval plus the phenotype of the mpi may tell us
which is the most probable candidate gene.
28. This is another round to narrow down the mapping interval.
29. When the mapping interval becomes smaller, it becomes harder
to find a useful recombinant and more plants may need to be
included. It is also harder to find a good marker. In this situation, the majority of the work may be diverted to sequencing
genes in the mapping interval.
30. The coding regions of the most probable candidate genes, as judged
from publications, the characteristics
of the genes, and the phenotype of the mpi, are sequenced first. If no mutations are
found, the coding regions of less probable genes are sequenced.
Noncoding regions may then be sequenced if mutations are still
not found. Of the mutations induced by EMS, 99 % are C/G to
T/A substitutions [3, 4].
31. The genomic sequence of the wild-type gene for the recessive
mutant and the mutated gene for the dominant mutant,
including the coding region, promoter region, and 3′ UTR
region, may alternatively be used for complementation.
32. Like the analysis of T-DNA insertion mutants, complementation or recapitulation assays provide the most convincing
evidence in determining if the mutated gene causes the phenotype of the mpi.
References
1. Page DR, Grossniklaus U (2002) The art and
design of genetic screens: Arabidopsis thaliana.
Nat Rev Genet 3:124–136
2. Peters JL, Cnudde F, Gerats T (2003) Forward
genetics and map-based cloning approaches.
Trends Plant Sci 8:484–491
3. Kim Y, Schumaker KS, Zhu JK (2006) EMS
mutagenesis of Arabidopsis. Methods Mol Biol
323:101–103
4. Greene EA et al (2003) Spectrum of chemically induced mutations from a large-scale
reverse-genetic screen in Arabidopsis. Genetics
164:731–740
5. Jander G et al (2003) Ethylmethanesulfonate
saturation mutagenesis in Arabidopsis to determine frequency of herbicide resistance. Plant
Physiol 131:139–146
6. Lukowitz W, Gillmor CS, Scheible WR (2000)
Positional cloning in Arabidopsis. Why it feels
good to have a genome initiative working for
you. Plant Physiol 123:795–805
7. Zhang Y, Glazebrook J, Li X (2007) Identification of components in disease-resistance
signaling in Arabidopsis by map-based cloning.
Methods Mol Biol 354:69–78
8. Hardtke CS, Muller J, Berleth T (1996) Genetic
similarity among Arabidopsis thaliana ecotypes
estimated by DNA sequence comparison. Plant
Mol Biol 32:915–922
9. Jander G et al (2002) Arabidopsis map-based
cloning in the post-genome era. Plant Physiol
129:440–450
10. Hou X et al (2010) A platform of high-density
INDEL/CAPS markers for map-based cloning
in Arabidopsis. Plant J 63:880–888
11. Pacurar DI et al (2012) A collection of
INDEL markers for map-based cloning in seven
Arabidopsis accessions. J Exp Bot 63:2491–2501
12. Clarke JD (2009) Cetyltrimethyl ammonium
bromide (CTAB) DNA miniprep for plant
DNA isolation. Cold Spring Harb Protoc, pdb
prot5177. doi: 10.1101/pdb.prot5177
13. Copenhaver GP, Browne WE, Preuss D (1998)
Assaying genome-wide recombination and
centromere functions with Arabidopsis tetrads.
Proc Natl Acad Sci U S A 95:247–252
Chapter 13
Generation and Characterization of Arabidopsis
T-DNA Insertion Mutants
Li-Jia Qu and Genji Qin
Abstract
Transfer DNA (T-DNA) insertion mutants are often used in forward and reverse genetics to reveal the
molecular mechanisms of a particular biological process in plants. To generate T-DNA insertion mutants,
T-DNA must be inserted randomly in the genome through transformation mediated by Agrobacterium
tumefaciens. During generation of a T-DNA insertion mutant, Agrobacterium competent cells are first
prepared and plasmids containing the T-DNA introduced into Agrobacterium cells. Agrobacterium
containing T-DNA vectors are then used to transform T-DNA into Arabidopsis. After screening and identifying T-DNA insertion mutants with interesting phenotypes, genomic DNA is extracted from the
mutants and used to isolate the T-DNA flanking sequences. To finally determine the mutated genes causing the specific phenotype in the T-DNA insertion mutants, cosegregation analysis and complementation
or recapitulation analysis are needed. In this chapter, we describe detailed protocols for generation and
characterization of T-DNA insertion mutants.
Key words T-DNA insertion mutant, Floral dip, TAIL-PCR, Cosegregation, Complementation,
Recapitulation
1 Introduction
Transfer DNA (T-DNA) insertion mutants are widely used to elucidate
gene functions in genetic analyses of Arabidopsis. One advantage of
T-DNA mutagenesis is that the known T-DNA element can be used
as a possible tag when it disrupts or activates genes. The tag sequence
constitutes an easy tool to identify the gene defined by the T-DNA
mutation through isolating the adjacent genomic sequence without
painstaking mapping procedures. Another advantage of T-DNA
mutagenesis is that the T-DNA can carry additional elements, such as
multiple copies of the Cauliflower mosaic virus (CaMV) 35S promoter and reporter genes (e.g., GUS and GFP), allowing the generation of activation-tagging, promoter-trap, and enhancer-trap lines.
These lines may be used to determine the function of redundant genes
and to identify genes displaying specific expression patterns.
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_13, © Springer Science+Business Media New York 2014
Agrobacterium-mediated plant transformation has been used
to create T-DNA insertion mutants in Arabidopsis that have
proved highly useful in forward and reverse genetics. Different
methods to generate T-DNA insertion mutants and to identify the
corresponding mutated genes have been reported in publications
and websites [1–17]. Here, we provide detailed procedures for the
method used routinely in our laboratory. Some steps of these protocols have been fine-tuned for more efficient operation; these modifications are described in the Notes section of this chapter.
2 Materials
2.1 Preparation of Agrobacterium tumefaciens Containing T-DNA Vector
2.1.1 Preparation of A. tumefaciens Competent Cells
1. Agrobacterium strain GV3101 (pMP90) glycerol stock.
2. Luria Bertani (LB) broth and agar plates supplemented with
antibiotics.
3. 10 mg/mL rifampicin and 50 mg/mL gentamicin.
4. 1,000 mL sterile, ice-cold 10 % glycerol (v/v).
5. Liquid nitrogen.
6. Sterile 50 and 1,000 mL flasks.
7. Sterile 500 mL centrifuge bottles.
8. Ice-water bath.
9. Sterile 1.5 mL microcentrifuge tubes and tips.
10. Pipettes.
11. 28 °C Incubator and shaker.
12. Spectrophotometer.
13. Refrigerated centrifuge.
2.1.2 Transformation of A. tumefaciens Competent Cells
1. Agrobacterium competent cells.
2. About 30 ng/μL T-DNA plasmid DNA.
3. LB liquid medium and agar plates with appropriate antibiotics.
4. Sterile ddH2O.
5. Prechilled cuvettes for electroporation.
6. Paper towel.
7. MicroPulser™ Electroporation Apparatus (Bio-Rad) or other
electroporators.
8. 28 °C incubator and shaker.
9. Pipettes.
10. 1.5 mL sterile microcentrifuge tubes and tips.
11. Ice-cold water bath.
Arabidopsis T-DNA Mutants
2.2 Transformation of Arabidopsis
1. Agrobacterium strain GV3101 (pMP90) containing appropriate
T-DNA vector.
2. LB broth.
3. Murashige and Skoog (MS) medium.
4. Sucrose.
5. Silwet L-77.
6. Selective antibiotics or herbicides, carbenicillin.
7. Silica-gel desiccant.
8. Sterile Petri dishes, 1,000 mL flask.
9. Centrifuge, 500 mL centrifuge bottles.
10. Plant soils and fertilizer.
11. Plant pots and trays.
12. Growth chambers and greenhouse maintained at 22 °C with
16 h light and 8 h dark photoperiod.
2.3 Identification of T-DNA Insertion Site
2.3.1 Preparation of Genomic DNA from T-DNA Transgenic Plants
1. CTAB buffer: 2 % (w/v) cetyltrimethylammonium bromide
(CTAB), 100 mM Tris, 20 mM EDTA, and 1.4 M NaCl
(see Note 1).
2. Absolute ethanol and 70 % ethanol (prechilled in −20 °C
refrigerator).
3. Chloroform/isoamyl alcohol (24:1).
4. Sterile ddH2O.
5. 1 % Agarose gel, 6× loading buffer, and 1× TAE buffer.
6. Liquid nitrogen.
7. Sterile 1.5 mL microcentrifuge tubes and tips.
8. Plastic tissue-grinding pestles.
9. Micropipette.
10. 65 °C water bath.
11. Microcentrifuge.
12. Vortex mixer.
13. Agarose gel electrophoresis system.
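The CTAB buffer composition in item 1 can be scaled to any final volume with simple arithmetic. The Python sketch below is illustrative only: it uses the stock solutions named in Note 1 (1 M Tris–HCl and 0.5 M EDTA), and the NaCl molar mass of 58.44 g/mol is an assumed value, which is why the NaCl mass for 100 mL comes out as about 8.18 g rather than the 8.19 g given in Note 1.

```python
# Sketch: reagent amounts for 2 % CTAB extraction buffer at a chosen volume.
# Stock concentrations follow Note 1; the NaCl molar mass is an assumption.
NACL_G_PER_MOL = 58.44


def ctab_buffer(final_ml):
    """Return the amount of each component needed for final_ml of buffer."""
    return {
        "CTAB (g)": 0.02 * final_ml,                         # 2 % (w/v)
        "1 M Tris-HCl pH 8.0 (mL)": 100 / 1000 * final_ml,   # 100 mM final
        "0.5 M EDTA pH 8.0 (mL)": 20 / 500 * final_ml,       # 20 mM final
        "NaCl (g)": 1.4 * NACL_G_PER_MOL * final_ml / 1000,  # 1.4 M final
    }


if __name__ == "__main__":
    for name, amount in ctab_buffer(100).items():
        print(f"{name}: {amount:.2f}")
```

Water is then added to the final volume, as described in Note 1.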
2.3.2 Identification of T-DNA Insertion Site by TAIL-PCR
1. Genomic DNA from T-DNA insertion mutants.
2. T-DNA-specific primers: 1.5 μM LS1 for primary reaction,
2.0 μM LS2 for secondary reaction, 2.5 μM LS3 for tertiary
reaction, and 2.5 μM LS4 for sequencing PCR product (see
Note 2).
3. 20 μM arbitrary degenerate (AD) primers (see Note 3).
4. Taq DNA polymerase, 10× PCR buffer (MgCl2-free), 25 mM
MgCl2, and dNTPs mixture with 2.5 mM each of dATP, dCTP,
dGTP, and dTTP.
5. Agarose gel.
6. Gel purification kit.
7. PCR thermocycler.
8. Pipettes.
9. Sterile PCR plates, microcentrifuge tubes and tips.
10. Agarose gel electrophoresis system.
11. Microcentrifuge.
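Note 3 describes the AD primers as 15–16 nucleotides long with 64- to 256-fold degeneracy. As a hedged illustration, the Python sketch below counts the positions and multiplies the number of alternatives at each degenerate position (4 for N, 2 for a two-base mix such as (A/T)). The function and variable names are hypothetical; the three sequences are AD1, AD2, and AD3 as listed in Note 3.

```python
import re

# Sketch: length and fold degeneracy of AD primers written in the notation
# of Note 3, where N is any base and (X/Y) is a mix of two bases.
TOKEN = re.compile(r"\([ACGT]/[ACGT]\)|[ACGTN]")


def primer_stats(seq):
    """Return (length in nucleotides, fold degeneracy) for one primer."""
    tokens = TOKEN.findall(seq.replace(" ", "").upper())
    degeneracy = 1
    for tok in tokens:
        if tok == "N":
            degeneracy *= 4      # any of four bases
        elif tok.startswith("("):
            degeneracy *= 2      # one of two bases
    return len(tokens), degeneracy


AD_PRIMERS = {
    "AD1": "NTC GA(G/C) T(A/T)T (G/C)G (A/T)G TT",
    "AD2": "NGT CGA (G/C)(A/T)G ANA (A/T)G AA",
    "AD3": "(A/T)GT GNA G(A/T)A NCA NAG A",
}

if __name__ == "__main__":
    for name, seq in AD_PRIMERS.items():
        length, fold = primer_stats(seq)
        print(f"{name}: {length} nt, {fold}-fold degenerate")
```

The same function can be applied to a newly designed AD primer to confirm that it stays within the length and degeneracy range given in Note 3.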
2.4 Cosegregation Analysis
1. Genomic DNA of plants from F2 segregation population.
2. Specific primers: 10 μM P1 and P2 designed from genomic
sequence flanking the T-DNA insert.
3. Taq DNA polymerase, 10× PCR buffer (MgCl2-free), 25 mM
MgCl2, and dNTPs mixture with 2.5 mM each of dATP, dCTP,
dGTP, and dTTP.
4. Agarose gel.
5. PCR thermocycler.
6. Pipettes.
7. Sterile PCR plates, microcentrifuge tubes and tips.
8. Agarose gel electrophoresis system.
9. Microcentrifuge.
2.5 Complementation and Recapitulation Analysis
1. Plasmid DNA of plant binary vector containing CaMV 35S
promoter.
2. Competent cells of Agrobacterium strain GV3101 (pMP90).
3. LB broth and agar plates with antibiotics.
4. MS medium, sucrose, Silwet L-77.
5. Selective antibiotics or herbicides, carbenicillin.
6. Sterile 50 and 1,000 mL flasks.
7. Sterile 500 mL centrifuge bottles.
8. 28 °C incubator and shaker.
9. MicroPulser™ Electroporation Apparatus (Bio-Rad) or other
electroporators.
10. Ice-cold water bath.
11. Pipettes, microcentrifuge tubes and tips.
12. Microcentrifuge.
13. Silica-gel desiccant.
14. Sterile Petri dishes.
15. Plant soils and fertilizer, plant pots, and trays.
16. Growth chambers and greenhouse at 22 °C with a 16 h light
and 8 h dark photoperiod.
3 Methods
3.1 Preparation of Agrobacterium tumefaciens Containing T-DNA Vector
3.1.1 Preparation of A. tumefaciens Competent Cells
1. Streak the A. tumefaciens strain GV3101 (pMP90) glycerol
stock on the LB plate supplemented with 10 μg/mL rifampicin and 50 μg/mL gentamicin and incubate the plate at 28 °C
for 2 days (see Note 4).
2. Transfer a single colony to 5 mL LB broth supplemented with
10 μg/mL rifampicin and shake the culture at 220 rpm in a
28 °C incubator for 1 day (see Note 5).
3. Add the 5 mL culture to 500 mL LB broth supplemented
with 10 μg/mL rifampicin and incubate it with shaking at
220 rpm and 28 °C overnight to an OD600 of 0.5–0.8
(see Note 6).
4. Decant the 500 mL culture into two sterile 500 mL centrifuge
bottles for balance and centrifuge the bottles at 3,000 × g for
15 min at 4 °C (see Note 7).
5. Carefully pour off the supernatant and add 250 mL sterile, ice-cold 10 % glycerol. Shake the bottles by hand in an ice-water
bath to resuspend the cell pellets.
6. Pellet the cells again by centrifugation at 3,000 × g for 15 min
at 4 °C and discard the supernatant.
7. Add 250 mL sterile, ice-cold 10 % glycerol and resuspend the
cell pellets again in the ice-water bath.
8. Repeat step 6.
9. Add 2 mL sterile, ice-cold 10 % glycerol to each pellet and
resuspend it in the ice-water bath.
10. Transfer the cell suspension to prechilled 1.5 mL Eppendorf
tubes and put these tubes on ice.
11. Dispense 40 μL of the competent cells into the prechilled
microcentrifuge tubes on the ice.
12. Freeze the cells in the liquid nitrogen and then store at −70 °C
(see Note 8).
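The antibiotic additions and the number of aliquots in this procedure follow from the stock concentrations listed in Subheading 2.1.1 and the volumes in steps 2, 3, 9, and 11. The Python sketch below works through this arithmetic under those figures; the function names are hypothetical.

```python
# Sketch: antibiotic stock volumes (C1*V1 = C2*V2) and the aliquot count.
def stock_volume_ul(stock_mg_per_ml, working_ug_per_ml, culture_ml):
    """Volume of stock solution (uL) to add to culture_ml of medium."""
    stock_ug_per_ml = stock_mg_per_ml * 1000
    return working_ug_per_ml * culture_ml / stock_ug_per_ml * 1000


def aliquot_count(pellets, resuspension_ml, aliquot_ul):
    """Number of full aliquots obtained from the resuspended pellets."""
    return int(pellets * resuspension_ml * 1000 // aliquot_ul)


if __name__ == "__main__":
    # 10 mg/mL rifampicin stock, 10 ug/mL working concentration
    print("Rifampicin for 5 mL culture (uL):", stock_volume_ul(10, 10, 5))
    print("Rifampicin for 500 mL culture (uL):", stock_volume_ul(10, 10, 500))
    # Two pellets, each resuspended in 2 mL, dispensed as 40 uL aliquots
    print("Aliquots of competent cells:", aliquot_count(2, 2, 40))
```

Both stocks are 1,000-fold concentrated relative to the working concentrations, so the same calculation applies to the 50 mg/mL gentamicin stock used in the plates.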
3.1.2 Transformation of A. tumefaciens Competent Cells by Electroporation
1. Prepare the plasmid of the T-DNA vector and adjust the concentration to about 30 ng/μL (see Note 9).
2. Remove the A. tumefaciens competent cells from the −70 °C
freezer and thaw on ice.
3. Add 0.5 mL LB to 1.5 mL sterile Eppendorf tube and mark
with the vector name. Prechill a 0.1 cm electroporation cuvette
on ice (see Note 10).
4. Mix 1 μL plasmid with the A. tumefaciens competent cells by
pipetting up and down and put on ice for 5 min.
5. While waiting, set the MicroPulser™ Electroporation Apparatus
to the “Agr” preprogram (the voltage for A. tumefaciens is
2.2 kV).
6. Add the mixture to the prechilled electroporation cuvette and
wipe the outsides of the cuvette with the paper towel to absorb
the condensation.
7. Put the cuvette in the chamber slide and push the slide into the
chamber until the cuvette is seated between the contacts in the
base of the chamber. Press the “Pulse” button once and a beep
sound will be heard.
8. Remove the cuvette from the chamber and add the prepared
0.5 mL LB broth to the cuvette immediately (see Note 11).
9. Pipette up and down and transfer the A. tumefaciens solution
back to the 1.5 mL sterile Eppendorf tube (see Note 12).
10. Incubate the tube at 28 °C with shaking at 220 rpm for 3–4 h
to allow cell recovery.
11. Plate 30–50 μL A. tumefaciens on the LB agar media containing the antibiotic for selection of the target T-DNA vector.
Place the plate in the 28 °C incubator for about 2–3 days (see
Note 13).
12. Add 1 mL LB broth containing selective antibiotic to 1.5 mL
tubes and mark the tubes. Transfer several single colonies into
these tubes, respectively. Incubate these tubes at 28 °C with
shaking at 220 rpm for 2 days.
13. Perform the PCR analysis with 1 μL A. tumefaciens culture as
template to verify the existence of the T-DNA plasmid in the
positive colonies (see Note 14).
14. Add the cultures of all positive colonies containing the target
T-DNA plasmid to 500 mL LB broth supplemented with
appropriate antibiotics. Incubate the cultures at 28 °C with
shaking at 220 rpm for 1–2 days (see Note 15).
15. Collect the A. tumefaciens cells by centrifugation for plant
transformation.
3.2 Transformation of Arabidopsis by Floral Dip Method
1. Grow 12 plants per pot (8 cm × 8 cm) in the tray at 22 °C with a
16 h light and 8 h dark photoperiod. Spray with liquid fertilizer
every week to obtain healthy Arabidopsis plants (see Note 16).
2. Prepare the A. tumefaciens cells containing the target
T-DNA plasmid as described in step 14 of Subheading 3.1.2
(see Note 17).
3. Pellet the cells by centrifugation at 3,000 × g for 15 min at
room temperature and discard the supernatant.
4. Resuspend the pellets in 250 mL solution containing half-strength MS salts plus 5 % sucrose and 0.03 % Silwet L-77
surfactant. Pour the suspension into a container such as a
Petri dish for floral dipping (see Note 18).
5. Select healthy plants with many unopened flower buds for transformation. Immerse all inflorescences in the A. tumefaciens cell
suspension for 5–15 min, making sure that all flower buds are
submerged in the suspension (see Note 19).
6. Put the dipped plants in a deep tray. Cover the tray with a
transparent glass cover to maintain the high humidity for about
24 h (see Note 20).
7. Remove the cover the following day. Water the plants from the
bottom of the tray and transfer them to the greenhouse at
22 °C with a 16 h light and 8 h dark photoperiod.
8. Water and take care of the dipped plants. Stop watering when
most siliques of the dipped plants become yellow (see Note 21).
9. Harvest the seeds from the dipped plants using the sieve mesh
and put the seeds into the 1.5 mL microcentrifuge tubes. Add
some silica-gel desiccant into the tubes to dry the seeds (see
Note 22).
10. Prepare selection plates containing 1/2 MS medium plus
selective antibiotics or herbicides. Sterilize the seeds routinely
and spread them on the plates (see Note 23).
11. Incubate the plates at 4 °C for 2–3 days for synchronization.
Transfer the plates to the growth chamber at 22 °C with a 16 h
light and 8 h dark photoperiod for 7–10 days.
12. Transfer putative T1 transformants with green cotyledons and
leaves to soil and grow them in the greenhouse at 22 °C with
a 16 h light and 8 h dark photoperiod.
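The amounts needed for the dipping solution in step 4 can be worked out as follows. This Python sketch is only an illustration under stated assumptions: the value of 4.3 g/L for full-strength MS basal salts is an assumed figure that should be checked against the supplier's label, and the percentages are taken as w/v for sucrose and v/v for Silwet L-77.

```python
# Sketch: component amounts for the floral dip solution of step 4.
MS_FULL_G_PER_L = 4.3  # assumed full-strength MS salt mixture, g/L


def dip_solution(volume_ml, ms_strength=0.5, sucrose_pct=5.0, silwet_pct=0.03):
    """Return amounts for volume_ml of dipping solution."""
    return {
        "MS salts (g)": MS_FULL_G_PER_L * ms_strength * volume_ml / 1000,
        "sucrose (g)": sucrose_pct / 100 * volume_ml,              # % (w/v)
        "Silwet L-77 (uL)": silwet_pct / 100 * volume_ml * 1000,   # % (v/v)
    }


if __name__ == "__main__":
    for name, amount in dip_solution(250).items():
        print(f"{name}: {amount:.2f}")
```

Keeping the surfactant concentration as an explicit parameter makes it easy to recalculate if a lower concentration is chosen to protect the inflorescences (see Note 18).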
3.3 Identification of T-DNA Insertion Site by TAIL-PCR Method
3.3.1 Preparation of DNA from T-DNA Transgenic Plants with the CTAB Method
1. Screen the T-DNA transformants for insertion mutants with an interesting phenotype (see Note 24).
2. Harvest about 50–100 mg leaves (or one medium-sized leaf)
and place in a 1.5 mL microcentrifuge tube. Grind the tissues to a fine powder in liquid nitrogen using a plastic tissue-grinding pestle.
3. Add 400 μL 65 °C preheated 2 % CTAB extraction buffer and
mix well using the pestle.
4. Incubate the microcentrifuge tube in the 65 °C water bath for
10 min to 2 h. Mix every 10–30 min.
5. Add 400 μL of chloroform/isoamyl alcohol (24:1) and vortex
the solution vigorously.
6. Centrifuge at 11,340 × g for 10 min at room temperature.
7. Transfer about 300 μL of the upper aqueous phase carefully to
a new tube (see Note 25).
8. Add 600 μL −20 °C prechilled absolute ethanol to each tube
and mix well by inverting the tubes. Place the tubes in a −20 °C
freezer for at least 30 min (see Note 26).
9. Centrifuge at 11,340 × g for 10 min. Discard the supernatant.
10. Add 500 μL −20 °C prechilled 70 % ethanol to wash the DNA
pellets for 5–10 min.
11. Centrifuge at 11,340 × g for 10 min. Discard the supernatant
carefully (see Note 27).
12. Dry the DNA pellets by inverting the tubes on the paper towel
(see Note 28).
13. Add 20–50 μL sterile ddH2O to dissolve the DNA (see Note 29).
14. Run an agarose gel to determine the quality and quantity of
the DNA. Choose the DNA with good quality for TAIL-PCR
(see Note 30).
3.3.2 Identification of T-DNA Insertion Site by TAIL-PCR
1. Thaw 10× PCR buffer (MgCl2-free), 25 mM MgCl2, 2.5 mM
each dNTPs mixture, and 1.5 μM LS1 and 20 μM AD primers
by holding the tubes in the hand. Keep the solutions on ice after thawing.
2. Perform the primary reaction in a total volume of 10 μL. Add
1 μL of different DNA samples (about 20 ng/μL) to each PCR
tube/well of the PCR plate when obtaining flanking sequences
from multiple T-DNA lines with one AD primer. Add 1 μL of
different AD primers in each tube/well when obtaining flanking sequences from one mutant of interest with different AD
primers. Put the PCR plate/tubes on ice (see Note 31).
3. Prepare a master mixture of the primary reaction in a sterile
1.5 mL microcentrifuge tube. Each reaction contains the following reagents in the mixture: 1 μL of 10× PCR buffer,
0.8 μL of 25 mM MgCl2, 0.8 μL dNTPs mixture, 1.0 μL of
1.5 μM LS1, 1.0 μL of 20 μM AD primers, and 0.5 U of Taq
DNA polymerase; add ddH2O to 10 μL. Briefly mix and centrifuge (see Note 32).
4. Add 9 μL of master mixture to each tube/well. Add a drop of
paraffin oil using pipettes to each tube/well. Briefly mix and
centrifuge. Place the plate/tubes on ice.
5. Program the thermocycler for the primary reaction as follows:
92 °C for 3 min and 95 °C for 30 s; 5 cycles of 94 °C for 30 s,
65 °C for 1 min, and 72 °C for 2 min; 94 °C for 30 s, 25 °C for
2 min, ramping to 72 °C over 2 min, and 72 °C for 2 min; 14
cycles of 94 °C for 10 s, 65 °C for 1 min, 72 °C for 2 min, 94 °C
for 10 s, 65 °C for 1 min, 72 °C for 2 min, 94 °C for 10 s, 44 °C
for 1 min, and 72 °C for 2 min; and 72 °C for 5 min.
6. Place the tubes/plate into the block and run the thermocycler
program. After completion, place the PCR products on ice to
continue the secondary reaction, or store them at −20 °C until
the secondary reaction is performed.
7. To continue with the secondary reaction, thaw 10× PCR buffer (MgCl2-free), 25 mM MgCl2, 2.5 mM each dNTPs mixture, and 2.0 μM LS2 and 20 μM AD primers. Place on ice
after thawing.
8. Perform the secondary reaction in a total volume of 10 μL.
Dilute 2 μL of each product from the primary reaction in
80 μL of ddH2O. Add 2 μL dilutions to each tube/well as
template (see Note 33).
9. Prepare the master mixture of the secondary reaction in a sterile 1.5 mL microcentrifuge tube. Each reaction contains the
following reagents in the mixture: 1 μL of 10× PCR buffer,
0.6 μL of 25 mM MgCl2, 0.8 μL dNTPs mixture, 1.0 μL of
2.0 μM LS2, 0.8 μL of 20 μM AD primers, and 0.3–0.5 U of
Taq DNA polymerase; add ddH2O to bring to a total volume
of 10 μL. Briefly mix and centrifuge.
10. Add 9 μL master mixture to each tube/well. Add a drop of
paraffin oil using pipettes to each tube/well. Briefly mix and
centrifuge. Place the plate/tubes on the ice.
11. Set up the PCR program for the secondary reaction. The program is 12–14 cycles of 94 °C for 10 s, 65 °C for 1 min, 72 °C
for 2 min, 94 °C for 10 s, 65 °C for 1 min, 72 °C for 2 min,
94 °C for 10 s, 45 °C for 1 min, 72 °C for 2 min; then 72 °C
for 5 min.
12. Place the tubes/plate in the block and run the thermocycler
program. After completion, place the PCR products on ice to
continue the tertiary reaction, or store them at −20 °C (see
Note 34).
13. To continue with the tertiary reaction, thaw 10× PCR buffer
(MgCl2-free), 25 mM MgCl2, 2.5 mM each dNTPs mixture,
and 2.5 μM LS3 and 20 μM AD primers. Place on ice after
thawing.
14. Perform the tertiary PCR amplification in a 50 μL volume.
Dilute 2 μL of each product from the secondary reaction in
20 μL of ddH2O. Add 2 μL dilutions to each tube/well as
template (see Note 35).
15. Prepare the master mixture of the tertiary reaction in a sterile
1.5 mL microcentrifuge tube. Each reaction contains the following reagents for preparing the mixture: 5 μL of 10× PCR
buffer, 4 μL of 25 mM MgCl2, 4 μL of dNTPs mixture, 5 μL
of 2.5 μM LS3, 0.8 μL of 20 μM AD primers, and 1.0–1.2 U
of Taq DNA polymerase; add ddH2O to bring to 50 μL. Briefly
mix and centrifuge.
Fig. 1 Secondary products (marked as 2) and the corresponding tertiary products (marked as 3) from a specific sample were run on the agarose gel side by side. The size shifts in the gel reveal the product specificity.
Red arrows indicate one example of a specific product with an obvious size shift (Color figure online)
16. Add 48 μL master mixture to each tube/well. Add a drop of
paraffin oil using pipettes to each tube/well. Briefly mix and
centrifuge. Place the plate/tubes on ice.
17. Set up the program for the tertiary reaction. The program is
23–25 cycles of 94 °C for 10 s, 45 °C for 1 min, 72 °C for
2 min; then 72 °C for 5 min.
18. Place the tubes/plate into the block and run the thermocycler
program. After completion, run the gel to analyze the products, or store them at −20 °C.
19. To analyze the PCR products, run 5 μL of the secondary and
tertiary products derived from the same primary reaction side
by side on a 1.0 % agarose gel.
20. The specific products are indicated by the expected size shift
between the secondary and tertiary products, whereas nonspecific DNA bands display the same size or a wrong size shift
(Fig. 1). Record the size of specific bands of tertiary products.
21. Run the other 45 μL of tertiary products with specific bands
on a new 0.8 % agarose gel. Excise the specific gel bands. Purify
the DNA using a gel purification kit as described in the instructions manual.
22. Determine the sequence of the purified DNA using the specific
primer LS4 with a company providing a DNA sequencing service (see Note 36).
23. Align the obtained sequence with the Arabidopsis genome
sequence in the NCBI databases (http://blast.ncbi.nlm.nih.gov/
Blast.cgi) using nucleotide BLAST software to determine the
location of the T-DNA insert in the T-DNA insertion mutant.
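The size-shift criterion of step 20 can be expressed as a simple comparison. Because LS3 lies closer to the T-DNA border than LS2, a specific tertiary product should be shorter than its secondary product by roughly the LS2-to-LS3 distance (about 50–70 bp according to Note 2). The Python sketch below is a hedged illustration; the function name, the default tolerance of 20 bp, and the band sizes are hypothetical and would need to be adapted to the primers and gel resolution actually used.

```python
# Sketch: decide whether a secondary/tertiary band pair shows the expected
# size shift. All band sizes are estimated from the gel, in bp.
def is_specific(secondary_bp, tertiary_bp, expected_shift_bp, tolerance_bp=20):
    """True if the tertiary band is shorter than the secondary band by
    about expected_shift_bp (within tolerance_bp)."""
    shift = secondary_bp - tertiary_bp
    return abs(shift - expected_shift_bp) <= tolerance_bp


if __name__ == "__main__":
    # Hypothetical band pairs with an assumed LS2-to-LS3 distance of 60 bp.
    pairs = [(900, 840), (900, 900), (800, 860)]
    for secondary, tertiary in pairs:
        verdict = "specific" if is_specific(secondary, tertiary, 60) else "nonspecific"
        print(f"{secondary} bp -> {tertiary} bp: {verdict}")
```

Bands of equal size in both reactions, or bands that shift in the wrong direction, are treated as nonspecific, in line with step 20.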
3.4 Cosegregation Analysis
1. Design two specific primers P1 and P2 on the basis of the two
flanking sequences on the Arabidopsis genome of the T-DNA
insert identified by TAIL-PCR (Fig. 2) (see Note 37).
Fig. 2 Design of primers for cosegregation analysis. (a) Schematic representation of a T-DNA insert with four
CaMV 35S enhancers in the chromosome of one mutant. The black arrows represent the DL1, P1, and P2 primers used in the cosegregation analysis. LB, T-DNA left border; RB, T-DNA right border; 4Enhancers, four CaMV
35S enhancers; bar, Basta resistance gene. (b) Cosegregation analysis of the T-DNA insert with the specific
phenotype of the mutant. The 615 bp DNA bands were amplified from the wild-type genomic DNA, whereas
the 787 bp bands were amplified from the homozygous mutant genomic DNA. Both bands were amplified from
the genomic DNA of the heterozygous mutants
2. Cross the T-DNA insertion mutant with wild-type Arabidopsis
to obtain F1 seeds. Allow the F1 plants to self-fertilize to obtain
F2 seeds. Germinate the F2 seeds to obtain an F2 segregation
population (see Note 38).
3. Prepare genomic DNA from the F2 plants. Record the phenotypes of each individual.
4. Thaw 10× PCR buffer (MgCl2-free), 25 mM MgCl2, 2.5 mM
each dNTPs mixture, and 10 μM LS3, P1, and P2 primers.
Place on ice after thawing.
5. Perform the PCR reaction in a total volume of 10 μL. Add
1 μL genomic DNA from the F2 plants to each tube/well as
template.
6. Prepare the master mixture in a sterile 1.5 mL microcentrifuge
tube. Each reaction contains the following reagents in the mixture: 1 μL of 10× PCR buffer, 0.6 μL of 25 mM MgCl2, 0.8 μL
of dNTPs mixture, 0.2 μL of 10 μM LS3, P1, and P2 primers,
and 0.3–0.5 U Taq DNA polymerase, and add ddH2O to bring
the volume to 10 μL. Briefly mix and centrifuge.
7. Add 9 μL master mixture to each tube/well. Add a drop of
paraffin oil using pipettes to each tube/well. Briefly mix and
centrifuge. Place the plate/tubes on ice.
8. Set up the PCR program for the PCR reaction. The program is
95 °C for 2 min; 35 cycles of 94 °C for 10 s, 58 °C for 30 s,
72 °C for 1 min; and then 72 °C for 5 min.
9. Place the tubes/plate into the block and run the thermocycler
program. After completion, run a 1 % agarose gel to determine
the presence of the T-DNA insert. No band amplified by LS3
and either P1 or P2 is obtained from wild-type plants. No band
amplified by P1 and P2 is obtained from the homozygous
mutant. Both bands are amplified from heterozygous plants
(Fig. 2).
10. Analyze the genotyping data in relation to the phenotypes. If all
plants with a specific phenotype carry the T-DNA insert,
whereas those with no phenotype do not carry the insert, the
specific phenotype cosegregates with the T-DNA insert.
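The band patterns of step 9 and the rule of step 10 can be written down as a short genotype-calling routine. The Python sketch below is illustrative only; the function names and the example F2 data are hypothetical. The `recessive` option is an added assumption, not part of the protocol: for a recessive mutation, heterozygous plants carry the insert without showing the phenotype, so only homozygotes are expected to be affected.

```python
# Sketch: call F2 genotypes from the two PCR bands and test cosegregation.
def call_genotype(wt_band, tdna_band):
    """wt_band: P1 + P2 product present.
    tdna_band: product of LS3 with P1 or P2 present."""
    if wt_band and tdna_band:
        return "heterozygous"
    if tdna_band:
        return "homozygous"
    if wt_band:
        return "wild type"
    return "no call"  # no product at all; the PCR should be repeated


def cosegregates(plants, recessive=False):
    """plants: iterable of (wt_band, tdna_band, has_phenotype) tuples.

    recessive=False applies the rule of step 10: every plant with the
    phenotype carries the insert and every plant without it does not.
    recessive=True (an assumption) expects the phenotype in homozygotes only.
    """
    for wt_band, tdna_band, has_phenotype in plants:
        genotype = call_genotype(wt_band, tdna_band)
        if genotype == "no call":
            continue
        if recessive:
            expected = genotype == "homozygous"
        else:
            expected = genotype != "wild type"
        if expected != has_phenotype:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical F2 plants: wild type, heterozygote, homozygote.
    f2 = [(True, False, False), (True, True, False), (False, True, True)]
    print("Consistent with a recessive mutation:", cosegregates(f2, recessive=True))
    print("Consistent with a dominant mutation:", cosegregates(f2, recessive=False))
```

A single plant that contradicts the expected pattern is enough to reject cosegregation, so failed or ambiguous PCRs should be repeated before drawing a conclusion.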
3.5 Complementation and Recapitulation Analysis
1. Align the flanking sequence of the cosegregated T-DNA with
the Arabidopsis genome to determine the position of the insert
in the possible mutated gene (see Note 39).
2. Identify the mutant as a loss-of-function or gain-of-function
mutant. If it is a loss-of-function mutant, complementation
analysis is needed. If it is a gain-of-function mutant, recapitulation analysis is needed (see Note 40).
3. Clone the target gene. Prepare the construct in which the gene
is driven by the CaMV35S promoter. Prepare the Agrobacterium
cells containing the plasmid of the construct (see Note 41).
4. Transform the CaMV35S promoter-driven gene into the
mutants by the floral dip method if the mutant is a loss-of-function mutant. If it is a gain-of-function mutant, transform
the CaMV35S promoter-driven gene into wild-type plants (see
Note 42).
5. Harvest seeds. Screen the transformants on 1/2MS containing
an appropriate selection antibiotic or herbicide.
6. Observe the phenotype of the T1 generation. In complementation analysis, if the mutant transformants recover the wild-type
phenotype, the mutant is complemented by the target gene. In
recapitulation analysis, if the phenotype of wild-type transformants mimics that of the mutant, it is concluded that target
gene activation leads to the specific phenotype of the T-DNA
insertion mutant (see Note 43).
4 Notes
1. To prepare 100 mL of 2 % CTAB extraction buffer, add 2 g
CTAB, 10 mL of 1 M Tris–HCl (pH 8.0), 4 mL of 0.5 M
EDTA (pH 8.0), and 8.19 g NaCl, and add water to bring to a
final volume of 100 mL. Sterilize and store the solution at
room temperature. Preheat it at 65 °C and add 0.2–0.5 %
β-mercaptoethanol before use.
2. Specific primers are designed to be complementary to the
sequence neighboring the left or right border of the T-DNA
vector used for T-DNA tagging. The four primers are designed
to be nested. The Tm values of primer LS1 and LS2 are designed
to be about 62–65 °C, and those of LS3 and LS4 could be
lower as for ordinary primers. Some vectors are introduced
with some elements such as four repeats of the CaMV 35S
promoter enhancer. The repeated sequence cannot be used for
the design of the specific primers. LS2 and LS3 are designed to lie
about 50–70 bp apart so that the specificity of the PCR product can be
judged from the resulting size shift on the gel. The primers are dissolved in sterile ddH2O to a concentration of 100 μM and stored in the freezer. Dilute to the
appropriate concentration before use.
3. AD primers can be used in TAIL-PCR experiments for obtaining the genomic sequence flanking different T-DNA vectors or
even any unknown sequence flanking a known sequence in different species. AD primers are 15–16
nucleotides long, with a Tm of about 45 °C and
64- to 256-fold degeneracy. We use the following 11 AD primers, among which AD1, AD2, AD3, and AD4 were
previously designed [14] and have been used in most TAIL-PCR studies.
The other AD primers used in our laboratory are based on
AD1 to AD4 with certain nucleotides altered, and are
named after the primers from which they derive. These AD primers are:
AD1: 5′-NTC GA(G/C) T(A/T)T (G/C)G (A/T)G TT-3′
AD1-1: 5′-NAC GT(G/C) A(A/T)T (G/C)C NAG A-3′
AD1-2: 5′-NTC GA(G/C) T(A/T)T NG (A/T)G AA-3′
AD2: 5′-NGT CGA (G/C)(A/T)G ANA (A/T)G AA-3′
AD2-1: 5′-NTC GT(G/C) (A/T)G ANA (A/T)GT T-3′
AD2-2: 5′-NCA GCT (G/C)(A/T)C TNT (A/T)GA A-3′
AD2-3: 5′-NCT CGT (G/C)(A/T)G ANT (A/T)GA T-3′
AD2-5: 5′-NGT CGA (G/C)(A/T)C TNA (A/T)CA A-3′
AD3: 5′-(A/T)GT GNA G(A/T)A NCA NAG A-3′
AD4: 5′-AG(A/T) GNA G(A/T)A NCA (A/T)AG G-3′
AD4-1: 5′-AG(A/T) CAN G(A/T)T NCA (A/T)GA A-3′
4. The A. tumefaciens strain GV3101 (pMP90) is resistant to
antibiotics rifampicin and gentamicin. If preparing competent
cells of other A. tumefaciens strains, supplement the medium
with appropriate antibiotics. The purpose of streaking for single colonies is to obtain genetically identical A. tumefaciens
cells and activate the cells at the same time.
5. Supplement the liquid culture with rifampicin only, to reduce
cost. Gentamicin is relatively expensive, and the risk
of contamination is low because the single colonies were
streaked on a plate containing both antibiotics.
6. It will take about 8–12 h for the A. tumefaciens cells to grow
to a concentration with an OD600 of 0.5–0.8. After the cells
reach the log phase, keep the culture in an ice-water bath and
carry out all following steps in the ice-water bath or at
4 °C.
7. The rotor of the centrifuge needs to be prechilled before centrifuging. It may be placed in the 4 °C refrigerator or cold
room for prechilling.
8. The competent cells are stable for more than 6 months at
−70 °C and can be used even after storage for several years
at this temperature. The efficiency remains sufficient for transformation of T-DNA plasmids into A. tumefaciens competent cells.
9. The plasmid DNA should be dissolved in either ddH2O or 1/2
TE. An overly high DNA concentration or ionic strength will
probably make the pulse too intense and result in a low
transformation rate.
10. The electroporation cuvettes should be prechilled. Usually, for
convenience, just store the cuvettes at −20 °C and place on ice
before use.
11. Don’t transfer the A. tumefaciens cells in the cuvettes directly
to the recovery tube. Wash down the cells with liquid LB and
then transfer the mixture to the tube.
12. For convenience, 1.5 mL tubes with 0.5 mL LB broth work
well for cell recovery.
13. Usually the efficiency of transformation is sufficiently high. To
be sure of obtaining single colonies, plate the A. tumefaciens
cells on one half of the selective medium and streak on the
other half.
14. Don’t carry out the PCR analysis using single colonies as template directly. The false-positive rate is high because of the
trace plasmid DNA on the plate.
15. Mix all cultures of positive colonies in case the single colony
selected is a false-positive. If you do not perform plant transformation immediately, add the same volume of sterile 50 %
glycerol to the culture and store the culture at −70 °C.
16. Healthy Arabidopsis plants are very important to obtain a sufficient number of transformants by the floral dip method.
Grow plants at a high density to prevent the soil falling when
inverting the pots for dipping.
17. If the A. tumefaciens cells containing the target T-DNA plasmid are stored at −70 °C, remove them from the freezer and
inoculate 25 mL LB broth containing appropriate antibiotics.
Incubate the cells with shaking at 220 rpm at 28 °C for 2 days.
Pour the 25 mL LB culture into 500 mL LB broth supplemented with antibiotics and incubate the culture at 28 °C with
shaking at 220 rpm for 1 day to obtain a sufficient number of
A. tumefaciens cells.
18. Although it is reported that MS salts do not increase the transformation rate, the salts provide nutrients for the plants. The
plants dipped in this solution grow better than those dipped in
solution lacking MS salts. An excessive concentration of the
surfactant Silwet L-77 harms the inflorescences and leads to
low fertility.
19. If the A. tumefaciens cells containing the target T-DNA plasmid are not yet ready, or if plants with more
immature flower buds are wanted, clip the first bolts a week before dipping to allow more secondary inflorescences to develop.
Clipping off the siliques of the plants may increase the transformation rate when performing floral dipping.
20. If the bolts of the dipped plants are too tall to fit in the deep
tray, lay the plants on their sides and wrap them in
plastic film to maintain the moisture for about 24 h.
21. If watering is withheld too early, most of the seeds harvested
from the dipped inflorescences will not germinate normally
and thus lead to failure of transformation.
22. The seeds can be stored at 4 °C with silica-gel desiccant for
more than 1 year.
23. The selection medium can be supplemented with 100–
200 mg/L carbenicillin to inhibit bacterial contamination.
Scatter the seeds well on the selection medium. Sowing the
seeds at an overly high density will affect the selection.
24. The dominant T-DNA insertion mutants can be obtained from
T1 transformants by activation tagging. To obtain the recessive
T-DNA insertion mutants, T2 transgenic mutants should be
used. To facilitate screening, we mix the seeds of about 10,000
T-DNA insertion lines and screen the seed pool for mutants
with interesting phenotypes. After obtaining mutants of interest, genomic DNA is isolated and T-DNA insertion sites are
determined.
25. Contamination with trace chloroform affects the next enzyme
reaction. To avoid contamination, discard about 100 μL aqueous phase and transfer only 300 μL to the new tube.
26. Addition of sodium acetate to the aqueous phase is not needed
before ethanol precipitation of DNA because of the existence
of NaCl in the CTAB extraction buffer.
27. Discard the supernatant gently to prevent losing the DNA
pellet.
Li-Jia Qu and Genji Qin
28. Do not let the DNA pellet become overly dry. Overdried
DNA pellets are hard to dissolve.
29. 0.5 % volume of RNase (10 mg/mL) can be added to the
DNA solution to remove RNA contamination.
30. After running the gel, if the high-molecular-weight band is
present, the DNA is of good quality. If a smeared band is present, the DNA is degraded. We use 10, 20, 30, and 50 ng
lambda DNA standards to quantify DNA.
31. When obtaining the flanking sequences from multiple T-DNA
mutants (e.g., for generation of a T-DNA mutant collection),
first perform TAIL-PCR with a single AD primer such as AD2,
which gives a higher success rate. Samples
in which amplification is unsuccessful are then selected for TAIL-PCR with a different AD primer. When obtaining the flanking sequence from a specific mutant (e.g., a mutant
of interest obtained by painstaking screening), perform TAIL-PCR using all the AD primers to increase the chance of capturing the target T-DNA flanking sequence, because multiple
T-DNA inserts are frequently present in the genome of mutants
of interest.
32. Prepare enough master mixture for 1–2 additional reactions to
guarantee a sufficient volume for all reactions.
33. Mark tubes/plate clearly to make sure that one can trace back
to the correct T-DNA lines. The dilutions can be stored in the
freezer for at least 1 month. If using different AD primers, add
AD primers to each tube/well.
34. Three microliters of the products from the secondary PCR
reaction can be checked on an agarose gel. Those samples that
display bright DNA bands on the gel are selected to continue
the tertiary reaction, whereas those samples that have no obvious DNA bands can be discarded because the chances of
obtaining specific products from them are low.
35. If using different AD primers, add AD primers to each tube/
well in the tertiary reaction.
36. TA cloning must be performed if the PCR product is a mixture
of different products, such as the similar-sized products from
two different inserts in one T-DNA mutant.
37. When designing the primers, calculate (a) the size of the product
amplified by the two specific primers from the chromosome without the T-DNA insert and
(b) the size of the product amplified from the chromosome with the T-DNA insert by one of the two specific
primers together with LS3 or LS4 on the T-DNA insert. Make sure that
the two amplified products can be distinguished by size on
the agarose gel. This enables genotyping of the segregating
population using the three primers in one PCR reaction.
38. The progeny of the heterozygous mutant can be used as the
segregating population for cosegregation analysis. The population is composed of 300–500 plants.
39. The T-DNA insert may be located in one of three kinds of position relative to a
gene: the intergenic region, an intron, or an exon.
40. If a T-DNA insert carrying activation elements is inserted in an
intergenic region, it may cause a gain-of-function mutation.
Sometimes, T-DNA without an activation element located in
the 5′ or 3′ untranslated region (UTR) also leads to gene activation. If the T-DNA insertion is located in the intron or exon,
it may lead to gene knockdown or knockout. For a gain-of-function mutant, we need to perform RT-PCR analysis of
genes in the vicinity of the T-DNA insert to identify which
gene is activated. For knockdown or knockout mutants, the
target gene is the one in which the T-DNA insert is located.
41. Alternatively, the genomic sequence including the coding
region, promoter region, and 3′ UTR is used for complementation analysis.
42. When transforming the T-DNA insertion mutants, use a different selection marker from that in the T-DNA insert of the
mutant.
43. Complementation analysis or recapitulation analysis is the most
convincing evidence for determination of the target gene that
leads to the specific phenotypes of a T-DNA insertion mutant.
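The product-size calculation described in Note 37 can be sketched as follows. This is only an illustration under stated assumptions: the function names, the coordinates, the border-primer offset, and the 100-bp minimum difference are hypothetical and are not taken from the protocol. The sketch assumes the forward specific primer is paired with the T-DNA border primer (e.g., LS3 or LS4).

```python
def product_sizes(fwd_start, rev_end, insertion_site, border_primer_offset):
    """Return (wild_type_size, insert_size) in bp.

    fwd_start: coordinate of the 5' end of the forward specific primer.
    rev_end: coordinate of the 5' end of the reverse specific primer.
    insertion_site: coordinate of the last genomic base before the T-DNA.
    border_primer_offset: distance (bp) from the T-DNA border to the 5' end
        of the border primer inside the T-DNA (hypothetical value).
    """
    wild_type_size = rev_end - fwd_start + 1
    insert_size = (insertion_site - fwd_start + 1) + border_primer_offset
    return wild_type_size, insert_size


def distinguishable(size_a, size_b, min_difference=100):
    """Assumed rule of thumb: bands differing by >= min_difference bp resolve."""
    return abs(size_a - size_b) >= min_difference


# Hypothetical coordinates, for illustration only.
wt_size, tdna_size = product_sizes(1000, 1899, 1400, 150)
print(wt_size, tdna_size, distinguishable(wt_size, tdna_size))
```

In a three-primer reaction, a plant showing only the first band would be scored as wild type, only the second as homozygous for the insert, and both as heterozygous.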
References
1. Krysan PJ, Young JC, Sussman MR (1999)
T-DNA as an insertional mutagen in
Arabidopsis. Plant Cell 11:2283–2290
2. Wilson RN, Somerville CR (1995) Phenotypic
suppression of the gibberellin-insensitive
mutant (gai) of Arabidopsis. Plant Physiol
108:495–502
3. Weigel D et al (2000) Activation tagging in
Arabidopsis. Plant Physiol 122:1003–1013
4. Engineer CB et al (2005) Development and
evaluation of a Gal4-mediated LUC/GFP/
GUS enhancer trap system in Arabidopsis.
BMC Plant Biol 5:9
5. Radhamony RN, Prasad AM, Srinivasan R
(2005) T-DNA insertional mutagenesis in
Arabidopsis: a tool for functional genomics.
Electron J Biotechnol 8:82–106
6. Mattanovich D et al (1989) Efficient transformation of Agrobacterium spp. by electroporation. Nucleic Acids Res 17:6747
7. Shen WJ, Forde BG (1989) Efficient transformation of Agrobacterium spp. by high voltage
electroporation. Nucleic Acids Res 17:8385
8. Clough SJ, Bent AF (1998) Floral dip: a
simplified method for Agrobacterium-mediated
transformation of Arabidopsis thaliana. Plant J
16:735–743
9. Clough SJ (2005) Floral dip: Agrobacterium-mediated germ line transformation. Methods
Mol Biol 286:91–102
10. Bent AF (2006) Arabidopsis thaliana floral dip
transformation method. Methods Mol Biol
343:87–103
11. Zhang X et al (2006) Agrobacterium-mediated
transformation of Arabidopsis thaliana using
the floral dip method. Nat Protoc 1:641–646
12. Clarke JD (2009) Cetyltrimethyl ammonium
bromide (CTAB) DNA miniprep for plant
DNA isolation. Cold Spring Harb Protoc
2009:pdb.prot5177. doi:10.1101/pdb.prot5177
13. Liu YG, Huang N (1998) Efficient amplification
of insert end sequences from bacterial artificial
chromosome clones by thermal asymmetric interlaced PCR. Plant Mol Biol Rep 16:175–181
14. Liu YG et al (1995) Efficient isolation and
mapping of Arabidopsis thaliana T-DNA insert
junctions by thermal asymmetric interlaced
PCR. Plant J 8:457–463
15. Liu YG, Whittier RF (1995) Thermal asymmetric interlaced PCR: automatable amplification and sequencing of insert end fragments
from P1 and YAC clones for chromosome
walking. Genomics 25:674–681
16. Qin G et al (2005) An indole-3-acetic acid carboxyl methyltransferase regulates Arabidopsis
leaf development. Plant Cell 17:2693–2704
17. Qin G et al (2007) Arabidopsis AtBECLIN 1/
AtAtg6/AtVps30 is essential for pollen germination and plant development. Cell Res
17:249–263
Chapter 14
Identification of EMS-Induced Causal Mutations
in Arabidopsis thaliana by Next-Generation Sequencing
Naoyuki Uchida, Tomoaki Sakamoto, Masao Tasaka,
and Tetsuya Kurata
Abstract
Emerging next-generation sequencing (NGS) technologies are powerful tools for the identification of
causal mutations underlying phenotypes of interest in Arabidopsis thaliana. Based on a methodology
termed bulked segregant analysis (BSA), whole-genome sequencing data are derived from pooled F2 segregants after crossing a mutant to a different polymorphic accession and are analyzed for single nucleotide
polymorphisms (SNPs). Then, a genome region spanning the causal mutation site is narrowed down by
linkage analysis of SNPs in the accessions used to produce the F1 generation. Next, candidate SNPs for the
causative mutation are extracted by filtering the linked SNPs using multiple appropriate criteria. Effects of
each candidate SNP on the function of the corresponding gene are evaluated to identify the causal mutation,
and its validity is then confirmed by independent criteria. This chapter describes the identification by NGS
analysis of causal recessive mutations derived from EMS mutagenesis.
Key words Next-generation sequencing, Whole-genome sequencing, Ethyl methanesulfonate, Bulked
segregant analysis
1 Introduction
Though mutagenesis-based approaches have been used in
Arabidopsis in various biological studies, it is still laborious and time-consuming to define the mutations causing phenotypes of interest
by conventional means such as map-based cloning. Emerging next-generation sequencing (NGS) technologies are powerful and versatile tools which are now being used for the rapid, cost-effective
identification of spontaneous as well as mutagenesis-induced mutations in Arabidopsis [1–7].
To identify the mutation behind an interesting effect, whole-genome sequencing without genetic manipulation followed by
comparison of genomic sequences between the mutant and its
parental line might appear to be the simplest approach. This strategy, however, is problematic since numerous background mutations
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_14, © Springer Science+Business Media New York 2014
[Figure 1 image, panels a to i. Recoverable panel labels: (a) Mutant m/m (Col-0) and wild type M/M (Ler); (b) F1 (M/m); (c) F2 individuals selected by phenotype, homozygotes at the responsible locus (m/m), bulk; (d) Short reads (e.g., 75 bp); (e) Ratio of homo-SNPs/hetero-SNPs along the Col-0 reference genome, candidate region; (f) Candidate region, homozygous SNPs; (g) Remove SNPs in parental accession lines for F1 generation; (h) Extract SNPs within CDS and intron donor/acceptor sites, showing canonical EMS-induced nucleotide changes; (i) Examine the effect of each SNP on the gene function, e.g., Thr to Thr (ACG to ACA), Gln to Stop (CAA to TAA).]
Fig. 1 Schematic overview of the definition of EMS-induced causal mutations through the BSA approach. (a) A
mutant in the Col-0 background is crossed with polymorphic Ler. (b) The F1 plants are self-fertilized to produce F2 seeds. (c) Chromosomes derived from the Col-0 and Ler accessions are represented by gray and white
bars, respectively. Asterisks indicate the EMS-induced causal mutation. Seedlings from the F2 individuals
exhibiting the phenotype of interest are selected and bulked. (d) Short reads from NGS are mapped to the
reference Col-0 genome and SNPs are called. (e) The candidate region is identified by the distribution of SNPs
derived from Ler (solid line). If the mutant is derived from a non-reference accession and crossed to Col-0, the distribution is the opposite (dotted line). (f) Homozygous SNPs (arrowheads) are extracted from the candidate region;
arrows show annotated genes. (g) Background SNPs are removed. If multiple allelic mutants are available, it
is possible to remove common background SNPs from multiple allelic samples. (h) Candidate SNPs are
extracted using appropriate criteria. (i) Finally, the effects of the selected SNPs on the annotated gene function
are evaluated
usually exist in the genome that hamper the identification of the
causal mutation without additional information. Various genetic
manipulations may provide such information. For instance, rough
mapping with conventional markers of F2 populations prior to NGS
analysis may narrow down the location of the mutation. Alternatively,
“bulked segregant analysis” (BSA) can be employed [8]. In several
reports on NGS-based identification of recessive mutations, BSA has
been used successfully [1–3, 6, 7]; a summary of the method is given
in Fig. 1. In this chapter, procedures for the identification of ethyl
methanesulfonate (EMS)-induced SNPs will be described.
NGS Approach for Detection of EMS-Induced SNPs
Table 1
Summary of BSA approaches to define EMS-induced causal mutations by NGS

Background acc.(a)  Crossed acc.(b)  Number of F2 bulked  Read length (bp)  Coverage           Informatics                   Ref.
Ws                  Col-0            80                   75                ×6.3 to ×9.2       CASAVA + custom perl scripts  [1]
Col-0               Ler              500                  37                ×22                SHOREmap                      [2]
Col-0               Ler              200                  50                ×6.2 to ×1777(c)   Custom perl script            [3]
Col-0               Ler              93                   36                ×12                MASS                          [6]
Col-0               Ler              80                   38 × 2(d)         ×29 to ×74         NGM                           [7]

(a) Background accession for mutant isolation
(b) Crossed accession to produce F2 segregants
(c) Low-coverage reads were used for delimitation of the candidate region. Deep sequencing data (~×1700) were used for definition of the causal mutation
(d) “×2” indicates paired-end sequencing
Generally, after isolation of an EMS-induced mutant in one
accession (e.g., Col-0), the mutant is crossed with another polymorphic accession (e.g., Ler), and F1 seeds are produced. Then,
F2 individuals showing the recessive phenotype of interest are
pooled and their bulked genomic DNA is used to prepare an NGS
genomic library. The library is sequenced by NGS to provide short
reads derived from the genome sequence. Basic steps to define the
causal mutation are (1) mapping of short reads to the reference
Col-0 genome, (2) calling SNPs against the Col-0 reference
genome sequence, (3) linkage analysis using the SNPs, and (4)
applying various filters to exclude SNPs which are unlikely to be causal
mutations. In linkage analysis of cases where Col-0 is the parental
accession that had been subjected to EMS mutagenesis, the linked
region spanning the causal mutation has fewer Ler-type SNPs compared with other regions of the genome. After the delimitation of
the genomic region containing the causal mutation, candidate
SNPs are examined for their nucleotide-substitution type (canonical EMS-type G/C to A/T conversion or its absence) [9] and also
for their effects on the annotated gene (e.g., are non-synonymous
or nonsense mutations induced? are intron acceptor/donor sites
disrupted?). The number of F2 segregants which should be pooled,
the appropriate sequence coverage number which should be
achieved, and the efficiency of methods employed to narrow down
the genomic region containing the causal mutation were examined
in several studies of BSA-based approaches (Table 1). Confirmation of candidate SNPs by conventional Sanger sequencing
effectively eliminates false-positive SNPs due to errors in NGS or in the subsequent
informatics analysis. If allelic mutations are available, the
definition of the causative mutation will be facilitated. To confirm
whether the identified SNP really causes the phenotype of
interest, complementation by corresponding genomic fragments
or evaluation of T-DNA insertion knockout lines should be performed. NGS-based identification of causal mutations can
also be applied to cases where non-reference Arabidopsis accessions
serve as parental lines for mutagenesis [1].
It is theoretically possible to detect other types of mutations by
NGS, including large insertions, deletions, inversions, and translocations (see Note 1). Different types of genomic libraries should be
prepared according to the type of genomic alteration. For small
insertions/deletions (a few hundred base pairs), paired-end libraries are sufficient [10], but large-scale structural changes on the
order of kilobases require mate-pair libraries [11]. NGS technologies are applicable to all plant species whose genomes have been
sequenced, provided that genetic approaches are available.
In this chapter, detailed procedures for the method reported in
ref. 1 are described, including details omitted in the publication, as
well as alternative approaches. The methodology is based on BSA
after crossing a mutant in a non-reference accession, Wassilewskija
(Ws), with the reference Col-0. Single-end reads of 75 bp that
were produced at relatively low coverage (×6.3 to ×9.2) by whole-genome sequencing with Illumina GAIIx were used to call SNPs
by the CASAVA SNP call pipeline. Linkage analysis with SNPs and
extraction of candidate SNPs for the causal mutation by several
filters contributed to the identification of the causal mutation.
2 Materials
1. Seedlings from F2 individuals exhibiting the phenotype of
interest.
2. Seedlings from the parental accessions used for crossing.
3. TissueLyser (Qiagen).
4. CelLytic PN Isolation/Extraction Kit (Sigma-Aldrich).
5. Plant DNeasy mini kit (Qiagen).
6. MicroTUBES (Covaris).
7. Covaris S2 (Covaris).
8. QIAquick PCR Purification Kit (Qiagen).
9. Elution Buffer (EB): 10 mM Tris–HCl (pH 8.5).
10. 2100 Bioanalyzer (Agilent).
11. DNA 1000 Kit (Agilent).
12. NEBNext DNA Sample Prep Reagent Set 1 (New England
BioLabs).
13. Genomic Adaptor Oligo Mix (Illumina or New England BioLabs).
14. Certified Low Range Ultra Agarose (Bio-Rad).
15. Mupid electrophoresis system (Advance Co., Ltd.).
16. MinElute Gel Extraction Kit (Qiagen).
17. PCR Primers 1.1/2.1 (Illumina).
18. Light Cycler 480 (Roche).
19. KAPA Library Quantification Kit (KAPAbiosystems).
20. Illumina GAIIx (Illumina).
21. TruSeq SR Cluster Kit v2-cBot-GA (Illumina).
22. TruSeq SBS Kit v5-GA (36 cycle) (Illumina).
23. PhiX Control Kit v3 (Illumina).
24. PowerEdge R900 Linux server (64 GB memory, 10 TB storage;
Dell).
25. CASAVA ver 1.7 software (Illumina).
26. Custom perl scripts.
3 Methods
3.1 Preparing Samples for the Genomic Library for Whole-Genome Sequencing
1. Bulk seedlings from F2 individuals exhibiting the phenotype of
interest (see Note 2). Disrupt samples using the TissueLyser.
Alternatively, grind plant tissues to a fine powder under liquid
nitrogen using a mortar and pestle. Do not allow the sample to
thaw.
2. Enrich the nuclear fraction using the “Semi-pure Preparation of Nuclei
Procedures” of the CelLytic PN Isolation/Extraction Kit (see
Note 3).
3. Isolate genomic DNA from the semi-purified nuclear fraction
using the Plant DNeasy mini kit (see Note 4).
4. Prepare 1 μg DNA in 130 μl TE and shear it in a microTUBE
using the Covaris S2 at the 100-bp setting (see Note 5).
5. Purify DNA using the QIAquick PCR Purification Kit and
elute in 30 μl of EB.
6. Check the size distribution of the sheared genomic DNA with the
2100 Bioanalyzer according to the manufacturer’s protocol.
Analyze 1 μl of the fragmented DNA solution on the microfluidic chip
(see Note 6).
7. Prepare the DNA library for genome sequencing using the total
amount of purified DNA and the NEBNext DNA Sample Prep
Reagent Set 1 according to the manufacturer’s manual with
some modifications. At the adaptor ligation step, use Genomic
Adaptor Oligo Mix as the DNA adaptor. After adaptor ligation, we
add an optional step to enrich DNA fragments of the optimal length for genome sequencing: excise the 200–250 bp DNA
fragments from an agarose gel, made with Certified Low
Range Ultra Agarose, with a clean, sharp knife. For this step,
we use the Mupid electrophoresis system (see Note 7). Then,
purify the fragments using the MinElute Gel Extraction Kit
and elute in 15 μl of EB. At the PCR step for enrichment of
the adaptor-modified DNA fragments, use either PCR Primers 1.1
and 2.1 (Illumina) or the Universal PCR primer and Index1 primer
(New England BioLabs) (see Note 8).
8. Purify DNA using the QIAquick PCR Purification Kit and
elute in 30 μl of EB (see Note 9).
9. Check the distribution of the amplified DNAs with the 2100
Bioanalyzer.
10. Quantify the concentration of the library by quantitative PCR
with Light Cycler 480 according to the manufacturer’s manual
(see Note 10).
3.2 Short Read Sequencing and Informatic Analysis for EMS-Induced SNPs (See Note 11)
3.2.1 Sequencing with Illumina GAIIx
1. Conduct 75 bp sequencing according to the Illumina GAIIx
operation manual. To generate clusters on the Illumina flowcell
for single-read sequencing, use the cluster-generation kit (TruSeq SR Cluster
Kit v2–cBot-GA) with libraries diluted to 8 pM. Run the 75 bp
sequencing with two sets of the SBS kit (TruSeq
SBS Kit v5–GA [36 cycle]). To check the state of the run, use the
control library PhiX Control Kit v3 on one lane of the
flowcell.
3.2.2 Alignment of Reads to a Reference Genome Sequence
1. Align reads from GAIIx to the reference genome sequence
with the CASAVA software. Reference sequences for CASAVA
are available to Illumina sequencer users from MyIllumina
(https://icom.illumina.com/). The package including the reference sequences is named iGenome (see Note 12).
3.2.3 SNP Calling
1. Call SNPs using CASAVA with default parameters (see Note 13).
Among the CASAVA output files, SNP lists for each chromosome (snps.txt) and the summary file (summary.html) are the
most important for the following analysis.
3.2.4 Linkage Analysis with the Index of Enrichment of Homozygous SNPs
1. Identify the chromosome containing the causal SNP as the one in which
the homozygous SNPs derived from the mutant accession are
significantly enriched compared to the other chromosomes (see
Note 14).
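The comparison in this step can be sketched as follows, assuming the per-chromosome counts of homozygous and heterozygous SNPs have been copied by hand from the summary file into a dictionary. The function name, the example counts, and the use of the homozygous fraction as the enrichment index are illustrative assumptions; they are not part of CASAVA or snipSNP.

```python
def homozygous_fraction(counts):
    """Return {chromosome: homozygous / (homozygous + heterozygous)}."""
    fractions = {}
    for chrom, (homo, het) in counts.items():
        total = homo + het
        # Chromosomes without any called SNPs get a fraction of 0.0.
        fractions[chrom] = homo / total if total else 0.0
    return fractions


# Hypothetical (homozygous, heterozygous) counts; not real data.
example_counts = {
    "Chr1": (1200, 4800),
    "Chr2": (900, 3600),
    "Chr3": (3000, 2000),
    "Chr4": (800, 3200),
    "Chr5": (1100, 4400),
}

fractions = homozygous_fraction(example_counts)
candidate = max(fractions, key=fractions.get)
print(candidate, round(fractions[candidate], 2))
```

This ranking applies to a mutant isolated in a non-reference accession. When the mutant is in the Col-0 background, the linked chromosome would instead be the one depleted of SNPs from the crossed accession.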
3.2.5 SNP Filtering
1. Filter the SNPs using several criteria. Filtering procedures are
performed with the package of perl scripts, “snipSNP,” and
EXCEL (see Notes 15 and 16).
3.2.6 Removal of SNPs in the Accession Line Used as Parent of the F1 Generation
1. Remove background (parental) SNPs with perl script
extractSNP.pl which produces a list of mutant-specific SNPs
(see Notes 16–18).
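The logic of background removal can be sketched as a set difference. This is not the code of extractSNP.pl; it is a minimal illustration that assumes each SNP has already been reduced to a (chromosome, position, alternative base) tuple. The function name and the example SNPs are hypothetical.

```python
def remove_background(mutant_snps, background_snps):
    """Return (mutant_specific, filtered_out), preserving input order."""
    background = set(background_snps)
    mutant_specific = [snp for snp in mutant_snps if snp not in background]
    filtered_out = [snp for snp in mutant_snps if snp in background]
    return mutant_specific, filtered_out


# Hypothetical SNPs as (chromosome, position, alternative base).
mutant = [("Chr3", 1050, "A"), ("Chr3", 20480, "T"), ("Chr3", 99001, "A")]
parent = [("Chr3", 1050, "A"), ("Chr3", 99001, "G")]

specific, removed = remove_background(mutant, parent)
print(specific)
print(removed)
```

Including the alternative base in the key means that a site carrying a different allele in the parent is kept as mutant-specific; whether that is desirable depends on the analysis.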
3.2.7 Extraction of SNPs Within Gene and Intron Donor/Acceptor Sites
1. Extract SNPs within gene and intron donor/acceptor sites
with perl script snpinGFF.pl, which extracts SNPs if input
SNPs are located in the regions defined in the GFF file (see
Notes 16, 17, 19–21).
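The idea of selecting SNPs that fall inside annotated features can be sketched as follows. This is not the code of snpinGFF.pl. It assumes a tab-delimited GFF file with the standard nine columns (sequence ID, source, type, start, end, score, strand, phase, attributes); the function names and the example records are hypothetical.

```python
def parse_gff_features(gff_lines, chrom, feature_types):
    """Collect (start, end) intervals of the requested feature types."""
    features = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GFF record
        seqid, ftype = cols[0], cols[2]
        start, end = int(cols[3]), int(cols[4])  # 1-based, inclusive
        if seqid == chrom and ftype in feature_types:
            features.append((start, end))
    return features


def snps_in_features(positions, features):
    """Return the SNP positions lying inside at least one interval."""
    return [p for p in positions if any(s <= p <= e for s, e in features)]


# Hypothetical GFF records, for illustration only.
example_gff = [
    "##gff-version 3",
    "\t".join(["Chr3", "demo", "CDS", "1000", "1200", ".", "+", "0", "ID=cds1"]),
    "\t".join(["Chr3", "demo", "exon", "900", "1300", ".", "+", ".", "ID=exon1"]),
    "\t".join(["Chr1", "demo", "CDS", "1000", "1200", ".", "+", "0", "ID=cds2"]),
]

cds = parse_gff_features(example_gff, "Chr3", {"CDS"})
print(snps_in_features([950, 1000, 1100, 1201], cds))
```

The linear scan over all intervals is adequate for one chromosome of candidate SNPs; a sorted interval search would be preferable for genome-wide lists.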
3.2.8 Narrowing Down the Chromosomal Region Spanning the Causal SNP by Linkage Analysis
1. Detect the chromosomal region displaying significant enrichment of homozygous SNPs derived from the mutant accession
with perl script stateSNP.pl. This script counts the number of
SNPs within each window of a defined length (e.g.,
500 kbp) (see Notes 16, 17, 22, and 23).
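Window counting of this kind can be sketched in a few lines. This is not the code of stateSNP.pl; it assumes 1-based SNP positions on a single chromosome and non-overlapping windows, and the function name and positions are hypothetical.

```python
from collections import Counter


def count_per_window(positions, window=500_000):
    """Count SNPs per window; window k covers k*window+1 .. (k+1)*window."""
    counts = Counter((pos - 1) // window for pos in positions)
    return dict(sorted(counts.items()))


# Hypothetical 1-based positions on one chromosome.
example_positions = [1, 500_000, 500_001, 1_200_000]
print(count_per_window(example_positions))
```

Windows that contain no SNPs are absent from the result, so they need to be filled with zero before plotting.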
3.2.9 Extraction of SNPs Showing Canonical EMS-Induced SNPs
1. Extract canonical EMS-type SNPs (G to A or C to T) using
EXCEL.
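As an alternative to a spreadsheet, the same filter can be sketched in code. The example assumes each SNP is a (chromosome, position, reference base, mutant base) tuple read on the reference strand; the names and data are hypothetical.

```python
# G-to-A and C-to-T are the two canonical EMS-type changes named in this step.
CANONICAL_EMS_CHANGES = {("G", "A"), ("C", "T")}


def is_canonical_ems(ref_base, alt_base):
    """True for a G-to-A or C-to-T substitution (case-insensitive)."""
    return (ref_base.upper(), alt_base.upper()) in CANONICAL_EMS_CHANGES


def filter_canonical_ems(snps):
    """Keep (chromosome, position, ref, alt) tuples with EMS-type changes."""
    return [snp for snp in snps if is_canonical_ems(snp[2], snp[3])]


# Hypothetical SNPs, for illustration only.
example_snps = [
    ("Chr3", 1050, "G", "A"),
    ("Chr3", 2200, "A", "G"),
    ("Chr3", 3100, "c", "t"),
    ("Chr3", 4500, "G", "T"),
]
print(filter_canonical_ems(example_snps))
```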
3.2.10 Examination of Effects of Candidate Causal SNPs on Corresponding Gene Functions
1. Check the effects of the candidate SNPs for the causal mutations on their corresponding gene functions (e.g., are nonsynonymous or nonsense mutations induced? are intron
acceptor/donor sites disrupted?).
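For SNPs inside coding sequences, the check can be sketched by translating the reference and mutant codons with the standard genetic code. The function name and the three category labels are illustrative assumptions. The codons are assumed to be given on the coding strand and in frame; splice-site effects are not covered.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_codon_change(ref_codon, alt_codon):
    """Return 'synonymous', 'nonsense', or 'nonsynonymous'."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"  # a stop codon is introduced
    return "nonsynonymous"


# The two examples shown in Fig. 1, panel i.
print(classify_codon_change("ACG", "ACA"))  # Thr to Thr
print(classify_codon_change("CAA", "TAA"))  # Gln to stop
```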
3.2.11 Additional Analyses
If multiple allelic mutants exist, the following analyses are available
(see Note 24):
1. Because multiple allelic mutants presumably harbor causal
mutations in the same gene, extract SNPs which are induced in
the same genes.
2. Remove “background” SNPs that are identical in multiple allelic
mutants (see Note 25).
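The first of these analyses amounts to intersecting the sets of genes hit in each allelic mutant. The sketch below assumes that candidate SNPs have already been assigned to gene identifiers; the function name and the placeholder gene names are hypothetical.

```python
def genes_hit_in_all_alleles(genes_by_allele):
    """Return the genes carrying a candidate SNP in every allelic mutant."""
    gene_sets = [set(genes) for genes in genes_by_allele.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Placeholder gene names, for illustration only.
example_alleles = {
    "allele1": ["GENE_A", "GENE_B", "GENE_C"],
    "allele2": ["GENE_B", "GENE_D"],
    "allele3": ["GENE_B", "GENE_C", "GENE_E"],
}
print(genes_hit_in_all_alleles(example_alleles))
```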
3.3 Confirmation of the Extracted Candidate SNPs as Actual Mutations
3.3.1 Sanger Sequencing of Candidate Causal SNPs (See Note 26)
1. Amplify the region spanning candidate causal SNPs by genomic
PCR.
2. Conduct Sanger sequencing of the amplified fragments.
3.3.2 Evaluation of the Identified Mutations by Independent Criteria
1. For a final confirmation that the SNPs identified are causative
mutations, one or more of the following experiments should
be performed: (a) evaluation of T-DNA insertion lines, (b)
allelism tests by genetic crosses with preexisting mutants, and
(c) complementation tests by transformation of candidate
genes into the mutant.
4 Notes
1. In this chapter, the analysis of EMS-derived SNPs is described.
Using paired-end reads, small insertions/deletions (indels: 2–
20 nt) can be extracted by the CASAVA 1.7 software. Large structural variants such as indels, translocations, and inversions can be
detected using specialized software and suitable
genomic libraries (paired-end and/or mate-pair libraries).
Free or commercial software may be helpful to detect
such structural variants derived from ionizing, fast neutron,
and X-ray radiation. Examples are CLEVER used with the free mapping software
BWA (BWA: http://bio-bwa.sourceforge.net/; CLEVER:
https://code.google.com/p/clever-sv/) and AVADIS-NGS
(commercial: http://www.avadis-ngs.com/). When a
mutagen other than EMS is used (e.g., ionizing radiation), the
experimental conditions and subsequent analysis of data should
be established according to its effect on the chromosomes.
2. Schneeberger et al. [2], Cuperus et al. [6], and Uchida et al. [1]
used 500, 93, and 80 F2 individuals, respectively. For removal of
the background SNPs and for linkage analysis (see following steps),
it is recommended to sequence the parental accessions used for
crossing. The same procedure may be applicable to the identification of a dominant mutation with optional steps. In the
case of a semidominant mutation, F2 plants displaying the
homozygous phenotype are pooled and the genomic DNAs
are prepped as a bulk for genome sequencing. In the case of a
completely dominant mutation, F2 plants displaying a phenotype of interest are individually frozen and kept at −80 °C (several leaves from each plant would be enough) until the
phenotypic segregation of F3 populations derived from each
F2 individual can be examined. Then, F2 samples determined
to be homozygous at the causal locus by analysis of the F3
generation are pooled and the genomic DNAs are prepped as
a bulk for genome sequencing. Alternatively, following the
examination of phenotypic segregation of the F3 population,
homozygous F3 lines (e.g., all F3 plants derived from an F2
individual showing the mutant phenotype) could be pooled
for sequencing.
3. Without this step, a relatively large population of plastid-derived
genomes will be sequenced, leading to low efficiency of the
detection of short reads corresponding to the nuclear genome.
4. We use two DNeasy columns for DNA isolation from a bulked
pool of 80 F2 individuals (a total of 700 mg fine powder) and
combine the resulting DNA solutions.
5. Duty cycle, 10 %; intensity, 5; cycles/burst, 100; time, 60 s;
bath temperature, 4 °C. This cycle is repeated ten times. Allow
nearly 1 h to chill the water bath and 30 min for the
degas process.
6. We routinely use the DNA 1000 kit (Agilent). If the amount
of fragmented DNA is low, the High Sensitivity DNA kit (Agilent)
should be used on the Bioanalyzer. It is critical to equilibrate the kit
solutions at room temperature for 30 min before use.
7. The Mupid electrophoresis system can also be obtained from Helixx
Technologies, Inc. However, we believe that any agarose electrophoresis equipment could be used for this procedure.
8. We use 12.5 ng DNA as template in 50 μl reaction buffer and
12 PCR cycles for amplification.
9. If PCR produces extra bands deviating from the expected size,
an additional gel extraction step is recommended.
10. This step is critical for the achievement of maximum cluster
density in the sequencing flowcell. We routinely use the KAPA
Library Quantification Kit.
11. Data analysis for identification of causal SNPs consists of the following three steps: alignment of reads to the reference genome,
SNP calling, and SNP filtering. Different types of software for
these analyses are available, both commercial and free. We used
CASAVA ver. 1.7 for alignment and SNP calling, which is based
on Bayesian statistics, and custom perl scripts to filter the SNPs
(these can be freely downloaded at http://bsw3.naist.jp/plantglobal/mmb2011/snipSNP.html). Care must be taken if other
software is used, since the algorithms employed and appropriate
parameter settings may vary between different programs.
Alternatively, free software may be useful for mapping (e.g.,
Bowtie, http://bowtie-bio.sourceforge.net/index.shtml [12],
and Burrows-Wheeler Aligner (BWA), http://bio-bwa.sourceforge.net/ [13]) and SNP calling (e.g., SAMtools, http://samtools.sourceforge.net/ [14], and GATK,
http://www.broadinstitute.org/gsa/wiki/index.php/Home_Page [15]).
12. See the CASAVA manual for instructions and parameter
settings.
13. Calling of SNPs by CASAVA consists of two steps. First, the
allele call scores are calculated from the base calls and the alignment and read quality scores. Then, SNPs are called based on
the allele call score and read depth. The allele call score should
be larger than 10, and the coverage should be more than ×3.
14. The summary file shows the number of homozygous and heterozygous SNPs on each chromosome.
15. The order of filters can be changed and some filters can be
omitted.
16. “snipSNP” includes three perl scripts: (a) “extractSNP.pl”:
removes background SNPs; (b) “snpinGFF.pl”: extracts SNPs
within annotated genes; and (c) “stateSNP.pl”: counts numbers
of SNPs within intervals (length of intervals can be adjusted).
Although these perl scripts are optimized for the analysis
of CASAVA output, SNP lists in different formats are accepted
as input with setting options. See the manual of the perl scripts
for further details.
17. The parameters (path, file name) in the commands described
in this chapter are examples only and need to be changed as
appropriate.
18. Use the following command:
“perl extractSNP.pl -t /path/to/mutant_snps.txt -b /path/to/background_snps.txt”.
The output directory is created in the current directory. It
includes lists of mutant-specific SNPs and filtered-out SNPs.
19. Type the following command (“ChrN” is the name of the target chromosome in the GFF file; modify this parameter
depending on your target):
“perl snpinGFF.pl -t /path/to/mutant_snps.txt -g /path/to/annotation_information.gff -c ChrN”.
The lists of SNPs within annotated gene features (CDS,
5′UTR, and 3′UTR) are output.
To extract SNPs within intron donor/acceptor sites, type:
“perl snpinGFF.pl -t /path/to/mutant_snps.txt -g /path/to/annotation_information.gff -c ChrN -i exon”.
“snpinGFF.pl” gives information on intron donor/acceptor sites based on information about exons in the GFF file
used.
20. “snpinGFF.pl” is useful for the extraction of SNPs within other
features (e.g., pseudogenes) documented in the GFF file with
variable options.
21. If the GFF file does not include information on exons, it would
be helpful to make a list of intron donor/acceptor sites.
However, in such cases intron donor/acceptor sites adjacent
to UTRs will not be included.
22. Before using “stateSNP.pl,” split the SNPs in the “input SNPs
list” by the type of SNP; homozygous and heterozygous SNPs
are classified as “SNP_diff” and “SNP_het,” respectively, in
the list.
Sort and split in EXCEL or use the “grep” command in
UNIX:
“grep SNP_diff /path/to/target_snps.txt > target_homo_snps.txt”.
To count SNPs, type:
“perl stateSNP.pl -t /path/to/target_homo_snps.txt > homo_snps_count.txt”.
Calculate and plot the ratio of homozygous SNPs to heterozygous SNPs in EXCEL. If the reference accession, Col-0, is
used as the parental accession for mutagenesis, the number of SNPs derived from
the other accession used for crossing (e.g., Ler) should
decrease in the linked region. When a non-reference accession is employed for mutagenesis, this number should increase
in the region neighboring the causal mutation [1].
23. The causal SNP presumably is located in the narrowed-down
region but is not always found exactly at the trough or peak.
24. These filters applied to SNP data of multiple allelic mutants may
exclude the causal SNPs if the underlying causal mutation is
identical in the allelic mutants.
25. To remove background SNPs from multiple allelic mutants,
use each of the allelic mutants as virtual background accession
and remove background SNPs as described in Subheading 3.2.6.
Type: “perl extractSNP.pl -t /path/to/allele1_snps.txt /
-b /path/to/allele2_snps.txt”.
In this case, “allele1_snps.txt” and “allele2_snps.txt” are
used as target and background, respectively. A list of allele1-specific SNPs is output.
26. It is highly recommended to carry out Sanger sequencing to
remove false-positive SNPs.
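
The homozygous-to-heterozygous ratio described in Note 22 could also be computed outside EXCEL. The following is a minimal Python sketch, not a replacement for “stateSNP.pl”. It assumes each SNP is available as a (chromosome, position, type) record, where type is “SNP_diff” or “SNP_het”; the actual column layout of the SNP lists is not given here, so the record format, the 100-kb window size, and the function names are illustrative assumptions.

```python
from collections import Counter


def window_counts(rows, window=100_000):
    """Count SNP_diff and SNP_het records per (chromosome, window index) bin."""
    homo, het = Counter(), Counter()
    for chrom, pos, snp_type in rows:
        key = (chrom, int(pos) // window)
        if snp_type == "SNP_diff":
            homo[key] += 1
        elif snp_type == "SNP_het":
            het[key] += 1
    return homo, het


def homo_het_ratio(homo, het):
    """Return {bin: homozygous/heterozygous ratio}; None where no het SNPs exist."""
    ratios = {}
    for key in sorted(set(homo) | set(het)):
        ratios[key] = homo[key] / het[key] if het[key] else None
    return ratios


# Hypothetical records; real SNP lists may carry additional columns.
rows = [
    ("Chr1", 1200, "SNP_diff"),
    ("Chr1", 5300, "SNP_het"),
    ("Chr1", 150000, "SNP_diff"),
    ("Chr1", 160500, "SNP_diff"),
    ("Chr1", 170000, "SNP_het"),
    ("Chr1", 250000, "SNP_diff"),
]
homo, het = window_counts(rows)
ratios = homo_het_ratio(homo, het)
for (chrom, idx), ratio in ratios.items():
    print(chrom, idx * 100_000, ratio)
```

Windows without heterozygous SNPs are reported as None rather than as a division by zero; such windows are the ones to inspect first when looking for the linked region.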
Acknowledgements
The authors thank Dr. Taku Ohshima and Mrs. Eiko Nakamoto
(NAIST) for optimization of library preparation and GAIIx manipulation. We also thank Dr. Noriko Inada (NAIST) for the arrangement of the website to download our custom script.
References
1. Uchida N et al (2011) Identification of EMS-induced causal mutations in a non-reference
Arabidopsis thaliana accession by whole
genome sequencing. Plant Cell Physiol
52:716–722
2. Schneeberger K et al (2009) SHOREmap: simultaneous mapping and mutation identification by
deep sequencing. Nat Methods 6:550–551
3. Mokry M et al (2011) Identification of factors
required for meristem function in Arabidopsis
using a novel next generation sequencing fast
forward genetics approach. BMC Genomics
12:256
4. Marti L et al (2010) A missense mutation
in the vacuolar protein GOLD36 causes
organizational defects in the ER and aberrant
protein trafficking in the plant secretory pathway. Plant J 63:901–913
5. Laitinen RA et al (2010) Identification of a
spontaneous frame shift mutation in a non-reference Arabidopsis accession using whole genome
sequencing. Plant Physiol 153:652–654
6. Cuperus JT et al (2010) Identification of
MIR390a precursor processing-defective
mutants in Arabidopsis by direct genome
sequencing. Proc Natl Acad Sci USA
107:466–471
7. Austin RS et al (2011) Next-generation mapping of Arabidopsis genes. Plant J 67:715–725
8. Michelmore RW, Paran I, Kesseli RV (1991)
Identification of markers linked to disease-resistance genes by bulked segregant analysis: a
rapid method to detect markers in specific
genomic regions by using segregating populations. Proc Natl Acad Sci USA 88:9828–9832
9. Greene EA et al (2003) Spectrum of chemically induced mutations from a large-scale
reverse-genetic screen in Arabidopsis. Genetics
164:731–740
10. Holt RA, Jones SJ (2008) The new paradigm
of flow cell sequencing. Genome Res
18:839–846
11. Pang AW et al (2010) Towards a comprehensive structural variation map of an individual
human genome. Genome Biol 11:R52
12. Langmead B et al (2009) Ultrafast and memory-efficient alignment of short DNA sequences to
the human genome. Genome Biol 10:R25
13. Li H, Durbin R (2009) Fast and accurate
short read alignment with Burrows–Wheeler
transform. Bioinformatics 25:1754–1760
14. Li H et al (2009) The Sequence Alignment/
Map format and SAMtools. Bioinformatics
25:2078–2079
15. DePristo MA et al (2011) A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet
43:491–498
Chapter 15
Arabidopsis Transformation with Large Bacterial
Artificial Chromosomes
Jose M. Alonso and Anna N. Stepanova
Abstract
The study of a gene’s function requires, in many cases, the ability to reintroduce the gene of interest or its
modified version back into the organism of choice. One potential caveat of this approach is that not only
the coding region but also the regulatory sequences of a gene should be included in the corresponding
transgenic construct. Even in species with well-annotated genomes, such as Arabidopsis, it is nearly impossible to predict which sequences are responsible for the proper expression of a gene. One way to circumvent this problem is to utilize a large fragment of genomic DNA that contains the coding region of the
gene of interest and at least 5–10 kb of flanking genomic sequences. To facilitate these types of experiments, libraries harboring large genomic DNA fragments in binary vectors have been constructed for
Arabidopsis and several other plant species. Working with these large clones, however, requires some special precautions. In this chapter, we describe the experimental procedures and extra cautionary measures
involved in the identification of the clone containing the gene of interest, its transfer from E. coli to
Agrobacterium, and the generation, verification, and analysis of the corresponding transgenic plants.
Key words TAC, Transformation, Arabidopsis, T-DNA, DNA deletions, Electroporation, Agrobacterium
1 Introduction
The ability to introduce specific sequences into the genome of an
organism is an essential tool to dissect the function not only of the
individual genes but also of the pathways and networks in which
these genes act [1]. Perhaps, the two most common applications of
these types of experimental approaches are (1) the phenotypic
complementation of a mutant by the wild-type copy of the corresponding dysfunctional gene and (2) the addition of tags or other
types of sequence alterations to the gene of interest to facilitate
subsequent downstream functional analysis (e.g., to investigate
subcellular localization or spatial-temporal expression patterns). In
an ideal situation, these modifications would involve targeted
replacement of the endogenous sequences by means of homologous recombination. In most plant species, including Arabidopsis,
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_15, © Springer Science+Business Media New York 2014
Table 1
List of large-insert binary-vector libraries readily available for Arabidopsis

Vector name   # Clones   # Clones mapped   DNA source   Available from
pYLTAC17      36,864     8,223             Col          Genome Enterprise “http://www.genome-enterprise.com/”
pYLTAC7       10,749     110               Col          ABRC “http://abrc.osu.edu/”
PAC P1        9,080      300               Col          ABRC “http://abrc.osu.edu/”
BIBAC2        11,520     –                 Ler          ABRC “http://abrc.osu.edu/”

this approach is, however, not practical due to the extremely low
frequency of homologous recombination events during the
integration of foreign DNA into the plant genome. The most common alternative is, therefore, the Agrobacterium-mediated random
integration of Transfer-DNA (T-DNA) in the genome of a plant.
A wide variety of vectors compatible with this Agrobacterium
transformation have been developed for a large array of different
purposes [2–4]. For example, binary vectors have been engineered
to carry large DNA fragments [5] ranging in size from tens to up
to a few hundred kilobases. In this chapter, we will focus on working with transformation-competent bacterial artificial chromosomes, or TACs. Importantly, several Arabidopsis genomic libraries
have been constructed using these vectors and are available to the
plant community (Table 1). These specialized libraries are ideal for
complementation studies [5], where large genomic intervals can
be easily covered using only a handful of these large clones.
However, the utility of these libraries is not limited to the complementation studies. Several additional applications for these large
clones in gene functional studies have been recently reported in
Arabidopsis [6]. Precise modifications of specific sequences in the
clone, such as the insertion of a fluorescent tag in a particular location of a gene of interest or the introduction of a desired single
nucleotide change, have significantly widened the potential utility
of these types of genomic libraries [6]. There are several key advantages of using these large clones, both in the complementation
studies (where the greater size of these clones makes it possible to
scan larger regions of the genome) and in the gene functional
approaches (where the presence of large fragments of DNA flanking the gene of interest ensures the presence of all regulatory
sequences). Nevertheless, there are also a few drawbacks that have
precluded a more widespread use of these types of libraries. For
example, for only some of these libraries (Table 1) is the exact
sequence content of each clone known and the coverage sufficiently high to make the general use of the library practical.
Another reason for the limited use of these types of clones is the
low efficiency and inadequacy of the standard protocols (that were
Arabidopsis Transformation with BACs
originally established and optimized for the manipulation of smaller
binary vectors) when directly applied to the much larger TAC
clones. Finally, presumably as a consequence of the two previous
points, to date there is only a handful of examples in the literature
of the successful utilization of these genomic libraries in Arabidopsis,
making it difficult for the general research community to assess the
true potential and limitations of these tools. In this chapter, we
describe the experimental procedures and common problems that
may arise for each step of the protocol, from the transfer of the
clones from E. coli DH10B to Agrobacterium to the transformation of Arabidopsis and the analysis of the resulting transgenic
plants (Fig. 1). All of the examples provided are based on the use
of the JAtY library of TAC clones. This library was chosen for three
main reasons: the library is publicly available, the genomic DNA
comes from the Columbia accession, and, most importantly, end-sequenced clones cover more than 90 % of the
Arabidopsis genome [6].
2 Materials
2.1 Transfer of TAC Clones from E. coli to Agrobacterium
1. Luria-Bertani (LB) broth: 10 g/L tryptone, 5 g/L yeast
extract, 10 g/L NaCl. Sterilize by autoclaving.
2. LB-agar medium: LB broth supplemented with 15 g/L agar.
Sterilize by autoclaving.
3. 15-mL sterile plastic culture tubes.
4. Temperature-controlled shaker incubator.
5. Kanamycin (100 mg/mL stock in water). Sterilize by filtering.
6. Gentamicin (25 mg/mL stock in water). Sterilize by filtering.
7. Sterile plastic Petri dishes (100 × 15 and 150 × 15 mm).
8. SOB medium: 20 g/L tryptone, 5 g/L yeast extract, 0.5 g/L
NaCl. Adjust pH to 7.5 with 1 M KOH. Sterilize by
autoclaving.
9. 10 % v/v glycerol. Sterilize by autoclaving.
10. Refrigerated centrifuge (Sorvall RC-5B) and rotor (Sorvall
SS-34).
11. Transparent 50-mL Nalgene polypropylene tubes (3118-0050
Oak Ridge). Sterilize by autoclaving.
12. 9″-long glass Pasteur pipets. Sterilize by autoclaving.
13. Alkaline Lysis Solution I: 50 mM glucose, 10 mM EDTA pH 8.0,
25 mM Tris-HCl pH 8.0, 4 mg/mL lysozyme Sigma L-6876.
14. Alkaline Lysis Solution II: 0.2 N NaOH, 1 % SDS.
15. Alkaline Lysis Solution III: 3 M potassium, 5 M acetate, pH
4.8. For 100 mL, weigh 29.5 g of potassium acetate, bring to
88.5 mL with diH2O, and add 11.5 mL of glacial acetic acid.
[Fig. 1 flowchart: select a TAC clone containing the gene of interest; order the TAC clone from the stock center; streak the E. coli DH10B strain harboring the TAC; perform colony PCR with gene-specific primers; grow a PCR-positive colony; isolate TAC DNA; electroporate TAC into Agrobacterium; select transformants; perform colony PCR with gene-specific primers; grow Agrobacterium in liquid or solid media; collect Agrobacterium cells in glucose solution; transform Arabidopsis; select transformants; test the integrity of the T-DNA by PCR (T-DNA map: RB, Basta R, SacB, LB; primers GeneF, SacBF, SacBR; primer pairs GeneF+SacBR and SacBF+SacBR, ~800 bp; lanes −C, +C, L1–L7); functionally characterize the positive transgenic lines (L4 and L5)]
Fig. 1 Schematic representation of the steps involved in the gene functional characterization using JAtY TAC
clones. The procedure starts with the selection of the TAC clone containing the gene of interest among a series
of overlapping TACs that span a determined chromosomal region using the web tools at “http://Arabidopsislocalizome.org/”
and “http://atidb.org.” Once the TAC(s) have been identified, they can be ordered
from the corresponding stock centers; in the case of the JAtY clones, they should be ordered from the Genome
Enterprise “http://www.genome-enterprise.com.” The identity of the received clones is then tested using
gene-specific primers designed for a gene predicted to be contained in the TAC of interest, ideally, positioned
only 1 or 2 kb away from the LB end of the clone. Next, TAC DNA is extracted from E. coli and transferred to
Agrobacterium cells. After confirming the presence of the desired TAC clone, the Agrobacterium strain carrying
the TAC clone is propagated either in liquid or in solid media. Agrobacterium cells are resuspended in a glucose/detergent solution and used for floral dip transformation. After selecting Basta-resistant plants, SacB and
gene-specific primers are used to test the integrity of the T-DNA inserted in the plant genome. By using the
SacBF and SacBR primers, a PCR product of approximately 800 bp should be obtained. Although this PCR test
has been shown to be a good indication of the integrity of the T-DNA, an additional PCR with the gene-specific
primer and a SacB primer should be carried out to rule out possible contamination. In the example illustrated in the
figure, line L2 corresponds to a contamination originating from a transgenic (Basta-resistant) plant that carries
the SacB gene but does not harbor the correct TAC clone. Those plants that are Basta resistant but do not carry
the SacB gene (lines L1, L3, L6, L7) are likely to harbor truncated T-DNAs. Only the transgenic lines that have
been confirmed with both sets of primers (lines L4 and L5) have incorporated the full-length TAC of interest in
their genome and can then be used in the desired gene functional studies
16. 95–100 and 70 % ethanol.
17. Tabletop centrifuge with a rotor for microcentrifuge tubes.
18. Microcentrifuge plastic tubes (1.5 mL).
19. Electroporator with the capability to control resistance, capacitance, and voltage (Bio-Rad gene pulser and pulse controller
units or equivalent).
20. Electroporation cuvettes (1 mm gap).
21. A pair of gene-specific primers complementary to the region of
TAC near the left border (LB) of the T-DNA (see Note 1).
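
The recipe for Alkaline Lysis Solution III (item 15) can be checked with a short calculation. This is a sketch under stated assumptions: the molar masses (potassium acetate 98.14 g/mol, acetic acid 60.05 g/mol) and the density of glacial acetic acid (1.049 g/mL, treated as pure) are standard reference values that are not given in the text, and the function name is illustrative.

```python
KOAC_G_PER_MOL = 98.14   # potassium acetate (assumed reference value)
ACOH_G_PER_MOL = 60.05   # acetic acid (assumed reference value)
ACOH_G_PER_ML = 1.049    # density of glacial acetic acid, treated as pure


def solution_iii_molarity(koac_g=29.5, acoh_ml=11.5, final_ml=100.0):
    """Return (potassium M, total acetate M) for the stated recipe."""
    litres = final_ml / 1000.0
    potassium_mol = koac_g / KOAC_G_PER_MOL
    acetic_mol = acoh_ml * ACOH_G_PER_ML / ACOH_G_PER_MOL
    potassium_molar = potassium_mol / litres
    # Acetate comes from both the potassium salt and the added acetic acid.
    acetate_molar = (potassium_mol + acetic_mol) / litres
    return potassium_molar, acetate_molar


potassium_molar, acetate_molar = solution_iii_molarity()
print(f"potassium: {potassium_molar:.2f} M, acetate: {acetate_molar:.2f} M")
```

With the amounts listed in item 15, the result is close to 3 M for potassium and 5 M for total acetate.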
2.2 Plant Transformation and Selection of Transgenic Plants
1. Soil: 50 % Sun Gro (Sunshine), 50 % Fafard 4P-Mix or
equivalent.
2. Germination trays: 21″ × 11″ × 1 ¼″, Hummert International
or equivalent.
3. Propagation dome: 21″ × 11″ Hummert International or
equivalent.
4. Square plastic pots: 4 × 6 pots from Hummert International or
equivalent.
5. Plant transformation solution: 5 % glucose, 200 μL/L Silwet
L-77.
6. Seed sterilization solution: 50 % Bleach, 100 μL/L Triton
X-100 (the detergent prevents seeds from clumping and aggregating together).
7. AT media: 1× Murashige & Skoog (MS) salts, 1 % sucrose,
adjust pH with 1 M KOH to 6.0, then add Bacto Agar (Difco)
to 0.7 % final concentration and autoclave.
8. Disposable 50-mL centrifuge tubes.
9. Basta resistance selection media: prepare AT media, sterilize by
autoclaving, cool to ~45 °C, add phosphinothricin [PPT, glufosinate ammonium, GoldBio] to the final concentration of
20 mg/L (the stock can be prepared at 20–100 mg/mL in
water and filter sterilized), pour media at 50–60 mL per
150 mm Petri dish.
10. Top agarose: 0.7 % low melting point agarose supplemented
with 20 mg/L phosphinothricin and 300 mg/L Timentin (to
inhibit growth of Agrobacterium on T1 transgenics).
11. Fine-pointed forceps.
2.3 Analysis of Transgenics
1. CTAB buffer: 1.4 M NaCl, 20 mM EDTA pH 8.0, 100 mM
Tris-HCl pH 8.0, 3 % CTAB (cetyltrimethylammonium
bromide).
2. Homemade scoop (cut off the bottom of a microfuge tube
with scissors or a razor blade and glue the bottom piece to the
hot tip of a glass Pasteur pipet).
3. Ivoclar Vivadent shaker (or equivalent).
4. 1-mm diameter glass beads, BioSpec.
5. SacB primers (SacBF 5′-TGTAAAACAAGCCACAGTTC-3′
and SacBR 5′-AATAAAGATTCTTCGCCTTG-3′).
6. General PCR reagents: dNTPs, 10× PCR buffer with Mg, Taq
polymerase.
7. Thermocycler.
8. Gel electrophoresis setup.
9. 10 mg/mL ethidium bromide.
3 Methods
3.1 Transfer of TAC Clones from E. coli to Agrobacterium
1. Identify the JAtY TAC clone containing the gene of interest
or corresponding to the desired genomic region using the
tools at the Arabidopsis thaliana Integrated Database
http://atidb.org/cgi-perl/gbrowse/atibrowse/ or at
http://Arabidopsislocalizome.org/.
2. Order the TAC of interest from the Genome Enterprise,
http://www.genome-enterprise.com.
3. Re-streak the bacterial strain received on LB-agar supplemented with 25 mg/L kanamycin and incubate overnight at
37 °C.
4. Confirm that the strain harbors the TAC clone of interest by
colony PCR using gene-specific primers: resuspend a single
colony in 20 μL of water and use 2 μL of cells as a template in
a 20 μL PCR reaction mix (see Note 2).
5. (Day 1) Streak Agrobacterium strain GV3101 (pMP90) [7] or
equivalent on LB-agar plates supplemented with 25 mg/L
gentamicin and incubate at 28 °C for 2–3 days (see Note 3).
6. (Day 3) Streak JAtY E. coli clone on LB-agar plates supplemented with 25 mg/L kanamycin and incubate at 37 °C overnight (see Note 3).
7. (Day 4) Inoculate 3 mL of LB supplemented with gentamicin
(25 mg/L) in a 15-mL sterile plastic culture tube with a mixture of 2–3 colonies of actively growing Agrobacterium cells.
Incubate at 28 °C overnight with continuous shaking (see
Note 4).
8. (Day 4) Inoculate 3 mL of LB supplemented with kanamycin
(25 mg/L) in a 15-mL sterile plastic culture tube with a single
colony of actively growing E. coli cells harboring the JAtY
clone of interest. Incubate at 37 °C overnight with continuous
shaking (see Note 4).
9. (Day 5) Inoculate 50 mL of SOB supplemented with gentamicin
(25 mg/L) in a 250-mL Erlenmeyer flask with 1 mL of the
overnight Agrobacterium culture. Incubate at 28 °C for 4–5 h
until the culture reaches OD600 of ~0.6.
10. (Day 5) While the Agrobacterium culture is growing, isolate
JAtY TAC DNA from the overnight culture. Prepare fresh
Alkaline Lysis Solutions I and II. Chill Solution I on ice and
keep Solutions II and III (with Solution III made in advance)
at room temperature. Transfer 1.5 mL of the overnight culture
to a microcentrifuge tube. Spin at 14,000 rpm (20,817 × g) for
1 min in a tabletop microcentrifuge. Aspirate all the liquid.
Add 100 μL of Solution I and resuspend cells by pipetting up
and down the solution until all the cells are in suspension. Add
200 μL of Solution II. Mix by gently inverting the tube 8–10
times. Immediately add 150 μL of Solution III and again mix
by gently inverting the tube 8–10 times. Centrifuge for 6 min
at 14,000 rpm (20,817 × g) in a tabletop microcentrifuge.
Transfer supernatant to a new 1.5-mL centrifuge tube with a
1 mL pipetman (see Note 5).
11. (Day 5) Slowly add 1 mL of 95–100 % room-temperature ethanol, mix by gently inverting the tube 8–10 times, and spin for
6 min at 14,000 rpm (20,817 × g). Remove supernatant by
aspiration, and wash pellet once with 70 % ethanol at room
temperature. Aspirate supernatant and air-dry the pellet for
~2–3 min. Add 30 μL of diH2O and allow the DNA to sit and
dissolve for 2–3 h at room temperature (do not vortex or pipet
the DNA to prevent mechanical damage).
12. (Day 5) Place freshly grown Agrobacterium culture (that has
reached OD600 of ~0.6) on ice for 5–10 min before starting the
preparation of electrocompetent cells. Transfer the entire
Agrobacterium culture to a prechilled 50-mL Nalgene centrifuge tube. Spin cells at 4 °C for 5 min at 2,200 × g in a Sorvall
SS-34 rotor (or equivalent). Make sure that the centrifuge and
the rotor have been precooled to 4 °C. Quickly pour off supernatant by inverting the tube. Resuspend cells by gently stirring
the tube in an ice-cold water bath. Fill the tube with sterile
ice-cold 10 % glycerol. Centrifuge at 4 °C for 10 min at 5,000 × g
in the Sorvall SS-34 rotor. Quickly pour off supernatant by
inverting the tube. Resuspend cells by gently stirring the tube
in an ice-cold water bath. Fill the tube with ice-cold 10 % glycerol.
Centrifuge at 4 °C for 10 min at 5,000 × g in the Sorvall SS-34
rotor. Remove the 10 % glycerol by aspiration with a glass
Pasteur pipet. Resuspend the cells in the 10 % glycerol remaining on the tube walls, keeping the cells on ice at all times.
13. (Day 5) Centrifuge the TAC DNA (that has been dissolving
at room temperature) for 5 min at 14,000 rpm (20,817 × g)
and transfer 7 μL to a new 1.5-mL centrifuge tube. Place the
DNA on ice.
14. Place 1-mm-gap electroporation cuvette on ice for 2–3 min.
15. Transfer 40 μL of Agrobacterium competent cells to the tube
with 7 μL of TAC DNA.
16. Immediately transfer the mix of DNA and cells to the electroporation cuvette.
17. Electroporate cells at 1,250 V, 100 Ω, and 25 μF [8].
18. Add 1 mL of room-temperature LB broth into the electroporation cuvette and then transfer the cell suspension to a new
15-mL culture tube.
19. Recover cells for 1 h 30 min at 28 °C in the shaker incubator
at 200 rpm.
20. Transfer the culture to a 1.5-mL centrifuge tube and collect
cells by spinning for 1 min at 14,000 rpm (20,817 × g) at room
temperature.
21. Remove most of the liquid, leaving ~50–100 μL of the LB, and
resuspend the cells in this leftover media by pipetting.
22. Spread cells on an LB-agar plate supplemented with kanamycin
(25 mg/L), allow the media to get fully absorbed, and place the
plates at 28 °C. Colonies will start appearing after 3–5 days.
23. Test Agrobacterium transformants for the presence of the
desired TAC using the gene-specific primers (see Note 6).
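
The microcentrifuge steps above give both a speed and a relative centrifugal force, 14,000 rpm (20,817 × g). For a rotor of a different radius, the two can be interconverted with the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm², with r in centimeters. The sketch below is illustrative: the 9.5 cm radius is not stated in the protocol and is inferred here from the quoted rpm/RCF pair, and the function names are hypothetical.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (× g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Speed (rpm) needed to reach a given RCF with a given rotor radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


# Radius inferred from the 14,000 rpm / 20,817 × g pair (an assumption).
inferred_radius_cm = 9.5
print(round(rcf_from_rpm(14_000, inferred_radius_cm)))
# Speed required for the same force in a hypothetical rotor with an 8.4 cm radius.
print(round(rpm_from_rcf(20_817, 8.4)))
```

Matching the force rather than the speed is the safer choice when a different microcentrifuge is used.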
3.2 Plant Transformation and Selection of Transgenics
1. Surface-sterilize seeds by placing them in the seed surface sterilization solution for 10 min and occasionally inverting/shaking the tubes to fully resuspend the seeds. After 10 min, allow
the seeds to settle by gravity, remove the bleach solution by
aspiration, and wash seeds thoroughly 3 times with sterile
water, each time fully resuspending the seeds (for small
amounts [i.e., ~200] of seeds this process can be done in 1.5-mL microcentrifuge tubes).
2. Resuspend the seeds in melted and precooled sterile 0.7 % low
melting point agarose in water, and plate by spreading the
seeds (using a 200 μL pipetman with a sterile wide-bore tip) on
the surface of AT media plates supplemented with 20 mg/L
PPT (see Note 7).
3. Stratify the seeds in the plates at 4 °C for 3 days to equalize
germination. After the cold treatment, light-treat the plates
with seeds for about 2 h at room temperature to improve germination. Place the plates with seeds horizontally in a 22 °C
dark incubator. After about 72 h, transfer the plates to a growth
chamber with constant light for 3–5 days before transplanting
individual seedlings to soil with forceps. With the back of the
forceps, make half-centimeter-deep holes in the moist soil,
one hole in each corner of the pot and one in the middle. Place
each seedling in a hole so that the root, but not the cotyledons, is
under the soil surface when you close the hole with the back of
the forceps. Cover the tray with the transparent propagation
dome (see Note 8).
4. Grow the plants under a 16 h light/8 h dark cycle at 20 °C.
After ~2–3 weeks (when plants are starting to bolt) gradually
remove the propagation dome by shifting it ~2 cm to one side
the day before you plan to remove the dome completely.
5. Six days before plant transformation streak the Agrobacterium
strain harboring the TAC of interest on a 100-mm LB plate
supplemented with kanamycin and gentamicin, and incubate
the culture at 28 °C. After three days collect the cells corresponding to 5–10 colonies, resuspend them in 300 μL of LB
and spread the mixture in a large 150-mm LB plate supplemented with kanamycin and gentamicin. Prepare 2–3 plates
per clone.
6. Collect the Agrobacterium cells from 2–3 large plates by scraping the cells with the plastic tip of a 200-μL pipetman that has
been bent about 1 cm from the thinner end into an L shape.
All the cells from the 2–3 plates are resuspended in about
300 mL of transformation solution (see Note 9).
7. Pour the Agrobacterium cells resuspended in the transformation solution into a 250 mL glass beaker that is wide enough
to allow all of the inflorescences from one pot (5 plants) to fit
in, but small enough so the plastic pot does not fall inside the
solution. Take a pot of plants, carefully invert it upside down,
so the soil and plants do not detach from the plastic pot.
Submerge all of the inflorescences into the Agrobacterium suspension, and after a few seconds, lift the pot and dip the inflorescences again.
8. Place the pots with the dipped plants in a horizontal position
in a clean plastic tray and cover it with the propagation dome
(see Note 10). Transfer the tray with the plants back to the
growth chamber.
9. The day after transformation shift the propagation dome about
1 in. to the side, and the day after remove it completely and
return the pots with the plants to a vertical position and water
the plants if necessary (see Note 11).
10. Continue watering the plants until they finish flowering and
the seed pods start to dry. Let the plants dry completely before
collecting the seeds.
11. Collect the seeds by carefully laying the plants on their side on
a clean sheet of paper and help release the seeds from the siliques
by gently rubbing the dry siliques with the fingertips. Use a
plastic mesh to clean the seeds from the plant and soil debris.
12. Transfer about 600 mg of seeds to a 50-mL conical plastic
tube. Sterilize seeds for 10–15 min using 30–40 mL of seed
sterilization solution. Make sure the seeds are fully resuspended.
Invert/shake the tubes with seeds occasionally during the
10–15 min of sterilization. Let the seeds sediment and remove
as much bleach solution as possible using a vacuum aspirator.
13. Wash the seeds 3–5 times with ~50 mL sterile diH2O. After
each wash, remove as much water as possible.
14. To the seed suspension in diH2O, add PPT and Timentin to
20 mg/L and 300 mg/L final concentration, respectively. For
example, if the residual volume of seeds in water is 5 mL, add
5 μL each of the 20 mg/mL PPT and 300 mg/mL Timentin stocks. Timentin
inhibits the growth of Agrobacterium that survived bleach sterilization under the seed coat.
15. Cold-treat the tubes with seeds at 4 °C for 2–3 days to equalize
germination.
16. Equilibrate the tubes with seeds to room temperature for
15–30 min and add to the seed suspension melted, precooled
0.7 % top agarose in water (see Note 12).
17. Use plastic single-use 10-mL pipettes to uniformly distribute
the seed/agarose suspension on the top of AT plates supplemented with 20 mg/L PPT. Plate up to 8,000 seeds (~0.2 g
dry weight) resuspended in 5–7 mL of top agarose per each
150-mm Petri plate containing 50–60 mL AT media supplemented with 20 mg/L PPT.
18. Put the plates in the light for 1–2 h at room temperature to
improve germination and then place the plates in the dark
incubator at 22ºC for 3 days.
19. After 3 days in the dark, move the plates to the light for 2–5
days. Check the plates periodically. Basta-resistant plants
(transformants) will develop green-colored cotyledons upon
light exposure. Sensitive plants (untransformed) will remain
bleached or will fail to germinate altogether.
20. Transplant Basta-resistant plants to soil and propagate (see
Note 13).
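
The stock volumes in step 14 follow from C1 × V1 = C2 × V2. The sketch below reproduces that arithmetic for the PPT and Timentin example and for a kanamycin plate medium; the function name is illustrative, and the small volume contributed by the stock itself is ignored, which is an approximation.

```python
def stock_volume_ul(final_mg_per_l, stock_mg_per_ml, final_volume_ml):
    """Microliters of stock needed to reach a final concentration (C1*V1 = C2*V2).

    The volume added by the stock itself is neglected.
    """
    final_mg_per_ml = final_mg_per_l / 1000.0
    stock_ml = final_mg_per_ml * final_volume_ml / stock_mg_per_ml
    return stock_ml * 1000.0


# Step 14 example: 5 mL of seed suspension.
ppt_ul = stock_volume_ul(final_mg_per_l=20, stock_mg_per_ml=20, final_volume_ml=5)
timentin_ul = stock_volume_ul(final_mg_per_l=300, stock_mg_per_ml=300, final_volume_ml=5)
# 1 L of LB-agar with 25 mg/L kanamycin from the 100 mg/mL stock.
kan_ul = stock_volume_ul(final_mg_per_l=25, stock_mg_per_ml=100, final_volume_ml=1000)
print(ppt_ul, timentin_ul, kan_ul)
```

Both additives in the step 14 example come out at about 5 μL, in agreement with the text, and 1 L of kanamycin medium needs about 250 μL of the 100 mg/mL stock.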
3.3 Analysis of the Transgenics
1. Place 100 μL of 1-mm glass beads into microcentrifuge tubes
using a homemade scoop.
2. Harvest one healthy leaf of about 2 cm in length into microfuge
tubes prefilled with ~100 μL glass beads, wiping off the forceps
between individuals (see Note 14).
3. Store tissues at −20 or −80 °C until needed or directly proceed
to the next step.
4. Freeze samples in liquid nitrogen by resting them on the
surface of a foil cup partially submerged in liquid nitrogen.
5. Grind frozen samples for 5–6 s in a Vivadent shaker.
6. Add 250 μL of CTAB buffer.
7. Grind for 5–6 s in a Vivadent shaker, place samples in a rack at
room temperature while shaking the rest of the samples.
8. Incubate the entire rack of samples for 30 min at 65 °C.
9. Cool samples to room temp for ~10 min.
10. Add 250 μL of chloroform.
11. Mix the samples by vigorously shaking the tubes.
12. Spin at 14,000 rpm (20,817 × g) for ~10 min at room
temperature.
13. Transfer 200 μL of upper phase into a tube prefilled with
250 μL of isopropanol.
14. Mix samples by inversion 3–4 times.
15. Spin at 14,000 rpm (20,817 × g) for ~10 min at room
temperature.
16. Aspirate supernatant being careful not to touch and suck out
the DNA pellet.
17. Wash pellet with 70 % EtOH.
18. Spin at 14,000 rpm (20,817 × g) for ~10 min at room
temperature.
19. Aspirate supernatant being careful not to touch and suck out
the DNA pellet.
20. Air-dry the pellet for about 10 min.
21. Resuspend DNA in 100–400 μL of deionized H2O. Shake in
the Vivadent shaker for 5 s and spin to collect any insoluble
material on the bottom of the tube. Use 1–2 μL DNA per
10–20 μL PCR reaction.
22. Test for the integrity of the T-DNA using the SacBF and SacBR
primers. The presence of an ~800 bp band is a good indicator
of a complete copy of the T-DNA in the plant genome (Fig. 1)
(see Note 15).
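
The interpretation of the two PCR tests (SacBF+SacBR, and a gene-specific primer paired with a SacB primer), as described in the legend of Fig. 1, can be written as a small decision rule. The sketch below only restates that logic; the function name and the category labels are illustrative, and the band calls for lines L1 to L7 are taken from the example in Fig. 1.

```python
def classify_line(basta_resistant, sacb_band, gene_sacb_band):
    """Classify a candidate line from Basta selection and two PCR results."""
    if not basta_resistant:
        return "untransformed"
    if not sacb_band:
        # Resistant but no ~800 bp SacB product.
        return "likely truncated T-DNA"
    if not gene_sacb_band:
        # SacB present but the gene-specific test fails.
        return "possible contamination with a different TAC"
    return "full-length TAC of interest"


# Band calls as in the Fig. 1 example: (SacB PCR, gene-specific + SacB PCR).
fig1_calls = {
    "L1": (False, False), "L2": (True, False), "L3": (False, False),
    "L4": (True, True), "L5": (True, True),
    "L6": (False, False), "L7": (False, False),
}
results = {line: classify_line(True, *calls) for line, calls in fig1_calls.items()}
keep = sorted(line for line, label in results.items()
              if label == "full-length TAC of interest")
print(keep)
```

Only the lines that pass both tests are carried forward to functional characterization.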
4 Notes
1. The F (forward) sequence in the ATIDB database (see below)
corresponds to the Arabidopsis genomic sequence adjacent to
the RB side of the T-DNA, whereas the R (reverse) sequence
corresponds to the LB side of the T-DNA.
2. Ideally, the gene-specific primers should be designed complementary to the 1–2 kb region of Arabidopsis genomic DNA
closest to the LB. This is easy to determine as the TAC-end
sequences in the ATIDB labeled as “R” were obtained by
sequencing from the LB side of the vector.
3. It is important to start both the Agrobacterium culture to prepare electrocompetent cells and the E. coli culture to isolate
TAC DNA from fresh actively growing colonies. Starting the
cultures from older cells or colonies that have been stored in
the fridge will reduce the efficiency of transformation. Other
standard laboratory Agrobacterium strains can be used, but it
is desirable that they are RecA− to avoid potential rearrangement problems in the genomic DNA.
4. The Agrobacterium and the E. coli overnight cultures can be
incubated in the same shaker at 32 °C if desired.
5. Be very gentle when pipetting solution containing the TAC
clones. Mechanical damage of these large DNA molecules will
introduce nicks in the DNA causing them to lose the supercoiled conformation and making electroporation extremely
inefficient.
6. Colony PCR of primary transformants is prone to false positives,
probably due to the presence of trace amounts of TAC DNA
used in the transformation. Colonies giving a strong amplification with the gene-specific primers should be re-streaked in an
LB plate supplemented with kanamycin (25 mg/L) and individual colonies tested again by PCR. Colonies that pass this second test should be considered true positives.
7. We typically use 80–100 μL of top agarose per up to 100 seeds
and spread the entire volume per 1/10 sector or a larger area
of a standard 100 mm Petri dish.
8. It is very important to prevent any damage to the seedlings
with the forceps. By transplanting seedlings pre-germinated in
plates in this manner, it is possible to select seedlings that germinated at the same time and look evenly healthy. This also
allows a very uniform distribution of the plants in the soil pots.
Using this transplanting method, healthy plants of uniform
size and synchronized bolting time can be obtained, which is
crucial for achieving good plant transformation efficiency.
9. It is very important to use glucose instead of sucrose in the
plant transformation solution as many JAtY clones are able to
express the SacB gene in Agrobacterium (even if they cannot in
E. coli) and the SacB protein can convert sucrose to a toxic
product. Therefore, sucrose in the transformation solution
may make Agrobacterium sick and result in a dramatic reduction of the plant transformation efficiency.
10. It is important to keep the plants covered immediately after
dipping to maintain high humidity.
11. It is important to transition the plants from high humidity to a
normal environment gradually to avoid damage to the young
flower buds.
12. Be careful not to use hot top agarose, as it will kill the seeds.
On the other hand, if the agarose is too cool, it will solidify
when mixed with the room-temperature seed suspension and
make clumps. Use 2–3 volumes of top agarose per each seed
suspension volume. For example, add 10–15 mL of 0.7 % top
agarose to 5 mL seed suspension.
13. The plant transformation efficiency with the JAtY clones,
although highly variable, is significantly lower than that
obtained with regular binary vectors. It is a good idea to determine the transformation efficiency by plating ~10,000 seeds
on a single 150-mm plate. When estimating the number of
lines obtained in a transformation experiment, one needs to
keep in mind that up to 75 % of the resistant plants may have
truncated T-DNAs.
14. The presence of senescent petals on the surface of the leaf or
poor cleaning of the forceps between samples may cause PCR
false positives.
15. It is important to use as a negative control DNA from an
untransformed wild-type plant. In our experience [6] the presence of the SacB in plants that are Basta resistant is diagnostic
of the presence of a whole T-DNA copy in the genome of the
plant. This test, however, cannot discriminate between different TAC clones; thus cross-contamination
from plants transformed with a different TAC clone will still
result in a positive SacB amplification in a Basta-resistant plant.
A gene-specific primer for a sequence close to the LB end of
the TAC clone and the SacBF primer could be used to determine the presence of the T-DNA corresponding to a specific
TAC clone (Fig. 1).
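The arithmetic behind Note 13 can be sketched as follows. This is a minimal illustration, not part of the published protocol: the function name and the count of 20 resistant seedlings are hypothetical, and the 75 % figure is used as the upper bound stated in the note, so the result is a lower estimate of lines carrying an intact T-DNA.

```python
def estimate_transformation(resistant_seedlings, seeds_plated, truncated_fraction=0.75):
    """Return (transformation efficiency, lower estimate of intact T-DNA lines).

    truncated_fraction is the worst case from Note 13 (up to 75 % of the
    resistant plants may carry truncated T-DNAs).
    """
    if seeds_plated <= 0:
        raise ValueError("seeds_plated must be positive")
    efficiency = resistant_seedlings / seeds_plated
    intact_min = resistant_seedlings * (1 - truncated_fraction)
    return efficiency, intact_min


# Hypothetical count: 20 resistant seedlings on a plate of ~10,000 seeds.
eff, intact = estimate_transformation(resistant_seedlings=20, seeds_plated=10_000)
print(f"efficiency = {eff:.2%}; at least ~{intact:.0f} lines expected to carry an intact T-DNA")
```

The SacB PCR test described in Note 15 remains the way to decide which individual lines actually carry a whole T-DNA copy.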
References
1. Alonso JM, Ecker JR (2006) Moving forward in
reverse: genetic technologies to enable genome-wide phenomic screens in Arabidopsis. Nat Rev
Genet 7:524–536
2. Lee LY, Gelvin SB (2008) T-DNA binary vectors
and systems. Plant Physiol 146:325–332
3. Liu Y, Mitsukawa N, Vazquez-Tello A, Whittier
RF (1995) Generation of a high-quality P1
library of Arabidopsis suitable for chromosome
walking. Plant J 7:351–358
4. Chang Y-C, Henriquez XH, Preuss DP,
Copenhaver GC, Zhang HZ (2003) A plant-transformation-competent BIBAC library from
the Arabidopsis thaliana Landsberg ecotype for
functional and comparative genomics. Theor
Appl Genet 106:269–276
5. Liu YG, Shirano Y, Fukaki H, Yanai Y, Tasaka M,
Tabata S, Shibata D (1999) Complementation of
plant mutants with large genomic DNA fragments by a transformation-competent artificial
chromosome vector accelerates positional cloning. Proc Natl Acad Sci U S A 96:6535–6540
6. Zhou R, Benavente LM, Stepanova AN, Alonso
JM (2011) A recombineering-based gene tagging
system for Arabidopsis. Plant J 66:712–723
7. Farrand SK, O'Morchoe SP, McCutchan J (1989)
Construction of an Agrobacterium tumefaciens
C58 recA mutant. J Bacteriol 171:5314–5321
8. Sheng Y, Mancino V, Birren B (1995)
Transformation of Escherichia coli with large
DNA molecules by electroporation. Nucleic
Acids Res 23:1990–1996
Chapter 16
Global DNA Methylation Analysis Using Methyl-Sensitive
Amplification Polymorphism (MSAP)
Mahmoud W. Yaish, Mingsheng Peng, and Steven J. Rothstein
Abstract
DNA methylation is a crucial epigenetic process which helps control gene transcription activity in eukaryotes.
Information regarding the methylation status of a regulatory sequence of a particular gene provides important knowledge of this transcriptional control. DNA methylation can be detected using several methods,
including sodium bisulfite sequencing and restriction digestion using methylation-sensitive endonucleases.
Methyl-Sensitive Amplification Polymorphism (MSAP) is a technique used to study the global DNA methylation status of an organism and hence to distinguish between two individuals based on the DNA
methylation status determined by the differential digestion pattern. Therefore, this technique is a useful
method for DNA methylation mapping and positional cloning of differentially methylated genes. In this
technique, genomic DNA is first digested with a methylation-sensitive restriction enzyme such as HpaII,
and then the DNA fragments are ligated to adaptors in order to facilitate their amplification. Digestion
using a methylation-insensitive isoschizomer of HpaII, MspI, is used in a parallel digestion reaction as a
loading control in the experiment. Subsequently, these fragments are selectively amplified by fluorescently
labeled primers. PCR products from different individuals are compared, and once an interesting polymorphic locus is recognized, the desired DNA fragment can be isolated from a denaturing polyacrylamide gel,
sequenced and identified based on DNA sequence similarity to other sequences available in the database.
We will use analysis of met1, ddm1, and atmbd9 mutants and wild-type plants treated with a cytidine analogue, 5-azaC, or zebularine to demonstrate how to assess the genetic modulation of DNA methylation in
Arabidopsis. It should be noted that despite the fact that MSAP is a reliable technique used to fish for
polymorphic methylated loci, its power is limited to the restriction recognition sites of the enzymes used
in the genomic DNA digestion.
Key words DNA methylation, MSAP, Mutant lines, 5-azaC and zebularine
1
Introduction
DNA methylation is an important epigenetic modification which
usually takes place through the covalent attachment of a methyl
group to the ring carbon 5 of the cytosine (C) in DNA without
affecting the basic nucleotide sequence (Fig. 1). Methylated cytosines
that are followed by guanines (G) are annotated as CpG, in which C
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_16, © Springer Science+Business Media New York 2014
Fig. 1 Chemical structure of cytidine, 5-methylcytidine, 5-azaC, and zebularine
is linked to G by a phosphodiester bond (p), as distinct from the three
hydrogen bonds that pair C with G across the two strands of double-stranded DNA [1].
DNA methylation plays an important role in controlling gene expression in eukaryotes and it is typically associated with transcriptional
gene repression [2, 3]. Determination of DNA methylation at a particular locus provides important information on the gene expression
pattern and gives detailed knowledge on the regulatory sequence for
that gene. DNA methylation level in plants changes during different
processes of plant growth and development and also when plants are
exposed to biotic and abiotic stresses [4]. While some of these changes
are transient, others are heritable through a process called transgenerational memory [5–7].
The DNA methylation pattern in Arabidopsis can be genetically manipulated via mutations in genes that maintain and/or are
involved in establishing de novo DNA methylation. These include
mutations in the DNA methyltransferase MET1 gene, the chromatin remodeling factor gene DDM1 (Decrease in DNA Methylation
1), and methylcytosine-binding protein 9 (AtMBD9) which all lead
to a significant alteration in genome-wide DNA methylation levels
and consequently to the reactivation of transcriptionally silent
genes and transposable elements [8, 9].
The DNA methylation pattern in the Arabidopsis genome can also
be manipulated chemically. When chemical analogues of cytosine
are incorporated into genomic DNA during replication, they
inhibit catalytic activity of DNA methyltransferases by covalently
binding to their active sites which leads to a general reduction in
the DNA methylation level [10]. In plants, the most commonly
used cytidine analogue is 5-azacytidine (5-azaC), in which the ring
carbon 5 is replaced by nitrogen [10]. The chemical structure of
cytidine, 5-methylcytidine, 5-azaC, and zebularine is illustrated in
Fig. 1. 5-azaC induces hypomethylation and genome-wide transcriptional reactivation of silent genes and thus modifies plant
growth and development [11–13]. Zebularine is also a cytidine
analogue and inhibits DNA methylation in a similar way to 5-azaC.
Compared to 5-azaC, zebularine is more stable and less toxic
although the demethylation effect of zebularine is transient [14].
DNA methylation can be detected using sodium bisulfite
sequencing, with proteins that have an affinity for the methyl group,
with anti-methylcytosine antibodies, and using methylation-sensitive
restriction endonucleases [15]. The global DNA methylation
level can be detected using Southern blot hybridization. In this
technique, genomic DNA is digested with a methylation-sensitive
endonuclease such as HpaII and with its methylation-insensitive
isoschizomer MspI. The resulting DNA fragments are then probed
with an abundant sequence in the genome such as the
120 bp 5S ribosomal RNA repeat. Global DNA methylation can
also be studied using methyl-sensitive amplification polymorphism
(MSAP). In this technique, genomic DNA from different samples
is digested with the methylation-sensitive endonuclease HpaII,
and adaptors are ligated to this DNA, followed by fragment amplification using PCR with specific primers (Fig. 2). Qualitative and
quantitative differences in the amplification indicate variation in
the global DNA methylation pattern, and the significant variation
of methylation from site to site as well as from tissue to tissue can
be studied. In this chapter, we describe the experimental protocol
used to measure global DNA methylation in
the met1, ddm1, and atmbd9 mutants as well as in
wild-type Arabidopsis plants after treatment with 5-azaC or zebularine using the MSAP technique.
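The logic of comparing HpaII and MspI digests can be illustrated with a small in silico digestion. This is only a sketch under stated assumptions: the recognition sites used are the standard ones (EcoRI cuts G^AATTC; HpaII and MspI both cut C^CGG), MspI is treated as fully methylation-insensitive as in this chapter, and the toy sequence, the function names, and the position of the methylated site are hypothetical.

```python
import re

ECORI_SITE = "GAATTC"  # EcoRI cuts G^AATTC
CCGG_SITE = "CCGG"     # HpaII and MspI both cut C^CGG


def cut_positions(seq, methylated_ccgg=frozenset(), enzyme="HpaII"):
    """Return sorted cut coordinates for EcoRI combined with HpaII or MspI.

    methylated_ccgg holds the 0-based start index of each methylated CCGG site.
    Simplification: HpaII does not cut a methylated site, MspI always cuts.
    """
    cuts = [m.start() + 1 for m in re.finditer(ECORI_SITE, seq)]
    for m in re.finditer(CCGG_SITE, seq):
        if enzyme == "HpaII" and m.start() in methylated_ccgg:
            continue  # methylated site is protected from HpaII
        cuts.append(m.start() + 1)
    return sorted(cuts)


def fragment_lengths(seq, **kwargs):
    """Return the lengths of the fragments produced by the double digest."""
    bounds = [0] + cut_positions(seq, **kwargs) + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical 69-bp sequence with one EcoRI site and two CCGG sites.
toy = "A" * 10 + "GAATTC" + "T" * 20 + "CCGG" + "A" * 15 + "CCGG" + "T" * 10
print(fragment_lengths(toy, enzyme="MspI"))
print(fragment_lengths(toy, enzyme="HpaII", methylated_ccgg={55}))
```

When the second CCGG site (index 55) is methylated, the HpaII digest loses one cut and two MspI fragments merge into a single longer one. A difference of this kind is what appears as a polymorphic peak after selective amplification.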
This MSAP technique can be used to identify differentially
methylated genomic regions within and between populations of
plants of different genetic backgrounds as well as in plants grown
under different environmental conditions. In addition, this technique can be used for epigenetic mapping and positional cloning of
target genes. MSAP is described here according to the previously
published strategies and protocols designed for the amplified fragment length polymorphism (AFLP) technique [16] and modified for
MSAP by Beaulieu et al. [17] and Madlung et al. [18]. Although
MSAP is a reliable and easy to use technique, methods based on
methylation-sensitive digestion limit the detection of methylation
to the restriction sites of the endonuclease enzymes used.
2
Materials
2.1 Treatment of Arabidopsis Seeds with 5-azaC and Zebularine
1. Arabidopsis seeds: The seeds of Arabidopsis thaliana ecotype
Columbia (Col) wild-type and met1, ddm1, atmbd9 mutants
can be obtained from the Arabidopsis Stock Center (TAIR;
www.arabidopsis.org).
2. Sterilization solution (5 % sodium hypochlorite, 0.05 %
Tween-20).
3. Ethanol 75 %.
4. 1 mm Whatman filter papers.
Fig. 2 Schematic representation of the MSAP technique. DNA is first digested with methylation-sensitive
(HpaII) and methylation-insensitive (MspI) endonucleases, and the resulting DNA fragments are ligated to
specific adaptors. The ligated DNA fragments are then used as templates in a preselective PCR reaction with specific primers. The resulting PCR products serve as the DNA template in a selective PCR reaction
with fluorescently labeled primers (asterisk) that carry three selective nucleotides. The selective PCR products are
loaded into an ABI Prism 310 Genetic Analyzer. Bands are scored as present or absent
5. DNA demethylation chemicals: 5-azaC and zebularine are
available from Sigma. Prepare fresh 0.5 mM 5-azaC aqueous
solution for each treatment. Never use stored 5-azaC solution
(see Note 1). Prepare 40 mM zebularine stock solution in
sterile distilled water and store at −20 °C (see Note 2).
6. Zebularine treatment medium: solid 0.5× MS
medium [19], 1 % sucrose, 1 % agar, 40 μM zebularine in Petri
dishes. Control medium is solid 0.5× MS medium without
zebularine.
7. Pots containing a mixture of universal substrate and vermiculite
(3:1 v/v).
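The dilution of the 40 mM zebularine stock to 40 μM in the treatment medium follows the usual C1·V1 = C2·V2 relation. The sketch below shows the calculation; the function name and the 500 mL batch size are hypothetical and chosen only for illustration.

```python
def stock_volume_ul(stock_mM, final_uM, final_volume_mL):
    """Volume of stock solution (in µL) needed so that C1 * V1 = C2 * V2."""
    stock_uM = stock_mM * 1000.0           # mM -> µM
    final_volume_ul = final_volume_mL * 1000.0  # mL -> µL
    return final_uM * final_volume_ul / stock_uM


# Hypothetical batch: 500 mL of 0.5x MS medium at 40 µM zebularine from a 40 mM stock.
print(stock_volume_ul(stock_mM=40, final_uM=40, final_volume_mL=500), "µL of stock")
```

A 40 mM stock used at 40 μM is a 1:1,000 dilution, so each milliliter of medium takes 1 μL of stock.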
2.2 Genomic DNA Extraction and Purification
1. Liquid nitrogen.
2. Mortar and pestle.
3. DNA extraction buffer (150 mM Tris–HCl, pH 8.0, 15 mM
EDTA (ethylenediaminetetraacetic acid), 1.0 M NaCl,
0.16 % (w/v) CTAB (cetyltrimethylammonium bromide),
20 μL/L 2-mercaptoethanol, and 0.1 % (w/v) PVP
(polyvinylpyrrolidone)).
4. Phenol/chloroform/isoamyl alcohol (PCIM, 25:24:1, v/v/v),
stored at 4 °C.
5. 100 % Isopropanol.
6. 75 % Ethanol.
7. Tris EDTA (TE) buffer (10 mM Tris–Cl, pH 7.5, 1 mM
EDTA).
8. Sodium acetate 3 M (pH 5.2).
9. QIAGEN DNeasy Plant Maxi Kit (Catalogue number 68163).
10. NanoDrop spectrophotometer.
11. Agarose gel electrophoresis unit.
2.3 Methyl-Sensitive Amplification Polymorphism (MSAP)
1. Restriction enzymes and their buffers (EcoRI, HpaII and
MspI).
2. T4 DNA ligase and ligase buffer.
3. Adapters: EcoRI adapter (5′-CTCGTAGACTGCGTACC-3′
and 5′-AATTGGTACGCAGTCTAC-3′). HpaII-MspI adapter
(5′-GATCATGAGTCCTGCT-3′ and 5′-CGAGCAGGACTCATGA-3′).
Oligonucleotides should be HPLC purified and synthesized at
0.2 μmol scale.
4. Oligonucleotide primers: preselective EcoRI oligonucleotide
primer (5′-GACTGCGTACCAATTC-3′), preselective HpaII-MspI
oligonucleotide primer (5′-ATCATGAGTCCTGCTCGG-3′),
selective EcoRI oligonucleotide primer
(5′-GACTGCGTACCAATTC-AAC, ACC, ACA or AAG-3′)
(Applied Biosystems) (see Note 3), and HpaII-MspI selective
oligonucleotide primers (5′-ATCATGAGTCCTGCTCGGTCAA-3′
and 5′-ATCATGAGTCCTGCTCGGTCCA-3′).
Primers should be HPLC purified and synthesized at 0.2 μmol
scale.
5. Taq DNA polymerase, buffer, and dNTPs.
6. A thermocycler such as the Perkin-Elmer GeneAmp PCR System
9700.
7. Mini agarose gel electrophoresis unit.
8. 1× Tris Borate EDTA (TBE) buffer (89 mM Tris base, 89 mM
boric acid, 2 mM EDTA).
9. GeneRuler 100 bp DNA ladder (Fermentas, catalogue number
SM0241).
10. 6× DNA loading dye (Fermentas, catalogue number R0611).
11. GeneScan-500 [ROX] internal size standard (Applied Biosystems,
catalogue number 401734).
12. Deionized formamide (Applied Biosystems, Catalogue number
400596).
13. ABI Prism 310 Genetic Analyzer (Applied Biosystems).
14. ABI Prism GeneScan 3.1 software.
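The selective primers listed above are the preselective primers extended at the 3′ end by a few selective nucleotides. The sketch below rebuilds them from the sequences given in this section and enumerates the possible primer pairs. The variable and function names are hypothetical, and the chapter does not state that every combination must be run; the enumeration only shows what is available.

```python
from itertools import product

# Sequences taken from the Materials list (5' to 3').
ECORI_PRESELECTIVE = "GACTGCGTACCAATTC"
HPAII_MSPI_PRESELECTIVE = "ATCATGAGTCCTGCTCGG"
ECORI_EXTENSIONS = ["AAC", "ACC", "ACA", "AAG"]   # labeled EcoRI selective primers
HPAII_MSPI_EXTENSIONS = ["TCAA", "TCCA"]          # HpaII-MspI selective primers


def selective_primer_pairs():
    """Return every (EcoRI selective, HpaII-MspI selective) primer pair."""
    ecori = [ECORI_PRESELECTIVE + ext for ext in ECORI_EXTENSIONS]
    hpaii_mspi = [HPAII_MSPI_PRESELECTIVE + ext for ext in HPAII_MSPI_EXTENSIONS]
    return list(product(ecori, hpaii_mspi))


for ecori_primer, hm_primer in selective_primer_pairs():
    print(ecori_primer, hm_primer)
```

Four EcoRI extensions and two HpaII-MspI extensions give eight possible primer pairs.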
2.4 Identification of the Polymorphic DNA
1. Selective EcoRI oligonucleotide primers end labeled with
radioisotope ([32P]ATP, end-labeling grade, from ICN
Radiochemicals, Solon, OH, USA).
2. 40 % Acrylamide solution (37.5:1 acrylamide-bis-acrylamide
solution) can be obtained from Bio-Rad Life Science.
3. 1 M Tris–HCl buffer (pH 8.0) can be obtained from Sigma
Aldrich.
4. 10 % ammonium persulfate (100 mg/mL) can be obtained
from Bio-Rad Life Science.
5. TEMED (N,N,N′,N′-tetramethylethylenediamine) can be
obtained from Bio-Rad Life Science.
6. TBE buffer (89 mM Tris base, 89 mM boric acid, 2 mM
EDTA).
7. 10 % Acetic acid.
8. Formamide loading dye: formamide dye (98 % formamide,
10 mM EDTA pH 8.0) and bromophenol blue and xylene
cyanol as tracking dyes.
9. Power supply.
10. Fuji BAS-2000 phosphoimage analysis system (Fuji Photo
Film Company Ltd, Japan).
11. QIAEX II Gel Extraction Kit (QIAGEN, Catalogue number
20021).
12. QIAquick PCR Purification Kit (QIAGEN, Catalogue number
28104).
13. Sequi-Gen 38 cm × 50 cm gel apparatus (Bio-Rad Laboratories
Inc., Hercules, CA, USA).
3
Methods
As a general precaution, in order to obtain a constant temperature
and time accuracy during the experiments, a thermocycler machine
should be used in the incubation steps during the digestion and
ligation.
3.1 Surface Sterilization of Arabidopsis Seeds
1. Suspend 100 mg seeds of Col or a mutant line in 1 mL 75 %
ethanol in an Eppendorf tube for 5 min.
2. Remove the ethanol solution and wash the seeds two times
with sterile distilled water.
3. Suspend the seeds in the sterilization solution for 5 min in an
Eppendorf tube with frequent mixing.
4. Remove sterilization solution from the tube, and wash the
seeds with sterile distilled water six times.
5. Stratify the surface-sterilized seeds by storing them in the dark
at 4 °C for 2 days. The seeds are ready for demethylation
treatment by 5-azaC and zebularine.
3.2 Treating Arabidopsis Seeds with 5-azaC
1. Wet 1 mm Whatman filter papers with 0.5 mM 5-azaC aqueous solution (2 mL/paper) or with sterilized distilled water as
the control.
2. Place the wetted filter papers in Petri dishes.
3. Sow the surface-sterilized Col seeds on the filter papers, and
seal the Petri dishes with parafilm to maintain humidity.
4. Allow the seeds to germinate by placing the Petri dishes in the
dark at 4 °C for 6 days.
5. Plant the seedlings in pots filled with universal substrate and
vermiculite under the following growth conditions: 24 °C
(day)/20 °C (night), 16 h light/8 h dark, 200 μE light intensity, and 60 % humidity.
6. Record plant growth and development phenotype with and
without treatment.
3.3 Treating Arabidopsis Seeds with Zebularine
1. Sow surface-sterilized Col seeds on zebularine treatment
medium and control medium, respectively.
2. Incubate the seeds under the following environmental condition: 24 °C (day)/20 °C (night), 16 h light/8 h dark, 200 μE
light intensity.
3. At 14 days after seed germination, transfer the growing
seedlings to containers of freshly prepared medium (see Note 4).
4. Record plant growth and development phenotype with and
without treatment.
3.4 Genomic DNA Isolation
1. Collect rosette leaves from ten plants of each treatment, mutant
line, and wild-type Arabidopsis plants before flowering, freeze
in liquid nitrogen, mill to powder, and store at −80 °C
for DNA extraction and genome-wide analysis of DNA
methylation.
2. Add 3 mL of DNA extraction buffer to a 50-mL polypropylene
tube for each 1 g of finely ground tissue.
3. Incubate at 65 °C in a water bath for 45 min with frequent
shaking then allow the extract to cool down to room
temperature.
4. Extract the homogenate with phenol/chloroform/isoamyl
alcohol (25:24:1).
5. Centrifuge at 10,000 × g for 10 min at room temperature and
transfer the aqueous layer to a new tube.
6. Extract again with chloroform/isoamyl alcohol (24:1).
7. Centrifuge at 10,000 × g for 10 min at room temperature.
8. Transfer the aqueous layer to a new tube and precipitate the
nucleic acids in the aqueous phase by adding 10 % volume of
3 M sodium acetate (pH 5.2) and 60 % volume of cold isopropanol, then incubate for 2 h at −80 °C.
9. Centrifuge at 10,000 × g for 30 min at 4 °C and wash the
nucleic acids pellet with 1 mL of cold 75 % ethanol.
10. Dissolve the pellet in 300 μL of TE buffer.
11. Purify the extracted DNA from contaminants and enzyme
inhibitors using the QIAGEN DNeasy Plant Maxi Kit following the manufacturer’s instructions.
12. Determine the quantity and the quality of the DNA using the
NanoDrop spectrophotometer and run 10 μL in a 1 % agarose gel
(see Note 5).
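The quality check in step 12 and the 100 ng input used for the digestion can be expressed as a short calculation. This is a minimal sketch: it assumes the standard conversion of about 50 ng/μL of double-stranded DNA per A260 unit at a 1 cm path length (the NanoDrop software normally reports the concentration directly), it uses the 1.8–2.0 acceptance window from Note 5, and the function names and absorbance readings are hypothetical.

```python
def assess_dna(a260, a280, dilution_factor=1.0):
    """Return (concentration in ng/µL, A260/A280 ratio, purity_ok).

    Assumes 1 A260 unit corresponds to ~50 ng/µL double-stranded DNA (1 cm path).
    purity_ok applies the 1.8-2.0 window given in Note 5.
    """
    if a280 <= 0:
        raise ValueError("a280 must be positive")
    concentration = a260 * 50.0 * dilution_factor
    ratio = a260 / a280
    return concentration, ratio, 1.8 <= ratio <= 2.0


def volume_for_mass_ul(mass_ng, conc_ng_per_ul):
    """Volume (µL) that contains the requested mass of DNA."""
    return mass_ng / conc_ng_per_ul


# Hypothetical readings for one sample.
conc, ratio, ok = assess_dna(a260=1.0, a280=0.5)
print(conc, ratio, ok)
print(volume_for_mass_ul(100, conc), "µL for a 100 ng digestion")
```

A sample that fails the ratio check, or that shows a smear on the agarose gel, should be re-purified or re-extracted before MSAP.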
3.5 Methyl-Sensitive Amplification Polymorphism (MSAP)
3.5.1 DNA Digestion, Adaptor Ligation, Preselective, and Selective PCR Amplification
1. Digest genomic DNA (100 ng) of ten individual Arabidopsis
plants per treatment using 4 U each of EcoRI and either
methylation-sensitive HpaII or methylation-insensitive MspI
in a final volume of 10 μL using the thermocycler as an incubator for the reaction.
2. When the incubation time is finished, deactivate the digestion
enzymes by heating the reaction at 80 °C for 10 min.
3. Anneal the complementary oligonucleotides of the EcoRI adapter
and of the HpaII-MspI adapter in two separate
tubes: combine 20 μL (30 pmol) of each complementary
oligonucleotide in a 100 μL PCR tube, heat to 72 °C for 10 min,
and then allow the reaction to cool to room temperature (see Note 6).
4. Ligate the digested genomic DNA fragments (10 μL) to the
two adapters by adding ligation mixture (2 μL of 1.5 pmol of
EcoRI adapter, 2 μL of 15 pmol of HpaII-MspI adapter, 4 U
of T4 DNA ligase, 1× ligase buffer) in a total volume of 25 μL
and incubate overnight at 18 °C.
5. Subsequently, dilute the ligation reaction four times using
H2O Milli-Q.
6. Use 3 μL of the diluted ligation reaction, 10 pmol of preselective EcoRI and HpaII-MspI primers, 0.2 mM of dNTPs, and
0.5 U of Taq DNA polymerase. Set the thermocycler using the
following conditions: 94 °C, 30 s; 56 °C, 1 min; 72 °C, 1 min
for 20 cycles of amplification.
7. Check the size of the amplified fragments by running 10 μL of
the PCR products on a 1.5 % agarose gel in
1× TAE buffer at 4 V/cm for 3–4 h (see Note 7).
8. Stain with ethidium bromide (see Note 8).
9. View the gel on a UV transilluminator (see Note 9).
10. Dilute 10 μL of the PCR products ten times with H2O Milli-Q
and use the dilution as a template for the selective amplification.
11. Use 3 μL of the diluted PCR products, 0.5 pmol of one labeled
EcoRI selective primer, 10 pmol of one HpaII/MspI
selective primer, 0.04 mM dNTPs, and 0.5 U Taq polymerase
in an 11 μL PCR reaction. Run a touchdown program on the
thermocycler: 94 °C for 2 min, then 20
cycles of 94 °C for 20 s, 66 °C for 30 s, and 72 °C for 2 min. During
the first ten cycles the annealing temperature follows the
touchdown scheme, falling by 1 °C per cycle. At the
end of these cycles, hold the reaction at 60 °C for 30 min
to complete extension.
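The annealing schedule of the selective PCR in step 11 can be written out cycle by cycle. The sketch below assumes one reading of the protocol: the annealing temperature starts at 66 °C, drops by 1 °C in each of the first ten cycles, and is then held at the value reached (56 °C) for the remaining cycles. The text does not state the temperature of the later cycles explicitly, so that part is an assumption, and the function name is hypothetical.

```python
def touchdown_annealing(start_c=66.0, step_c=1.0, touchdown_cycles=10, total_cycles=20):
    """Return the annealing temperature (°C) for each cycle of a touchdown PCR.

    Assumption: after the touchdown phase the temperature stays at the value
    reached by the last decrement.
    """
    temperatures = []
    for cycle in range(total_cycles):
        drop = min(cycle, touchdown_cycles) * step_c
        temperatures.append(start_c - drop)
    return temperatures


for number, temp in enumerate(touchdown_annealing(), start=1):
    print(f"cycle {number:2d}: anneal at {temp:.0f} °C")
```

Printing the schedule before programming the thermocycler makes it easy to confirm that the block follows the intended profile.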
3.5.2 Separating the PCR Products of Selective Amplifications by Capillary Electrophoresis on an ABI Prism 310 Genetic Analyzer
The ABI Prism 310 Genetic Analyzer detects the fluorescence because the EcoRI site-specific primers are labeled with yellow
(NED), blue (FAM), or green (JOE) fluorescent dyes. Each selective primer can be labeled with one of the three fluorescent colors so that
three different reactions can be loaded together. An internal size
marker, GeneScan-500 [ROX] (35–500 bp), labeled with a red (ROX)
dye, should be added in order to determine the size of the separated
fragments.
1. Prepare a loading buffer for each sample by mixing 24.0 μL of
deionized formamide and 1.0 μL of GeneScan-500 [ROX]
size standard.
Fig. 3 A sample chromatogram of results obtained from the ABI Prism 310 Genetic Analyzer. The
horizontal scale represents the size of the fragments, and the vertical scale represents the quantity of the amplicon. Each peak represents an amplicon (a fragment of DNA produced during the selective PCR
amplification) (a–c). Differentially amplified and polymorphic peaks are indicated by arrows. Smaller peaks
indicate the presence of a heteromorphic allele in terms of DNA methylation status (b). Absence of the peaks
indicated by arrows may represent genetic modulation of DNA methylation in Arabidopsis (c)
2. Add 25 μL of the loading buffer mix to a genetic analyzer
sample tube. Use one tube for each sample.
3. Add 2 μL of the selective amplified PCR products to the tube.
4. Heat the tubes to 95 °C for 3 min using a thermocycler
machine.
5. Then, snap chill the tubes on ice (see Note 10).
6. Using the ABI Prism 310 Genetic Analyzer machine, inject
each sample for 12 s, at 15 kV, and use 15 kV as running
voltage for 26 min (see Note 11).
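When many samples are run, the loading buffer of step 1 is conveniently prepared as a master mix. The sketch below scales the per-sample volumes (24.0 μL deionized formamide and 1.0 μL GeneScan-500 [ROX]); the function name and the 10 % pipetting overage are assumptions and are not specified in the protocol.

```python
def loading_mix(n_samples, overage=0.1):
    """Return master-mix volumes (µL) for n samples plus a pipetting overage.

    Per sample: 24.0 µL deionized formamide + 1.0 µL GeneScan-500 [ROX].
    The default 10 % overage is an assumption, not part of the protocol.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    scale = n_samples * (1 + overage)
    return {
        "formamide_ul": round(24.0 * scale, 1),
        "rox_ul": round(1.0 * scale, 1),
    }


print(loading_mix(10))             # ten samples with 10 % overage
print(loading_mix(10, overage=0))  # exact volumes, no overage
```

Each sample tube then receives 25 μL of this mix followed by 2 μL of the selective PCR products, as in steps 2 and 3.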
3.5.3 Data Analysis
Genomic DNA of ten individual plants (ten replicates) is usually
treated and screened for each Arabidopsis genetic line and treatment. Deviations from the wild-type DNA methylation pattern
can be confirmed using these replicates; they appear as the presence or absence of a particular polymorphic DNA fragment (amplicon)
in every treatment using the same primer pair in the selective PCR
amplification. Quantitative amplification can indicate the presence
of a heteromorphic allele in terms of DNA methylation (Fig. 3).
Selectively amplified DNA fragment data can be collected by the
ABI Prism 310 and analyzed using the ABI Prism GeneScan 3.1
software which will size and quantify the detected fragments. The
same software can be used to compare the graphical representations of amplified fragments from all individual plants. A peak size
between 60 and 500 bp should be selected to study the polymorphic DNA fragments (peaks) between the two genetic lines (Fig. 3).
MSAP products can be scored as present (1) or absent (0) on the
chromatogram to create a binary matrix. The proportion of polymorphic peaks can be estimated as the ratio of the number of
polymorphic peaks to the total number of peaks. These data can be
treated and arranged depending on the purpose of the study.
Partial methylation, due to differences in methylation status
between copies of the same locus, results in changes in product
intensity between genotypes and 5-azaC- and zebularine-treated
plants.
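The scoring described above can be sketched in a few lines. The example assumes that peak sizes have already been exported from the GeneScan software and rounded to whole base pairs so that the same fragment has the same size in every sample; the function names, sample names, and peak sizes are hypothetical.

```python
def binary_matrix(peaks_by_sample, min_bp=60, max_bp=500):
    """Score peaks as present (1) or absent (0) for every sample.

    peaks_by_sample maps a sample name to the fragment sizes (bp) detected in it.
    Only sizes within min_bp..max_bp are scored.
    Returns (sorted sizes, {sample: [0/1 for each size]}).
    """
    peak_sets = {name: set(peaks) for name, peaks in peaks_by_sample.items()}
    sizes = sorted(
        size
        for size in set().union(*peak_sets.values())
        if min_bp <= size <= max_bp
    )
    matrix = {
        name: [1 if size in found else 0 for size in sizes]
        for name, found in peak_sets.items()
    }
    return sizes, matrix


def polymorphic_proportion(matrix):
    """Fraction of scored peaks that are not shared by all samples."""
    columns = list(zip(*matrix.values()))
    if not columns:
        return 0.0
    polymorphic = sum(1 for column in columns if len(set(column)) > 1)
    return polymorphic / len(columns)


# Hypothetical peak sizes for two samples; the 45-bp peak falls outside the window.
peaks = {"WT": [45, 75, 120, 310], "met1": [75, 310, 450]}
sizes, matrix = binary_matrix(peaks)
print(sizes)
print(matrix)
print(polymorphic_proportion(matrix))
```

With replicate plants, the same matrix can be built per individual so that only peaks reproducibly present or absent across the replicates of a line are counted as polymorphic.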
Once an interesting peak is identified based on the polymorphic
pattern in the chromatogram, the DNA fragment corresponding to
that peak can be amplified using the same primer pair, isolated
by running the selective PCR products in a vertical denaturing 5 % polyacrylamide gel, and sequenced.
3.6 Identification of the Polymorphic DNA Fragment
1. Perform the selective PCR as mentioned above using the
preselective PCR products as a DNA template and the suitable
selective 32P end-labeled EcoRI primer. Run the PCR
using the thermocycler and the same conditions as mentioned
above (see Note 12).
2. Prepare a denaturing 5 % acrylamide gel by mixing 12.5 mL of
40 % acrylamide-bis solution, 7.5 M urea in 50 mM TBE,
500 μL of 10 % ammonium persulfate, and 100 μL of TEMED
(see Note 13).
3. Cast the solution in a Sequi-Gen 38 cm × 50 cm gel apparatus
and allow the gel to solidify for 4 h.
4. Denature PCR samples by mixing 20 μL of formamide loading
dye with an equal volume of PCR sample, heat at 90 °C for
3 min, and then quickly chill on ice for at least 2 min.
5. Rinse unpolymerized polyacrylamide and urea out of the gel
wells, then load an equal amount of every sample into the wells.
6. Run the gel electrophoresis using TBE buffer at constant
power, 110 W, for 2 h.
7. Fix the DNA in the gel for 30 min in 10 % acetic acid, dry it
on the glass plates, and expose it to Fuji phosphoimage screens
for 16 h. Fingerprint patterns can be visualized using a Fuji
BAS-2000 phosphoimage analysis system.
8. Isolate the polymorphic DNA by cutting the band from the gel.
9. Rehydrate the band by boiling in 100 μL H2O Milli-Q for
5 min.
10. Clean up the DNA fragment from the gel impurities using the
QIAEX II Gel Extraction Kit.
11. Use the purified fragment as a template for a PCR reaction
containing 5.0 μL of the eluted DNA, 10.0 pmol of selective
EcoRI primer, 10.0 pmol of selective HpaII/MspI primer, PCR buffer containing MgCl2, 2.5 mM dNTPs, and 1.0 U Taq polymerase. Use the
same PCR cycling conditions as described above for the selective
PCR reaction (see Note 14).
12. Purify the PCR reaction using QIAquick PCR Purification Kit
following the manufacturer’s instructions.
13. Sequence the PCR products by using the selective EcoRI
primer and the routine sequencing reaction and conditions.
3.6.1 Data Analysis
In order to identify the differentially methylated DNA fragments,
information obtained from the sequencing reaction can be used in
a BLAST search against the National Center for Biotechnology
Information (NCBI) databases to find similar sequences. The BLAST website is
available at http://blast.ncbi.nlm.nih.gov/Blast.cgi. A gene can
be identified based on the similarity between the obtained sequence
and a sequence in the database.
4
Notes
1. 5-azaC is white crystalline powder and soluble (50 mM) in
water. However, 5-azaC is unstable in aqueous solution and
sensitive to light and oxidation. Therefore, storing 5-azaC is
not recommended. Treatment of Arabidopsis seeds should use
freshly prepared 5-azaC solution kept in the dark and at low
temperature.
2. Zebularine is an off-white solid and soluble (100 mM) in
water. A zebularine aqueous solution is stable for up to
3 months at −20 °C.
3. The EcoRI site-specific primers can be labeled with yellow
(NED), blue (FAM), or green (JOE) fluorescent dyes to allow
one to load three different reactions simultaneously.
4. The demethylation effect of zebularine is transient. Growing
Arabidopsis seedlings on zebularine treatment medium and
then transferring them to control medium can be used to show
that zebularine reduces Arabidopsis genomic DNA
methylation only transiently.
5. The DNA concentration can be measured using a NanoDrop
spectrophotometer adjusted to a wavelength of 260 nm. The
purity of the DNA is determined by measuring the absorbance
ratio 260/280 nm. A good quality DNA should have a ratio
between 1.8 and 2.0. Good quality DNA appears in the agarose
gel stained with ethidium bromide as a high molecular weight
sharp single DNA band. Bad quality DNA appears as several
DNA bands or a smear in the same gel. Smears in the gel indicate the presence of low molecular weight DNA resulting from
degradation during DNA extraction. Such DNA is not suitable for
MSAP analysis.
6. DNA digestions and adapter ligations should be carried out
separately to avoid the formation of a long continuous DNA
molecule that contains multiple copies of the same DNA
sequences linked together in series (concatemers).
7. The ligation step is a very critical step of this protocol. Preamplified PCR products should appear as a smear with equal
intensities between samples using agarose gel electrophoresis
and ethidium bromide staining.
8. Ethidium bromide is a mutagen and is moderately
toxic. Handle it with extra caution: wear gloves, a lab
coat, and safety glasses when using this dye.
9. Good amplification products for MSAP should appear as a
smear of molecular weight between 100 and 1,500 bp in a
1.5 % agarose gel.
10. The genetic analyzer sample tubes can be placed in the 48- or
96-well sample tray.
11. To verify the reproducibility of each fragment, each MSAP
procedure should be repeated at least twice.
12. PCR labeling of the DNA fragment, excision of the DNA
fragment from the gel, and purification of the
radiolabeled PCR products should be carried out behind 3/8- or 1/2-inch-thick
glass or transparent acrylic plates.
13. Unpolymerized acrylamide and TEMED should be handled
carefully because they are widely considered to be neurotoxic and
reproductive toxicants, respectively.
14. Often, the eluted amount of DNA is not enough to be used in
the sequencing reactions, therefore PCR is used to amplify
and increase the original amount of eluted DNA.
References
1. Ehrlich M, Wang RY (1981) 5-Methylcytosine
in eukaryotic DNA. Science 212:1350–1357
2. Doerfler W (1983) DNA methylation and
gene activity. Annu Rev Biochem 52:93–124
3. Riggs AD, Jones PA (1983) 5-Methylcytosine,
gene regulation, and cancer. Adv Cancer Res
40:1–30
4. Yaish MW, Colasanti J, Rothstein SJ (2011)
The role of epigenetic processes in controlling
flowering time in plants exposed to stress.
J Exp Bot 62:3727–3735
5. Boyko A et al (2010) Transgenerational adaptation of Arabidopsis to stress requires DNA
methylation and the function of Dicer-like
proteins. PLoS One 5:e9514
6. Chan SW, Henderson IR, Jacobsen SE (2005)
Gardening the genome: DNA methylation in
Arabidopsis thaliana. Nat Rev Genet 6:351–360
7. Molinier J et al (2006) Transgeneration memory
of stress in plants. Nature 442:1046–1049
8. Bartee L, Bender J (2001) Two Arabidopsis
methylation-deficiency mutations confer only
partial effects on a methylated endogenous
gene family. Nucleic Acids Res 29:2127–2134
9. Singer T, Yordan C, Martienssen RA (2001)
Robertson’s mutator transposons in A. thaliana
are regulated by the chromatin-remodeling
gene decrease in DNA methylation (DDM1).
Genes Dev 15:591–602
10. Santi DV, Garrett CE, Barr PJ (1983) On the
mechanism of inhibition of DNA-cytosine
methyltransferases by cytosine analogs. Cell
33:9–10
11. Yaish MW, Peng M, Rothstein SJ (2009)
AtMBD9 modulates Arabidopsis development
through the dual epigenetic pathways of DNA
methylation and histone acetylation. Plant J
59:123–135
12. Borowska N, Idziak D, Hasterok R (2011)
DNA methylation patterns of Brachypodium
distachyon chromosomes and their alteration
by 5-azacytidine treatment. Chromosome Res
19:955–967
13. Castilho A et al (1999) 5-Methylcytosine
distribution and genome organization in
triticale before and after treatment with 5azacytidine. J Cell Sci 112(Pt 23):4397–4404
14. Cheng JC et al (2003) Inhibition of DNA
methylation and reactivation of silenced genes
by zebularine. J Natl Cancer Inst 95:399–409
15. Zilberman D, Henikoff S (2007) Genome-wide
analysis of DNA methylation patterns.
Development 134:3959–3965
16. Vos P et al (1995) AFLP: a new technique for
DNA fingerprinting. Nucleic Acids Res 23:
4407–4414
17. Beaulieu J, Jean M, Belzile F (2009) The allotetraploid Arabidopsis thaliana–Arabidopsis
lyrata subsp. petraea as an alternative model
system for the study of polyploidy in plants.
Mol Genet Genomics 281:421–435
18. Madlung A et al (2002) Remodeling of DNA
methylation and phenotypic and transcriptional
changes in synthetic Arabidopsis allotetraploids.
Plant Physiol 129:733–746
19. Murashige T, Skoog F (1962) A revised medium
for rapid growth and bio assays with tobacco
tissue cultures. Physiol Plant 15:473–497
Part IV
Molecular Biological Techniques
Chapter 17
Next-Generation Mapping of Genetic Mutations Using Bulk
Population Sequencing
Ryan S. Austin, Steven P. Chatfield, Darrell Desveaux,
and David S. Guttman
Abstract
Next-generation sequencing platforms have made it possible to very rapidly map genetic mutations in
Arabidopsis using whole-genome resequencing against pooled members of an F2 mapping population.
In the case of recessive mutations, all individuals expressing the phenotype will be homozygous for the
mutant genome at the locus responsible for the phenotype, while all other loci segregate roughly equally
for both parental lines due to recombination. Importantly, genomic regions flanking the recessive mutation will be in linkage disequilibrium and therefore also be homozygous due to genetic hitchhiking. This
information can be exploited to quickly and effectively identify the causal mutation. To this end, sequence
data generated from members of the pooled population exhibiting the mutant phenotype are first aligned
to the reference genome. Polymorphisms between the mutant and mapping line are then identified and
used to determine the homozygous, nonrecombinant region harboring the mutation. Polymorphisms in
the identified region are filtered to provide a short list of markers potentially responsible for the phenotype
of interest, which is followed by validation at the bench. Although the focus of recent studies has been on
the mapping of point mutations exhibiting recessive phenotypes, the techniques employed can be extended
to incorporate more complicated scenarios such as dominant mutations and those caused by insertions or
deletions in genomic sequence. This chapter describes detailed procedures for performing next-generation
mapping against an Arabidopsis mutant and discusses how different mutations might be approached.
Key words Mutagenesis, Genetic mapping, Positional/map-based cloning, Genome sequencing,
Next-generation genomics, Genome analysis
1
Introduction
The physical mapping of monogenic, qualitative traits has traditionally
been a laborious and time-consuming task due to the necessity of
breeding and phenotyping large populations of F2 plants and their
subsequent molecular scoring. The advent of next-generation
sequencing (NGS) technologies has dramatically reduced this effort
in a number of model systems, including Arabidopsis, by replacing the
scoring of molecular markers with whole-genome sequencing [1–8].
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_17, © Springer Science+Business Media New York 2014
To date, several groups have developed powerful NGS mapping
approaches for Arabidopsis, typically focused on identifying the position of recessive ethyl-methanesulfonate (EMS)-induced mutations
[1–3]. All of these methods employ an approach analogous to a bulk-segregant analysis [9]. Namely, they exploit the genetic principle that
when a line carrying a recessive mutation (the mutant line) is crossed
to a mapping line to form an F1, which is then selfed to form a population of F2 plants segregating for the recessive trait of interest, all
plants possessing the target phenotype will be homozygous at the
causative mutation [10]. Moreover, the causative mutation will be in
linkage disequilibrium with the surrounding genome due to genetic
hitchhiking, and consequently the mutation of interest will be embedded in a larger homozygous block of the mutant genome. The extent
of this disequilibrium, in terms of how far it is maintained as you move
out from the mutation of interest, will be determined by the amount
of recombination between the two parental lines, which is directly
related to the number of individual F2 lines examined. Consequently,
distant genomic regions and those on different chromosomes will be
segregating with approximately an equal mix of the two parental lines
[9]. NGS can be performed on a pool of F2 lines to identify nearly all
mutations that distinguish the mutant and mapping line. These
sequence data are typically mapped back onto a reference genome to
identify those genomic regions that carry SNPs diagnostic of both
parents versus SNPs that are unique to the mutant line. Since SNPs
are identified in a de novo manner from the sequence data, the causal
mutation can be found directly within the sequence results. This is of
course dependent upon a sufficient number of recombination events
surrounding the target locus, adequate sequence quality and a sufficient depth of coverage across the genome for calling SNPs with reasonable confidence. However, when these conditions are met,
mapping software and tools are able to quickly identify a short list of
candidate genes responsible for the phenotype [1–3].
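The expected signal in the bulked pool can be illustrated with a short calculation. The sketch below is not part of any of the published tools; it only assumes that every chromosome sampled from the phenotype-selected pool carries the mutant allele at the causal locus, so that a marker separated from that locus by a recombination fraction r retains the mutant-parent allele with probability 1 - r. The function name is hypothetical.

```python
def expected_mutant_allele_frequency(recombination_fraction: float) -> float:
    """Expected mutant-parent allele frequency at a marker in a pool of
    F2 plants selected for a recessive phenotype.

    Each chromosome in the pool carries the mutant allele at the causal
    locus. A linked marker keeps the mutant-parent allele unless a
    recombination event separated it from the causal locus, which
    happens with probability r (the recombination fraction).
    """
    if not 0.0 <= recombination_fraction <= 0.5:
        raise ValueError("recombination fraction must lie between 0 and 0.5")
    return 1.0 - recombination_fraction


# r = 0 corresponds to the causal locus itself, r = 0.5 to an unlinked marker.
for r in (0.0, 0.01, 0.1, 0.5):
    freq = expected_mutant_allele_frequency(r)
    print(f"r = {r:<4}  expected mutant allele frequency = {freq:.2f}")
```

Under these assumptions the frequency falls from 1.0 at the causal locus to 0.5 at unlinked markers, which is the gradient that the mapping tools look for.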
Although several approaches and tools for performing
mapping by NGS in Arabidopsis have been made available [1–3],
this protocol will focus on the “next-generation mapping” (NGM)
implementation [2] (http://bar.utoronto.ca/ngm). This method
classifies SNP allelic frequencies as arising either from the homozygous (mutant) or heterozygous (mutant and mapping) backgrounds using a purity statistic and applies a technique based on
kernel density estimation to refine the region of interest [2].
A user-friendly, web-based interface allows the researcher to
dynamically explore their mapping result and requires only a file
detailing SNPs present within the bulked population. This file is
generated from sequencing the F2 bulk population on any suitable
NGS platform, aligning reads to the reference genome and calling
SNPs using freely available public-domain software.
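The idea of sorting SNPs into homozygous and heterozygous classes can be sketched in a few lines. The example below is a deliberate simplification and is not the purity statistic used by NGM [2]: it assumes that only the number of reads supporting the non-reference base and the total read depth are available, and the tolerance of 0.15 is an arbitrary illustrative choice. All names are hypothetical.

```python
def classify_snp(alt_reads: int, total_reads: int, tolerance: float = 0.15) -> str:
    """Classify a SNP by the fraction of reads supporting the
    non-reference base.

    Fractions near 1.0 are expected inside the nonrecombinant (mutant)
    block; fractions near 0.5 are expected where both parental genomes
    segregate. Anything else is reported as ambiguous.
    """
    if total_reads <= 0 or not 0 <= alt_reads <= total_reads:
        raise ValueError("read counts are inconsistent")
    fraction = alt_reads / total_reads
    if fraction >= 1.0 - tolerance:
        return "homozygous"
    if abs(fraction - 0.5) <= tolerance:
        return "heterozygous"
    return "ambiguous"


# Illustrative read counts only; these are not real data.
for alt, depth in ((20, 20), (11, 20), (15, 20)):
    print(alt, depth, classify_snp(alt, depth))
```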
Although mapping-by-NGS applications in Arabidopsis to
date have mostly focused on recessive point mutations generated
from EMS screens in a Columbia reference background, the
method can be extended to map mutations arising from different
ecotype backgrounds [3], indels, suppressor/enhancer screens, or
those having dominant phenotypes. In the case of dominant mutations, mapping is possible by carrying individual F2 lines through
to F3. Frozen F2 tissue can then be bulked based on whether F3
progeny no longer segregate for the background phenotype. This
will ensure that all F2 are homozygous at the dominant locus and
mapping proceeds in the same manner as for a recessive trait.
Similarly, indels can be approached in a manner analogous to SNP
identification. Called SNPs, generated as de novo markers, would
still be used to perform the virtual bulk segregation and first identify the nonrecombinant region. A candidate list of genes could
then be created by pulling indels from within that region using
NGS software and filtering them based on their effect on coding
sequence and level of homogeneity among all reads mapped at that
locus. In more complicated mappings, such as suppressor screens
that produce a recessive phenotype, the mapping can still be
approached as a typical recessive mapping. This should produce an
expected result of more than one region of nonrecombination in
the genome. However, if the mutation has linkage with or lies on
the same chromosome as the background target locus or the nature
of the screen itself is inherently complex, producing epistatic effects,
for example, then mappings may be very difficult or unsuccessful.
Certainly, as NGS mapping continues to develop and eventually
replace traditional mapping, the tools and techniques available will
accommodate increasingly complicated mapping scenarios.
2
Materials
2.1 Tissue
Generation
1. An M2 EMS mutant line of Arabidopsis carrying a recessive
mutation resulting in an interesting phenotype (see Note 1).
2. A mapping line of Arabidopsis different from the reference
ecotype.
3. Equipment for crossing:
(a) Dissection microscope or magnifying lens/headgear.
(b) Fine forceps.
4. Materials and growth conditions to manifest/distinguish
mutant phenotype.
5. Equipment to harvest and store tissues from selected F2 plants:
(a) Fine scissors/forceps.
(b) Microcentrifuge tubes or aluminum foil.
(c) Liquid nitrogen and −80 °C freezer.
2.2 Bulk Sequencing
1. Mortar, pestle, and liquid nitrogen.
2. Plant genomic DNA extraction kit.
3. Standard molecular biology laboratory.
2.3 Next-Generation
Mapping
1. A computer running a distribution of the Linux operating
system.
2. Software for mapping sequence reads to a reference (e.g., BWA,
Bowtie) [11, 12].
3. A compatible version of SAMtools (see Note 2) [13].
4. A web browser with Java Runtime Environment 1.5 or higher
enabled.
5. A reference sequence for the Arabidopsis genome in FASTA
format (see Note 3).
3
Methods
3.1 Generating a
Mapping Population
1. Grow mutant and mapping lines for synchronous flowering.
2. Cross mutant and mapping lines (see Note 4):
(a) Using tweezers and a dissecting microscope, remove
opened flowers and young buds on a mutant plant, and
then emasculate 1–3 late-stage unopened flower buds.
(b) Apply pollen from an opened flower of the mapping line
donor to the receptive stigma of an emasculated bud of the
mutant line.
(c) Label the cross and cut out the apical meristem to prevent
further flowers forming.
(d) Harvest the resulting silique as it browns, but before it
dehisces.
3. For even germination, allow 1–2 weeks for F1 seed to fully dry
and mature before sowing. If the cross was successful, none of
the F1 should show the mutant phenotype.
4. Grow the F1 plants to harvest F2 seed.
5. Sow the F2, grow, and phenotype (see Note 5).
6. Harvest equal quantities of tissue from 50 to 100 F2 plants
exhibiting the phenotype (see Note 6) and flash freeze in
liquid nitrogen (see Note 7).
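For planning purposes, the number of F2 seedlings to sow can be estimated from the expected Mendelian ratio. The sketch below assumes a single recessive mutation, so that about one quarter of the F2 show the phenotype, and it ignores germination failures and scoring errors; the function name is hypothetical.

```python
import math


def f2_plants_to_sow(mutants_needed: int, mutant_fraction: float = 0.25) -> int:
    """Approximate number of F2 plants required to expect a given number
    of plants showing the phenotype.

    mutant_fraction is the expected proportion of F2 plants with the
    phenotype (0.25 for a single recessive mutation).
    """
    if mutants_needed < 0 or not 0.0 < mutant_fraction <= 1.0:
        raise ValueError("invalid arguments")
    return math.ceil(mutants_needed / mutant_fraction)


# Pools of 50 to 100 phenotypic F2 plants, as suggested in step 6.
for needed in (50, 100):
    print(f"{needed} mutants -> sow about {f2_plants_to_sow(needed)} F2 plants")
```

Because this is only an expectation, sowing somewhat more seed than the estimate is prudent.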
3.2 DNA Extraction
and Preparation for
Sequencing
1. Grind pooled tissue in liquid nitrogen and extract genomic
DNA (see Notes 8 and 9).
2. Send the genomic DNA sample for sequencing on a NGS
platform (see Note 10).
3.3 Reference
Mapping and
Polymorphism Calling
Mapping your sequence reads to a genomic reference can be
accomplished using any software that produces an output file in the
NGS standard SAM/BAM (Sequence Alignment Map) file format
(see Note 11). Most NGS mapping programs are run on the command line in a UNIX environment (see Note 12). We present several
examples of how this would be accomplished in this section.
However, the examples provided, while completely sufficient, involve
third-party software under active development. Thus, particular
commands may change with successive revisions, and examples are
intended as a rough outline for how mapping to a reference genome
is accomplished using two popular programs. Likewise, the optimal
parameters supplied may vary depending on the nature of the
sequence data employed. The assumptions are made that the software programs mentioned are properly installed on the user’s
computer and that all sequence reads are concatenated into a single
file in FASTQ format (see Notes 13 and 14).
1. Obtain your sequence reads in the standard FASTQ format
from your sequencing center.
2. Download the reference genome for Arabidopsis (e.g., TAIR10)
(see Note 3).
3. Map the sequence reads to the reference genome using a
next-generation mapping tool that can return SAM/BAM
data. This is illustrated below using two different popular
tools, BWA v0.5.8c [11] and Bowtie v0.12.7 [12]. The “$” in
front of each command represents the command prompt and
should not be typed (see Note 15).
Example A: Using Bowtie against a single-read data file reads.fastq
(a) Generate an index for the Arabidopsis genome:
$ bowtie-build TAIR10_chr_all.fas TAIR10
(b) Align the reads to the reference and put to a SAM file:
$ bowtie -S TAIR10 reads.fastq alignment.sam
(c) Convert the SAM file to BAM file:
$ samtools view -bS -o alignment.bam alignment.sam
Example B: Using BWA against a single-read data file reads.fastq
(a) Generate an index for the Arabidopsis genome:
$ bwa index TAIR10_chr_all.fas
(b) Align the reads to the reference genome and put to a
temporary alignment file:
$ bwa aln TAIR10_chr_all.fas reads.fastq > reads.sai
(c) Generate the map file and put to compressed SAM
(see Note 16):
$ bwa samse TAIR10_chr_all.fas reads.sai reads.fastq | gzip > result.sam.gz
(d) Sort the results and create a BAM output file, alignment.bam
(see Note 17):
$ samtools view -bt TAIR10_chr_all.fas result.sam.gz | samtools sort - alignment
Example C: Using BWA against paired-end data files reads1.fastq
and reads2.fastq
(a) Generate an index for the Arabidopsis genome:
$ bwa index TAIR10_chr_all.fas
(b) Align each collection of read pairs to the reference genome
and put to a temporary file:
$ bwa aln TAIR10_chr_all.fas reads1.fastq > reads1.sai
$ bwa aln TAIR10_chr_all.fas reads2.fastq > reads2.sai
(c) Generate the alignment file, pairing reads together to find
best mapping positions and compress SAM output:
$ bwa sampe TAIR10_chr_all.fas reads1.sai reads2.sai reads1.fastq reads2.fastq | gzip > result.sam.gz
(d) Sort the results and create an output file, alignment.bam:
$ samtools view -bt TAIR10_chr_all.fas result.sam.gz | samtools sort - alignment
4. Using a version of SAMtools earlier than v0.1.16, take the BAM
file output from your mapping procedure and generate a “pileup” file
detailing polymorphism information using the command below (see Note 2).
$ samtools pileup -vcf reference.fasta alignment.bam > out.pileup
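Before uploading, it can be useful to inspect the pileup file directly. The sketch below reads one line of consensus pileup output, assuming the ten-column layout described in the SAMtools 0.1.x manual for "pileup -c" (chromosome, position, reference base, consensus base, consensus quality, SNP quality, maximum mapping quality, read depth, read bases, base qualities). Indel lines, which carry "*" in the reference column, are skipped. The class and function names are hypothetical, and the example line is invented for illustration.

```python
from typing import NamedTuple, Optional


class PileupSnp(NamedTuple):
    chrom: str
    pos: int                  # 1-based coordinate
    ref: str
    consensus: str            # IUPAC code of the consensus genotype
    consensus_quality: int
    snp_quality: int
    max_mapping_quality: int
    depth: int


def parse_pileup_line(line: str) -> Optional[PileupSnp]:
    """Parse one SNP line of consensus pileup output.

    Returns None for indel lines and for lines with too few columns.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10 or fields[2] == "*":
        return None
    return PileupSnp(
        chrom=fields[0],
        pos=int(fields[1]),
        ref=fields[2].upper(),
        consensus=fields[3].upper(),
        consensus_quality=int(fields[4]),
        snp_quality=int(fields[5]),
        max_mapping_quality=int(fields[6]),
        depth=int(fields[7]),
    )


# Invented example line: a G-to-A change covered by 20 reads.
example = "Chr1\t12345\tG\tA\t60\t60\t37\t20\t" + "a" * 20 + "\t" + "I" * 20
print(parse_pileup_line(example))
```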
3.4 Next-Generation
Mapping
1. Connect to the next-generation mapping (NGM) server at
the Bio-Array Resource, University of Toronto (http://bar.utoronto.ca/ngm).
2. Click “Start the Applet” and agree to the security dialogue
that pops up (see Notes 18 and 19).
3. Select the “SAM” tab if necessary and click “Select SAM file.”
Provide the “pileup” file that you created in step 4 above
using SAMtools.
4. Click on “Select Output File” and choose a name for the output
file. This will be given the extension of “.emap” and be created
on your local computer.
5. Click the “Start Processing” button.
6. When the applet finishes, scroll down to the Upload Data
section and click the “Choose File” button. Browse to the
“emap” file you just created and select.
7. Make sure the “Filter SNP data by quality criteria” radio box is
checked and click on “Upload and analyze” to begin mapping
(see Note 20).
8. The next “Map to chromosome” screen will present a histogram for each chromosome in the Arabidopsis genome and
the frequency of SNPs occurring along the length of each
chromosome, binned at 250 kb intervals (Fig. 1). You can
adjust the interval size at the top of the page by entering a new
bin size and selecting “Update histogram”; however, this is
rarely needed. Select the chromosome that possesses a region
depleted in SNPs by clicking the radio button under the chromosome identifier. Then click “Submit” (see Note 21).
9. The “Chastity belt partitioning” screen presents a default
selection of parameters and an initial attempt at localizing the
mutation (Fig. 2).
10. Click on the “Show detailed view” button at the top of the
screen. This will repeat the chastity separation process at a
variety of parameter values and present the results. Begin at
the left of the page where k = 5 and examine the ratios downward for each kernel size. Identify the ratio that provides a
distinctive peak using the smallest value for “k” and largest
“kernel” value possible. Click on the selection button to the
left of the best ratio (see Note 23).
11. Adjust the red guide bars to flank a region on either side of the
peak identified. A list of potential candidate SNPs will appear
under the SNP annotations section at the bottom of the
screen. Adjust the guide bars to encompass a generous region
around the identified peak.
12. Clear the “Filter SNP data by quality criteria” radio button and
click “Update quality filter.” Also, clear the checkboxes for
removing transversions and non-CDS mutations (see Note 24).
13. Examine the BLOSUM score for any non-synonymous substitutions. The larger the score, the more disruptive the amino
acid substitution is to the coding sequence (see Note 25).
14. Use the above information to formulate a prioritized short list
of candidate genes for validation at the bench (see Note 26).
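The per-chromosome histograms described in step 8 amount to counting SNP positions in fixed-width windows. The sketch below reproduces that binning offline under simple assumptions: positions are 1-based, the default window is 250 kb as on the NGM page, and the chromosome length and positions in the example are invented. It is an illustration of the binning idea, not the code used by the NGM server.

```python
def bin_snp_positions(positions, chrom_length, bin_size=250_000):
    """Count SNPs in consecutive windows of bin_size bases.

    positions are 1-based coordinates on a single chromosome. The last
    window may be shorter than bin_size.
    """
    if chrom_length < 1 or bin_size < 1:
        raise ValueError("chromosome length and bin size must be positive")
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in positions:
        if not 1 <= pos <= chrom_length:
            raise ValueError(f"position {pos} lies outside the chromosome")
        counts[(pos - 1) // bin_size] += 1
    return counts


# Invented positions on a 1 Mb toy chromosome.
counts = bin_snp_positions([1, 250_000, 250_001, 600_000], chrom_length=1_000_000)
for index, count in enumerate(counts):
    print(f"bin {index}: {count} SNP(s)")
```

Windows with markedly fewer SNPs than their neighbors are candidates for the nonrecombinant region, subject to the caveats in Notes 21 and 22.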
Fig. 1 Genome-wide natural variation patterns. Histograms of the highly reproducible frequency of SNPs found genome wide between the Columbia-0 (Col-0) and
Landsberg erecta (Ler) accessions (left; 250 kb bins). Nonrecombinant region examples for each of the five Arabidopsis chromosomes (right). In each example, all other
chromosomes would exhibit the default pattern of natural variation seen in the left panel. A vertical black dash marks the position of the causal mutation found in each
case (see Note 22)
Fig. 2 Chastity belt partitioning. 80 different “chastity threads” are smoothed estimations of SNP frequency
along the chromosome length for SNPs possessing discordant chastity scores within discretely defined
intervals (top panel) [2]. Smoothing is adjusted using the “kernel” parameter. Colors correspond to “k” different
clusters of similarity among threads as grouped by k-means clustering. Threads in the top panel that fall within
clusters containing allele frequency values corresponding to homozygous frequency (i.e., discordant chastity = 1) and heterozygous frequency (i.e., discordant chastity = 0.5) are presented in the second panel. The
ratio of the chastity belts in the second panel is used to localize the mutation (middle panel). Additional ratios
created by repeating the smoothing process using smaller “kernel” values are presented in the bottom two
panels
4
Notes
1. As these protocols were initially developed for relatively simple
cross designs, in complicated suppressor/enhancer screens or the
like, the genomic structure could be complicated with epistatic
domains and other features that make mapping very difficult.
Additionally, extensive backcrossing is not recommended, as in
our experience mappings are more successful without it. While
mapping by NGS can be applied to any two ecotypes of
Arabidopsis, as the physical mutation is identified through comparison to a reference sequence, it is important that the reference
genome used corresponds to the mutagenized line.
2. In order to process SNPs directly from the industry standard
Variant Call Format (VCF) (such as created by the SAMtools
‘mpileup’ function), users should download and run the Perl
script, BCF2NGM.pl, as provided on the NGM website, against
their VCF file before uploading the result to the NGM server.
In this case, the user should specify that SNPs are not to be filtered by NGM by unchecking the filter SNPs checkbox when
uploading data to the server. In order to make use of the Java
applet for preprocessing SAMtools ‘pileup’ data, users must use
a version of SAMtools (i.e., < 0.1.16) that uses the now deprecated ‘pileup’ function, as available here:
http://sourceforge.net/projects/samtools/files/samtools/.
3. As of print, the TAIR10 genomic reference could be obtained at:
ftp://ftp.arabidopsis.org/Genes/TAIR10_genome_release/TAIR10_chromosome_files/TAIR10_chr_all.fas.
4. For a guide to crossing Arabidopsis, see
http://arabidopsis.info/InfoPages?template=crossing;web_section=arabidopsis.
5. The success of a reciprocal cross, with pollen from a recessive
mutant applied to the mapping line, will not be revealed until
homozygotes for the mutant allele segregate in the F2.
However, a successful cross of a dominant allele from the
mutant will be revealed in the F1. The success of any cross can
also be confirmed in the F2 using PCR with primers differentiating between SNPs at several unlinked positions.
6. The selection of too many F2s for sequencing (e.g., >200) is
suspected to be detrimental to NGM. As the size of the nonrecombinant block surrounding the mutation of interest will
be inversely related to the number of F2 lines used, too many
lines could excessively narrow the region of homozygosity
surrounding the mutation and obfuscate discovery.
7. Intact seedlings can be harvested for this purpose or individual
leaves/leaf punches at later stages of development. An easy
way to obtain consistent leaf samples is to use a clean hole
punch or close the cap of a microfuge tube on a leaf blade to
similar effect. Duplicate pools of tissue insure against the need
to regrow and test F2 plants if subsequent steps fail.
8. Since only small quantities of tissue are required for each prep
(10–30 mg), multiple genomic preps can usually be produced
from each tissue pool, with the option to combine them if
individual yields do not meet expectations. High yields of
genomic DNA can be obtained using Gentra Puregene kits
(Qiagen), but the duration of some incubation steps (e.g., cell
lysis and RNAse incubations) may need to be optimized
depending on the tissues used to produce preps of sufficient
purity. Very clean genomic samples can be obtained using a
column-based extraction method such as the DNeasy Plant
Mini kit (Qiagen). The step recommended to minimize
genomic shearing should be used, and additional steps can be
employed to enhance eluted DNA concentration depending
on the requirements of the sequencing platform. Incubating
the elution buffer on the column for a limited period prior to
centrifuging, reusing the first eluate in the same column, or using
it to elute multiple columns can help ensure the genomic sample is
sufficiently concentrated. Generally speaking, the column-based method can help ensure high-enough purity when
extracting from samples of older or recalcitrant tissues.
9. It is important that sufficient measures are taken to avoid
contamination of the final genomic sample with DNA from
either wild-type line and to minimize the presence of false-positive F2s. Such problems can dilute the EMS and natural
variation polymorphism signals to a level that makes mutant
identification and even visualization of the nonrecombinant
region difficult or impossible.
10. We generally find that 50–80 million clusters generated by a
paired-end protocol with 40 bp read lengths (approximately
2–4 Gb) are sufficient data for mapping. This should provide
15–30× depth coverage of the ~120 Mb Arabidopsis genome.
As would be expected, mappings with high-quality,
high-coverage (>50×) sequence have produced excellent results.
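The relationship between sequence yield and depth can be checked with simple arithmetic: average depth is the total number of sequenced bases divided by the genome size. The sketch below assumes the approximate 120 Mb genome size quoted above and ignores unmapped or duplicate reads, so real depth will be somewhat lower; the function name is hypothetical.

```python
def depth_of_coverage(total_bases: int, genome_size: int = 120_000_000) -> float:
    """Average depth of coverage for a given sequence yield.

    total_bases is the yield of the run in bases; genome_size defaults
    to the approximate size of the Arabidopsis genome.
    """
    if total_bases < 0 or genome_size <= 0:
        raise ValueError("invalid arguments")
    return total_bases / genome_size


# Yields of 2 Gb and 4 Gb, the range quoted in this note.
for yield_bases in (2_000_000_000, 4_000_000_000):
    depth = depth_of_coverage(yield_bases)
    print(f"{yield_bases / 1e9:.0f} Gb -> about {depth:.0f}x")
```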
11. A BAM file (Binary Alignment Map) is a compressed SAM file
(Sequence Alignment Map). The SAM format has become an
industry standard for representing sequence alignment data,
much like FASTA or FASTQ that are standards for representing
sequence data. The SAM Format Specification (v1.4-r985) can
be found here: http://samtools.sourceforge.net/SAM1.pdf.
12. It is assumed that the user is familiar with running programs
from the command line in a UNIX terminal. In cases where a
Linux server is unavailable, using the “Terminal” application in
Apple’s OS-X may suffice, provided sufficient memory and
CPU power are available. Many good books and online
references have been written on using the command line in a
UNIX-based operating system such as Linux or OS-X. A Google
search for “UNIX primer” would be a good place to start.
13. FASTQ (or FASTA with quality) is a sequence representation
format similar to FASTA that includes an additional line of
quality information encoding an error probability for each
base in the sequence.
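As an illustration of how the quality line encodes error probabilities, the sketch below converts a quality string to per-base error probabilities using the Phred relationship p = 10^(-Q/10). It assumes the Sanger convention in which Q is the ASCII code of the character minus 33; some older Illumina pipelines used an offset of 64, so the offset should be confirmed with the sequencing center. The function name is hypothetical.

```python
def phred_to_error_probabilities(quality_string: str, offset: int = 33) -> list:
    """Convert a FASTQ quality string to per-base error probabilities.

    Each character encodes a Phred score Q = ord(character) - offset,
    and the error probability is 10 ** (-Q / 10).
    """
    probabilities = []
    for character in quality_string:
        score = ord(character) - offset
        if score < 0:
            raise ValueError(f"character {character!r} is below the offset")
        probabilities.append(10 ** (-score / 10))
    return probabilities


# "I" encodes Q40, "5" encodes Q20, and "!" encodes Q0 at offset 33.
for character, p in zip("I5!", phred_to_error_probabilities("I5!")):
    print(f"{character!r}: error probability {p:g}")
```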
14. It may frequently be less than optimal to map all sequence
reads using a single large FASTQ file. This approach may not
properly utilize all available computational resources (e.g., by
distributing the task by mapping using many smaller FASTQ
files and merging the results). Nevertheless, it is presented as
such for the sake of brevity. It may benefit computer resources
to apply these tools in a distributive manner against many
smaller files containing subsets of sequence reads. Moreover,
large data sets, such as from a lane of HiSeq, can crash the
mapping tools with too much data.
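One way to divide a large FASTQ file into smaller pieces is sketched below. It assumes the common four-line record layout (header, sequence, "+" line, qualities) with no line wrapping, and it yields lists of lines that could each be written to a separate file and mapped independently. The function name and the toy records are hypothetical.

```python
import io
from itertools import islice


def fastq_chunks(lines, records_per_chunk):
    """Yield successive lists of lines, each holding up to
    records_per_chunk four-line FASTQ records.
    """
    if records_per_chunk < 1:
        raise ValueError("records_per_chunk must be at least 1")
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, 4 * records_per_chunk))
        if not chunk:
            return
        if len(chunk) % 4:
            raise ValueError("input ends with a truncated FASTQ record")
        yield chunk


# Three toy records split into chunks of two records each.
demo = io.StringIO(
    "@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n@r3\nGGCC\n+\nIIII\n"
)
for index, chunk in enumerate(fastq_chunks(demo, records_per_chunk=2)):
    print(f"chunk {index}: {len(chunk) // 4} record(s)")
```

For paired-end data, both files of a pair would need to be split with the same chunk size so that mates stay in matching pieces.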
15. In the commands listed, the use of “>” is a redirect operator
that directs the output of the program to a file, while the “|”
is a pipe operator which directs the output of the program to
another program. Switches are parameters that control program execution and are preceded by a dash “-” followed by a
letter indicator. See individual program documentation for
information on available switches.
16. Output in the examples is compressed and put to result.sam.gz
rather than to a file with the conventional BAM extension
(i.e., result.bam; BAM is a compressed SAM file) so that
the file is not clobbered by the next command, which sorts the
alignments and creates the BAM file that is actually used (alignment.bam).
17. The use of a “-” in the “samtools sort” command in this line
tells SAMtools that data is provided from another program
using the “|” operator. The word “alignment” is a user-provided prefix for the BAM file to be generated by SAMtools.
18. An applet has been built into NGM to allow the processing of
files that are potentially too large to transfer over a network.
The applet simply calculates a statistic based on allele composition and appends it to the pileup file before trimming and
compressing the data. Future revisions of NGM may eliminate
the applet in favor of uploading a single VCF file, such as can
be generated by the “samtools mpileup” command.
19. If difficulties are experienced using the Java applet, a Perl script
can be downloaded from the NGM site and used to process
SAMtools output instead. To run the Perl script against your
SAMtools output (e.g., output.pileup), download the Perl
script SAM2NGM.pl, ensure it is executable with
“chmod +x SAM2NGM.pl”, and run:
$ ./SAM2NGM.pl output.pileup
This will create the file output.pileup.emap for upload to
the NGM server using the “Choose File” button.
20. While NGM is typically very robust against extraneous
false-positive SNP calls, it is important to filter your SNPs for
the identification of the nonrecombinant region. In circumstances where the nonrecombinant region is difficult to
identify, application of aggressive filtering at this stage may
help. Also, it is not recommended to filter your SNPs prior to
their provision to NGM. Although NGM will filter the SNPs,
it stores them in memory and allows them to be considered in
the final stage of mapping to account for circumstances involving poor quality data or low coverage that may exclude the
actual causal mutation from the initial analysis. In scenarios
with low or overly abundant sequence data, one may want
to adjust the min/max depth parameters for SNP calling.
Similarly, quality score thresholds can be adjusted to
slacken or tighten SNP pre-filtering.
21. In situations where the nonrecombinant region is not readily
apparent, you may compare the histograms obtained against
the default-expected natural variation histograms between
Columbia-0 (Col-0) and Landsberg erecta (Ler) displayed in
Fig. 1 or examine histograms returned by the three NGM
examples provided on the NGM home page. The patterns of
natural variation observed across each chromosome are highly
reproducible.
22. In relation to the various example nonrecombinant regions in
Fig. 1: Chromosome 1 provides an ideal scenario of good
recombination rates on either side of the nonrecombinant
region. Peak identification for this mutant is shown in Fig. 2.
Chromosome 2 shows a large-scale drop in recombination
towards the tail end, while chromosome 3 (and chromosome 1)
illustrates the parabolic recombination pattern in SNP frequency that is frequently found. Chromosome 4 shows a
complicated mapping scenario in which recombination
dropped considerably across the chromosome, and chromosome 5 shows a scenario where poor sequence quality and
the presence of false-positive F2s in the bulked population
obscured the identification of the nonrecombinant region.
23. It is important not to aggressively choose a very small kernel
right away as the smaller the kernel size chosen, the more sparse
the data that is incorporated in the kernel density estimation.
This can result in artifact effects that appear as “peak shifts.” If
a peak exists at a larger kernel size and is shifted away from its
original position when a smaller kernel is employed, this result
should be disregarded as an artifact. The rule of thumb is to
use the largest kernel size possible with the smallest cluster size
(k > 3) in order to return a distinctive peak. In some cases,
chastity belt partitioning can fail to return a distinctive peak.
If this is the case but the initial SNP histogram possessed a
distinctive nonrecombinant region, it is advised that the user
select a generous region surrounding the nonrecombinant
region identified in the histogram. When this region exhibits a
parabolic pattern, the causal mutation is typically found near
the bottom of the parabola.
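The effect of kernel size on a smoothed estimate can be explored with a generic Gaussian kernel density estimate. The sketch below is a textbook estimator and is not the implementation used by NGM; the "bandwidth" argument plays the role of the kernel parameter only by analogy, and the positions are invented.

```python
import math


def gaussian_kde(positions, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of positions at each
    point in grid.

    A smaller bandwidth follows individual data points more closely, so
    sparse data produce a spikier and less stable estimate.
    """
    if bandwidth <= 0 or not positions:
        raise ValueError("need a positive bandwidth and at least one position")
    norm = 1.0 / (len(positions) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in positions)
        for x in grid
    ]


# Invented SNP positions (in kb) and an evaluation grid every 50 kb.
snps = [100, 120, 130, 400, 410, 800]
grid = list(range(0, 1001, 50))
for bandwidth in (100, 20):
    density = gaussian_kde(snps, grid, bandwidth)
    peak = grid[density.index(max(density))]
    print(f"bandwidth {bandwidth} kb: highest density near {peak} kb")
```

Comparing the estimates at several bandwidths, as the detailed view does for kernel sizes, helps to separate stable peaks from artifacts of sparse data.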
24. These measures allow for a generous inclusion of all SNPs
occurring within the targeted region. Removing the filters is
useful in cases where poor quality data may have been removed
but correspond to false-negative SNPs. Transversions may
occur in ~1 % of EMS mutations and may be informative in
rare instances. Similarly, non-CDS mutations in the form of
cryptic splice sites are also possible and will be annotated as
such by NGM.
25. The BLOSUM 100 score provides a measure of the effect the
amino acid substitution will have, with larger numbers having
a more adverse effect. Also, the discordant chastity should ideally be as close to 1.0 (100 %) as possible. However, it will be
lowered by false-positive F2s as well as sequencing and mapping errors. The default value of 0.85 is conservative and usually sufficient, with values >0.95 commonly seen around the
causal mutation.
26. In cases with abundant candidate genes, it is advised that the
researcher performs a multiple sequence alignment using
protein sequence from several orthologs of the target gene
pulled from various plant relatives. Genes can then be ranked
for priority based on whether the position at which the amino acid
substitution occurs in the candidate gene is conserved among
plant species and more likely to have a phenotypic effect.
Acknowledgments
The authors thank Peter McCourt, Nicholas J. Provart, Pauline W.
Wang, Danielle Vidaurre, George Stamatiou, Robert Breit, Dario
Bonetta, Jianfeng Zhang, Pauline Fung, and Yunchen Gong for
their help in the development of NGM. We would also like to
express our gratitude to the McCourt and Desveaux Labs
(University of Toronto), Haughan Lab (University of British
Columbia), and Bonetta Lab (University of Ontario Institute of
Technology) for their provision of sequence data. This work was
funded through grants by the Natural Sciences and Engineering
Research Council of Canada to D.S. Guttman and D. Desveaux.
Next-Generation Mapping
References
1. Schneeberger K et al (2009) SHOREmap:
simultaneous mapping and mutation identification by deep sequencing. Nat Methods
6:550–551
2. Austin RS et al (2011) Next-generation mapping of Arabidopsis genes. Plant J 67:715–725
3. Uchida N et al (2011) Identification of EMS-induced causal mutations in a non-reference
Arabidopsis thaliana accession by whole genome
sequencing. Plant Cell Physiol 52:716–722
4. Sarin S et al (2010) Analysis of multiple ethyl
methanesulfonate-mutagenized Caenorhabditis
elegans strains by whole-genome sequencing.
Genetics 185:417–430
5. Blumenstiel JP et al (2009) Identification of
EMS-induced mutations in Drosophila melanogaster by whole-genome sequencing. Genetics
182:25–32
6. Smith DR et al (2008) Rapid whole-genome
mutational profiling using next-generation
sequencing technologies. Genome Res
18:1638–1642
7. Zuryn S et al (2010) A strategy for direct mapping and identification of mutations by whole-genome sequencing. Genetics 186:427–430
8. Irvine DV et al (2009) Mapping epigenetic
mutations in fission yeast using whole-genome
next-generation sequencing. Genome Res
19:1077–1083
9. Michelmore RW, Paran I, Kesseli RV (1991)
Identification of markers linked to disease-resistance genes by bulked segregant analysis: a
rapid method to detect markers in specific
genomic regions by using segregating
populations. Proc Natl Acad Sci U S A
88:9828–9832
10. Lister R, Gregory B, Ecker J (2009) Next is
now: new technologies for sequencing of
genomes, transcriptomes, and beyond. Curr
Opin Plant Biol 12:107–118
11. Li H, Durbin R (2009) Fast and accurate short
read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760
12. Langmead B et al (2009) Ultrafast and
memory-efficient alignment of short DNA
sequences to the human genome. Genome
Biol 10:R25
13. Li H et al (2009) The sequence alignment/
map format and SAMtools. Bioinformatics
25:2078–2079
Chapter 18
Chemical Fingerprinting of Arabidopsis Using Fourier
Transform Infrared (FT-IR) Spectroscopic Approaches
András Gorzsás and Björn Sundberg
Abstract
Fourier transform infrared (FT-IR) spectroscopy is a fast, sensitive, inexpensive, and nondestructive
technique for chemical profiling of plant materials. In this chapter we discuss the instrumental setup, the
basic principles of analysis, and the possibilities for and limitations of obtaining qualitative and semiquantitative information by FT-IR spectroscopy. We provide detailed protocols for four fully customizable
techniques: (1) Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS): a sensitive and
high-throughput technique for powders; (2) attenuated total reflectance (ATR) spectroscopy: a technique
that requires no sample preparation and can be used for solid samples as well as for cell cultures; (3) microspectroscopy using a single element (SE) detector: a technique used for analyzing sections at low spatial
resolution; and (4) microspectroscopy using a focal plane array (FPA) detector: a technique for rapid
chemical profiling of plant sections at cellular resolution. Sample preparation, measurement, and data
analysis steps are listed for each of the techniques to help the user collect the best quality spectra and prepare
them for subsequent multivariate analysis.
Key words Fourier transform infrared spectroscopy, Methods, Microspectroscopy, Chemical
composition, Multivariate analysis, Plant, Attenuated total reflectance, Diffuse reflectance, Focal
plane array detector
1 Introduction
It is not surprising that Fourier transform infrared (FT-IR)
spectroscopy has gained popularity in plant sciences in the past
years [1–7] as it has numerous advantages in the chemical analysis
of a wide range of plant materials. It is nondestructive, fast, inexpensive, sensitive, and easy to customize and automate. It provides
information on the entire chemical profile of the investigated
sample and can be used on intact tissues for in situ analysis. With
microscopic accessories, even the spatial distribution of compounds
can be studied and visualized.
FT-IR spectroscopy probes functional groups in the sample.
In plants, which contain a mixture of chemically related components,
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_18, © Springer Science+Business Media New York 2014
this information is rarely diagnostic for a particular compound.
Instead, FT-IR spectroscopy provides a chemical fingerprint of the
sample composition. As such, it is well suited for high-throughput
screens aiming to classify a large number of samples according to
their overall chemical profile or to identify samples with modified
chemical composition (i.e., mutant screens). Although some qualitative and quantitative information is gained about the chemical
composition of the sample, complementary analytical techniques
are required for more detailed information, for example, Raman,
UV–VIS, and NMR spectroscopies; wet chemical analyses; or mass
spectrometry. The present chapter will focus on FT-IR spectroscopy, both as a high-throughput technique (diffuse reflectance and
attenuated total reflectance measurements, Subheadings 1.1 and
1.2) and as a low-throughput tool for spatially resolved [8] sampling (microspectroscopy, Subheading 1.3). More advanced uses
of FT-IR spectroscopy, for example, two-dimensional correlation
spectroscopy [9, 10] and multivariate imaging [7], will not be
described here.
FT-IR spectroscopy is based on molecular vibrations, i.e., the
displacement of atoms from their equilibrium positions. The different ways a molecule can vibrate are called vibrational modes.
Mid-infrared (mid-IR) radiation (the 400–4,000 cm−1 region of
the electromagnetic spectrum) often causes a transition of the
molecular vibrational modes from the ground state to first excited
state (fundamental transition). A detector measures the intensity
difference between the original radiation (I0) and the radiation
after interaction with the sample (I). The spectrum is the plot of
intensity changes as a function of frequency (or wavenumber, as is
often used in FT-IR spectroscopy, with units of cm−1; see Note 1).
Qualitative information is obtained by analyzing the positions of
peaks (bands) in the spectrum. Many vibrational modes involve the
displacement of only a few atoms, while the rest of the molecule
can be considered relatively stationary. The position of a band is
therefore characteristic for a set of atoms and bonds (chemical
functional groups). These are called characteristic group frequencies and are traditionally given in charts for a set of compounds
([11], see Note 2). The exact position of the bands depends on
several factors, including bond strength and the reduced mass of
the atoms involved (see Note 3). Thus, a functional group will
produce a band within a frequency range, with the exact position
depending on the rest of the molecule in which this functional
group exists. Since the positions of bands in the spectrum are
indicative of the functional groups present in the sample, positional
changes (shifts) of bands indicate changes that affect that particular
functional group in the molecule. These shifts can be the results
of chemical or structural changes, for example, protonation/
deprotonation, formation or breakage of H-bonds, and protein
alpha-helix/beta-sheet structural changes. In addition, composite
vibrational modes of a molecule (where larger sets of atoms or the
FT-IR Spectroscopic Techniques
Fig. 1 Reference spectra of pectin (solid line), cellulose (dashed line), lignin (dotted line) and xylan (dash-dot line)
illustrating band positions, widths and overlaps for major cell wall components
entire molecule vibrates) can also give rise to bands in the mid-IR
spectrum. Taken together, this means that every infrared active
chemical compound (see Note 4) has an infrared spectroscopic fingerprint, which is unique and can be used for qualitative analysis or
detection. Plant material is mostly composed of cellulose, hemicelluloses, lignins, pectins, lipids, waxes, and proteins (see Note 5).
Many of these compounds contain similar functional groups (such
as –C–H, –C–O, and –O–H) that reside in very similar chemical
environments. This results in the broadening and overlaps of the
bands, which are seldom diagnostic and thus difficult to assign to a
particular compound (Fig. 1, see Note 6). Therefore, care must be
taken not to over-interpret the qualitative information from plant
FT-IR spectra.
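The dependence of band position on bond strength and reduced mass can be illustrated with the simplest textbook model, a diatomic harmonic oscillator. The sketch below is only an illustration under that assumption: the function names are hypothetical, and the force constant used for the carbon monoxide example is an approximate illustrative value, not a number taken from this chapter. The same block also converts the mid-IR wavenumber limits to wavelengths.

```python
import math

C_CM_PER_S = 2.99792458e10   # speed of light in cm/s
AMU_KG = 1.66053907e-27      # atomic mass unit in kg


def wavenumber_to_wavelength_um(wavenumber_cm: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in micrometres."""
    return 1.0e4 / wavenumber_cm


def reduced_mass_kg(m1_amu: float, m2_amu: float) -> float:
    """Reduced mass of two atoms, converted from atomic mass units to kg."""
    return (m1_amu * m2_amu) / (m1_amu + m2_amu) * AMU_KG


def harmonic_wavenumber(force_constant_n_per_m: float, mu_kg: float) -> float:
    """Band position (cm^-1) of a diatomic harmonic oscillator."""
    angular_frequency = math.sqrt(force_constant_n_per_m / mu_kg)
    return angular_frequency / (2.0 * math.pi * C_CM_PER_S)


# Mid-IR limits quoted in the text, expressed as wavelengths.
short_limit_um = wavenumber_to_wavelength_um(4000.0)
long_limit_um = wavenumber_to_wavelength_um(400.0)

# Illustrative force constant (assumption): roughly that of carbon monoxide.
K_CO = 1857.0  # N/m
band_12c = harmonic_wavenumber(K_CO, reduced_mass_kg(12.000, 15.995))
band_13c = harmonic_wavenumber(K_CO, reduced_mass_kg(13.003, 15.995))

print(f"mid-IR range: {short_limit_um} to {long_limit_um} um")
print(f"12C-16O band: {band_12c:.0f} cm^-1, 13C-16O band: {band_13c:.0f} cm^-1")
```

With the same force constant, the heavier isotopologue absorbs at a lower wavenumber, which is the kind of shift the text attributes to changes in the atoms or bonds of a functional group.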
Quantitative information is gained from band intensities via
the Bouguer–Beer–Lambert law (see Note 7). However, due to
natural and experimental variations in the case of plant materials,
spectra must be normalized before comparing different samples
(see Subheading 3.1.3 and notes therein for more details). As a
result of normalization, the observed compositional changes always
reflect proportional changes when comparing samples, and not
absolute amounts (semiquantitative analysis, see Note 8). After
normalization, the average spectra of different samples can be
compared, and differences in band intensities can be estimated
from band heights or areas or by creating a differential spectrum.
However, such comparisons are inherently problematic because
they do not consider variations between replicates (see Note 9).
Moreover, as mentioned above most bands in the spectra of plant
material are not diagnostic on their own (Fig. 1, see Note 6), and
therefore, a set of bands, or preferentially the whole spectrum,
should be used for interpretation. Consequently, the best way to
analyze FT-IR spectra of plant tissues is with multivariate tools, which
can handle experimental variation and use the full spectral region
in the analysis. An unsupervised principal component analysis
(PCA [12]) is often enough when the data are of high quality and
differences between samples are substantial. When that is not the
case, however, the initial PCA should be followed up with
a more powerful but supervised analysis, such as orthogonal projections to latent structures discriminant analysis (OPLS-DA [13]).
The multivariate analysis will reveal if there are outliers among the
samples that distort the data and how effectively the FT-IR spectroscopic profiles can be used to classify samples and the bands that
contribute to the differences between samples.
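To make the idea of an unsupervised PCA on spectra concrete, the following minimal sketch extracts the first principal component by power iteration in plain Python. It is not the SIMCA-P workflow used by the authors: the function names are hypothetical, the six four-point "spectra" are invented for illustration, and real data sets would normally be handled by a dedicated multivariate package.

```python
import math


def mean_center(rows):
    """Subtract the column mean (average spectrum) from every spectrum."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of PC1 using power iteration on X^T X."""
    centered = mean_center(rows)
    n_vars = len(centered[0])
    # Start from the centred spectrum with the largest norm.
    start = max(centered, key=lambda r: sum(x * x for x in r))
    norm = math.sqrt(sum(x * x for x in start))
    v = [x / norm for x in start]
    for _ in range(iterations):
        t = [sum(x * vi for x, vi in zip(row, v)) for row in centered]
        w = [sum(row[j] * ti for row, ti in zip(centered, t))
             for j in range(n_vars)]
        norm = math.sqrt(sum(wj * wj for wj in w))
        if norm == 0.0:
            break
        v = [wj / norm for wj in w]
    scores = [sum(x * vi for x, vi in zip(row, v)) for row in centered]
    return v, scores


# Invented example: three "wild-type" and three "mutant" spectra,
# each reduced to four band intensities.
spectra = [
    [0.30, 0.80, 0.40, 0.20],
    [0.31, 0.79, 0.41, 0.20],
    [0.29, 0.81, 0.39, 0.21],
    [0.30, 0.60, 0.55, 0.20],
    [0.31, 0.61, 0.54, 0.21],
    [0.29, 0.59, 0.56, 0.20],
]
loadings, scores = first_principal_component(spectra)
print("PC1 loadings:", [round(x, 3) for x in loadings])
print("PC1 scores:  ", [round(s, 3) for s in scores])
```

In this toy data set the two groups receive scores of opposite sign on the first component, and the loadings point to the two bands that differ between them. The sign of a principal component is arbitrary, so only the separation matters.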
1.1 Diffuse Reflectance Infrared Fourier Transform Spectroscopy
Diffuse Reflectance Infrared Fourier Transform Spectroscopy
(DRIFTS), in a simplistic view, means the analysis of powders by
focusing infrared light onto the powder and collecting the diffusely
reflected (scattered) light while minimizing the contribution of
specular reflection. Thus, DRIFTS spectra can be considered more
like transmission spectra than reflectance spectra.
The major advantages of DRIFTS are sensitivity, speed, and
cost. Sample preparation is relatively straightforward and consists of
homogenization by ball milling and manual grinding, together with mixing with an IR transparent diluter (typically KBr). As a consequence of
the addition of the diluter, the sample does not remain completely
intact for further analysis. In addition, homogenization must be
performed in a standardized way as it will affect the final spectra
(particle size effects [14]; degree of polymerization [15]). On the
other hand, it is possible to automate measurements via, e.g., sample carousels or well plates, thereby increasing throughput considerably and making DRIFTS a powerful technique that excels at
rapid chemotyping and screening. It should be noted that due to
the sensitivity of DRIFTS, normal variation (biodiversity) can be
substantial in the spectra. Filtering this variation away usually
requires standardization and powerful multivariate analysis (see above
and also Subheading 3.1.3).
1.2 Attenuated Total Reflectance
Attenuated total reflectance (ATR) is a surface-sensitive infrared
spectroscopic technique. Essentially, the sample is placed on an infrared transparent crystal with high refractive index (internal reflection
element, IRE). Infrared radiation is totally internally reflected at the
interface between the IRE and the sample, and the radiation penetrates the sample in the form of an evanescent wave that decreases in
intensity exponentially with distance. The exact penetration depth
depends on the angle of incidence, the refractive index differences
between the sample and the IRE, and the wavelength of the light.
For plant materials on a Ge, ZnSe, or diamond IRE, the penetration
depth is a few microns at most. Often, the IRE is shaped in such a way
that the infrared light makes multiple reflections before exiting
towards the detector. If the sample absorbs the light, then the intensity is attenuated, hence the name of the technique. In order to get
acceptable signals, the sample should make good contact
with the IRE. Contact is not a problem for most liquids that wet the
surface of the IRE, but solids require pressure to be applied to force
the material against the IRE.
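The dependence of penetration depth on angle of incidence, refractive indices, and wavelength can be sketched with the standard expression for the evanescent-wave penetration depth, d_p = λ / (2π n1 √(sin²θ − (n2/n1)²)), where n1 is the refractive index of the IRE and n2 that of the sample. The function name is hypothetical, and the refractive indices below (about 2.4 for diamond, 4.0 for Ge, 1.5 for the sample) are approximate illustrative values, not instrument specifications.

```python
import math


def penetration_depth_um(wavenumber_cm, n_ire, n_sample, angle_deg=45.0):
    """Evanescent-wave penetration depth in micrometres.

    wavenumber_cm: band position in cm^-1
    n_ire, n_sample: refractive indices of the IRE and the sample
    angle_deg: angle of incidence inside the IRE
    """
    wavelength_um = 1.0e4 / wavenumber_cm
    term = math.sin(math.radians(angle_deg)) ** 2 - (n_sample / n_ire) ** 2
    if term <= 0.0:
        # Below the critical angle there is no total internal reflection.
        raise ValueError("angle of incidence is below the critical angle")
    return wavelength_um / (2.0 * math.pi * n_ire * math.sqrt(term))


# Approximate, illustrative refractive indices (assumptions).
N_DIAMOND, N_GE, N_SAMPLE = 2.4, 4.0, 1.5

for label, n_ire in (("diamond", N_DIAMOND), ("Ge", N_GE)):
    for wavenumber in (3000.0, 1000.0):
        depth = penetration_depth_um(wavenumber, n_ire, N_SAMPLE)
        print(f"{label:8s} {wavenumber:6.0f} cm^-1: {depth:.2f} um")
```

Under these assumptions the depth stays in the range of a few microns or less, it is smaller for the higher-index Ge element, and it increases towards lower wavenumbers, so band intensities in ATR spectra are not weighted equally across the spectrum.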
ATR objectives are available for FT-IR microscopes (increasing
the possible spatial resolution, see Subheading 1.3). In the absence
of a microscopy accessory, standard FT-IR spectrometers can be
equipped with more affordable standalone ATR accessories that
have their own built-in cameras. These allow the identification of
various areas on the sample surface (such as different areas of a leaf)
for measurements, but no mapping possibilities.
ATR measurements are nondestructive (unless the required
pressure damages the plant tissue). Sample preparation is simple.
Since it does not require drying, milling, or mixing, the sample can
be placed as it is onto the IRE and pressed against it. However,
ATR only provides information about the chemical composition of
a 1–5 μm layer of the sample that is in contact with the IRE, which
may not be ideal for certain applications. Moreover, the ATR signal
is usually weaker than that obtainable via DRIFTS, and the
sensitivity is therefore generally lower. ATR is also more
difficult to automate than DRIFTS measurements, and therefore,
it may be less suited for large-scale experiments.
1.3 FT-IR Microspectroscopy
A microscopy accessory attached to an infrared spectrometer
enables spatially resolved sampling and visualization of chemical
profiles across sections or surface tissues. Several measurement
modes are available for infrared microscopes, for example, transmission (see Note 10), reflection (see Note 11), and ATR (see Note
12), which makes this a very versatile technique. The ATR technique provides in theory the highest spatial resolution due to the
refractive index of the IRE (and thus the numerical aperture of the
ATR objective). However, the force required for good contact
between the IRE and the sample (Subheading 1.2) inevitably
causes substantial damage to plant tissues from the small tip of the
ATR objective. Furthermore, tissue fragments stuck on the IRE
require frequent and rigorous cleaning to limit the risks of carryover contamination, making this technique very impractical for
mapping applications. Therefore, we will here only focus on transmission and reflection mode microspectroscopy.
The detectors available for infrared microscopes are the standard single element (SE) detector and the more advanced focal
plane array (FPA) detector. Both are liquid nitrogen-cooled
HgCdTe (MCT) devices, but they have very different capabilities
Fig. 2 Visible light snapshots of a 20 μm thick Col-0 Arabidopsis stem, showing the view field of the infrared
microscope. Scale bars are 50 μm. (a) Unmodified view field. The single element (SE) detector records a single
spectrum across the entire area. This spectrum represents the average chemical composition of all cell types
within this area. A 64 × 64 focal plane array (FPA) detector will record 4,096 spectra across the same area.
Thus, spectra from individual cells can be extracted. (b) Knife-edge apertures applied to limit the view field for
the SE detector. Only the diagonal rectangular area in the center is measured. Consequently, the single spectrum recorded by the SE detector represents the average composition of xylem fibers and vessels
and operational routines. The SE detector records a single spectrum
of the entire view field, which represents the average chemical
composition of that area. Knife-edge apertures are generally applied
to limit the image area to be measured (Fig. 2). The smallest area
that can routinely be measured this way is ca. 50 μm × 50 μm
(see Note 13). An FPA detector consists of an array (see Note 14)
of miniature detector elements in the focal plane of the infrared
radiation (pixels). Each of these detector elements records a
spectrum individually and independently from the other elements
(no pixel cross talk). This means that thousands of spectra are
recorded simultaneously across the view field, much like pixels
building up an image. This type of measurement is therefore called
imaging, as opposed to mapping, which means that a series of spectra are recorded consecutively. The view field of a typical FPA
detector with 64 × 64 detector elements is ca. 175 μm × 175 μm
(see Note 15), and 4,096 spectra are recorded across this area.
Thus, the size of a detector element is about 2.7 μm × 2.7 μm. In
reality, however, the spatial resolution is about 10–20 μm for applications on plant samples (see Note 16). Since a spectrum is recorded
for each “pixel” of the image, FPA measurements will provide
cellular resolution without the need of apertures. This in turn
means better quality spectra at much higher speed.
SE detectors are easy to operate, require less computing power,
and are generally less prone to errors (see Note 17), while FPA
detectors are more sensitive to disturbances such as vibrations
(see Note 18) and software problems (see Note 19). Since the differences between SE and FPA measurements are substantial not
only in theory but also in practice, separate protocols are given for
their use (Subheadings 3.3 and 3.4).
FPA data has traditionally been evaluated by so-called heat
mapping, which means plotting the intensity (integral) or intensity
ratio of certain bands in the spectrum. This visualization is easy to
perform and interpret, but it should be noted that it is unreliable
in the case of plant materials. Heat maps suffer from scattering
effects, poorly resolved (overlapping) and nonspecific (nondiagnostic) bands, and varying baseline slopes and pixel coverages, and are
prone to reflect artifacts rather than the true distribution of a
compound [7]. Alternatively, images can be created based on
multivariate analyses, using the spatial information in the data to
visualize the results (multivariate imaging [7]). Multivariate imaging is a powerful tool, which today represents state-of-the-art data
analysis in FT-IR spectroscopic profiling of plants. Protocols for
multivariate imaging are not provided here, since it is a complex
process and not yet a routine approach.
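For readers unfamiliar with what heat mapping computes, the following sketch integrates one band for every pixel of a small image. It only illustrates the calculation; the caveats given above about the reliability of heat maps for plant materials apply fully. The function names, the 2 × 2 image, and the band limits are invented for the example.

```python
def band_integral(wavenumbers, absorbances, low, high):
    """Trapezoidal area under a spectrum between low and high (cm^-1)."""
    points = sorted(
        (w, a) for w, a in zip(wavenumbers, absorbances) if low <= w <= high
    )
    return sum(
        0.5 * (a0 + a1) * (w1 - w0)
        for (w0, a0), (w1, a1) in zip(points, points[1:])
    )


def heat_map(wavenumbers, image, low, high):
    """Band integral for every pixel; image is indexed as [row][column]."""
    return [
        [band_integral(wavenumbers, spectrum, low, high) for spectrum in row]
        for row in image
    ]


# Invented 2 x 2 "image": each pixel holds a four-point spectrum.
wavenumbers = [1000.0, 1010.0, 1020.0, 1030.0]
image = [
    [[0.0, 1.0, 1.0, 0.0], [0.0, 0.5, 0.5, 0.0]],
    [[0.0, 0.0, 0.0, 0.0], [0.0, 2.0, 2.0, 0.0]],
]
intensity_map = heat_map(wavenumbers, image, 1000.0, 1030.0)
for row in intensity_map:
    print(row)
```

Because each pixel is reduced to a single number from a single band, any baseline slope, scattering contribution, or overlapping band inside the integration limits is carried straight into the map, which is why the text recommends multivariate approaches instead.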
2 Materials
2.1 Chemicals
Since FT-IR spectroscopy and microspectroscopy do not require
staining or extraction, the only chemicals required are IR spectroscopy grade KBr (or other infrared transparent diluter) for DRIFTS
(Subheading 3.1.1), liquid N2 to cool HgCdTe (MCT) detectors
(Subheadings 3.3.2 and 3.4.2), and sample carriers (BaF2, CaF2,
ZnSe windows; gold and aluminum mirrors) for microspectroscopy measurements (Subheadings 3.3.1 and 3.4.1). The reference
spectra shown in Fig. 1 were recorded according to the DRIFTS
protocol given in Subheading 3.1, using the following compounds:
lignin isolated from wild-type poplar (P. tremula × P. alba, courtesy
of J. Ralph, University of Wisconsin, Madison, WI, USA), xylan
from birchwood (Sigma-Aldrich, http://www.sigmaaldrich.com/),
cellulose powder for thin layer chromatography (CAMAG,
http://www.camag.com/), and pectin from citrus peel (Sigma-Aldrich, http://www.sigmaaldrich.com/).
2.2 Instrumentation and Equipment
The protocols provided were developed on the following Bruker
instruments: IFS 66 v/S, Equinox 55, and Tensor 27, equipped
with Hyperion 3000 microscopy accessories and 64 × 64 focal
plane array (FPA) detectors (see Note 20). Experimental settings
are given for these systems, but other systems should behave
similarly. Thus, the values given here can be used initially, but fine-tuning of measurement parameters (number of scans, spectral
resolution, etc.) is needed for other instruments and samples.
In addition to standard laboratory equipment, a desiccator, a
vibrational ball mill (see Note 21), and mortar and pestle (see Note
22) are needed for DRIFTS measurements (see Subheading 3.1.1).
A cryomicrotome (Microm HM 505 E) and a vibratome (Leica VT
1000 S) were used to prepare sections for microspectroscopy
(see Subheading 3.3.1). For storage and measurement of sections,
standard microscopy glass slides (see Note 23), infrared transparent windows (see Note 24), or mirrors (see Note 25) are needed.
2.3 Software
The measurement and data analysis steps have been developed on
standard PCs running Microsoft Windows XP and Windows 7
operating systems, using Bruker’s OPUS Software (Bruker Optik
GmbH, http://www.brukeroptics.com/, versions 5.5–6.5, with
notes detailing changes relevant to the newly released version 7).
Other manufacturers provide their own software bundles. However,
finding and adjusting parameters (number of scans, spectral resolution, etc.) in those software bundles should be straightforward.
For data analysis, most software packages allow data export in ASCII format
(data point tables in OPUS), which can then be opened in, e.g.,
Microsoft Excel. For multivariate analysis, the ASCII files were
combined into a MATLAB matrix (.mat) file, which was processed
by SIMCA-P+ (version 12, Umetrics AB, Umeå, Sweden).
3 Methods
3.1 Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS)
3.1.1 Preparation
Samples should be in the form of dry powders.
1. Freeze-dry the samples for 24 h and store them in a moisture-free environment until further processing.
2. Powderize the samples by ball milling using the following
procedure. Add 50 mg sample into a tube of the vibration
mill, add two 12 mm diameter stainless steel balls in each tube,
and mill at 30 Hz for 120 s (see Note 26). In addition, KBr
(see step 3 below) can be added and mixed during the milling
procedure. This will help manual grinding (see step 4 below)
and also absorb a substantial part of the generated heat.
Cleaning the vibration mill tubes is performed by washing
with water and ethanol and then blowing them dry with compressed air.
3. Samples must be mixed with KBr (see Note 27), because
undiluted plant materials absorb too much IR light. The total
weight (sample + KBr) should be 400 mg, of which the dry
sample should be 1–10 mg and KBr 399–390 mg (see Note
28). Mixing the sample with KBr can be performed prior to
ball milling (to limit the effect of generated heat and help
manual grinding; see steps 2 and 4, respectively). If KBr is
added before ball milling, 50 mg dry sample requires approximately 2 g KBr.
4. The final step of sample preparation is the manual grinding of
the mixture (sample and KBr). This should be done even if
ball milling was performed on the mixture to obtain a properly
homogenized sample with suitable particle size. This is difficult to achieve by ball milling alone without burning the sample.
Pure KBr appears crystalline, much like common table salt.
The properly ground mixture should be a fine powder
that appears like flour. This is the most laborious and time-consuming step.
5. Load the sample mixture into the sample container cup. Make
sure the surface is flat after loading and that the sample mixture is not compressed into the container cup.
6. For background measurement, prepare pure KBr in the same
way as the sample + KBr mixtures.
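The weighing arithmetic in step 3 can be checked with a few lines of code. This is only a convenience sketch with hypothetical function names; the 400 mg total, the 1–10 mg sample range, and the 50 mg sample to 2 g KBr pre-mix are the values given in the steps above.

```python
def kbr_mass_mg(sample_mg, total_mg=400.0):
    """KBr needed to bring a weighed sample up to the total mixture mass."""
    if not 0.0 < sample_mg < total_mg:
        raise ValueError("sample mass must be between 0 and the total mass")
    return total_mg - sample_mg


def sample_fraction(sample_mg, kbr_mg):
    """Mass fraction of sample in a sample + KBr mixture."""
    return sample_mg / (sample_mg + kbr_mg)


def sample_in_aliquot_mg(aliquot_mg, sample_mg, kbr_mg):
    """Sample mass contained in an aliquot of a pre-milled mixture."""
    return aliquot_mg * sample_fraction(sample_mg, kbr_mg)


# Direct mixing: 1-10 mg sample in 400 mg total.
print(kbr_mass_mg(1.0), kbr_mass_mg(10.0))

# Pre-milling route: 50 mg sample milled together with 2 g KBr.
fraction = sample_fraction(50.0, 2000.0)
print(f"sample fraction in pre-mix: {fraction:.4f}")
print(f"sample in a 400 mg aliquot: "
      f"{sample_in_aliquot_mg(400.0, 50.0, 2000.0):.2f} mg")
```

A 400 mg aliquot of the pre-milled mixture therefore contains just under 10 mg of sample, which sits at the upper end of the range recommended for direct mixing.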
3.1.2 Measurement
For the best quality spectra, band intensities should ideally be
between 0.3 and 0.8 Abs units and the signal-to-noise ratio high.
First, the background (pure KBr, see step 6 in Subheading 3.1.1
above) should be recorded, then the samples, using the same
parameters as the background (see Note 29).
1. Start OPUS and click on the “Advanced Measurement” icon.
In the “Basic” or “Advanced” tab of the dialog window, click
on “Load” and choose an already existing .XPM file for diffuse
reflectance measurements (for the first time, use a generic
XPM file that comes bundled with the instrument).
2. Set the number of scans to 128 (see Note 30) and spectral
resolution to 4 cm−1 (see Note 31). This will result in measurement times of about 2 min/sample (see Note 32) using a
standard 10 kHz scanner velocity (see step 6 below).
3. Set the spectral range to 400–4,000 cm−1 (see Note 33). Save
the sample and background interferograms by checking the
respective boxes on the “Advanced” tab during measurement
setup (see Note 34).
4. Use “Double-sided, forward–backward” acquisition mode
and automatic signal gains.
5. For Fourier transformation parameters, use a Blackman–Harris
3-term apodization function, the same frequency limits as
given for the spectral range (step 3 above), a phase resolution
of 32, Mertz-type phase correction with no peak search, and a
zero filling factor of 2 (see Note 35).
6. The Optic and Instrument Parameters should be set by default,
according to the type of instrument in use.
7. When all values are set, the experimental parameters can be
saved as an .XPM file for future measurements. This means
that only the file name needs to be given for further samples
(see Note 36).
3.1.3 Data Analysis
There will always be variation between spectra that originate from
experimental (instrument) factors. To minimize the effects of
experimental variation on the data, spectra should be standardized
before analysis by the following steps:
1. Load all spectra (see Note 37) to be standardized in OPUS.
2. Select the “AB” block of all loaded spectra and perform baseline correction (“Manipulate” menu, “Baseline correction”).
Ideally, a two-point straight baseline should be created, spanning the entire spectral region that will be used for analysis.
If the baseline is not linear, use the standard OPUS option of
64-point rubber band baseline correction (“Manipulate”
menu, “Baseline correction” option, “Select method” tab),
excluding CO2 bands (see Note 38).
3. If spectra are still not overlapping at flat baseline areas (i.e., regions
where there are no bands), an “Offset correction” (“Manipulate”
menu, “Normalization” option, “Offset correction” method)
should be applied, using a flat baseline area for “Frequency
Range.”
4. After baseline and offset corrections, normalization needs to
be performed to make spectra fully comparable. Keep in mind
that after normalization, compositional changes between samples will reflect relative differences and not absolute amounts
(see Notes 8 and 39). Using area normalization is normally a
good strategy. This means that the area under all bands in the
spectral range of analysis is set to a constant value (100 %). In
OPUS, there is no built-in area normalization option; the
closest is “Vector Normalization” (“Manipulate” menu,
“Normalization” option; see Note 40).
5. Alternatively, Min–Max normalization (“Manipulate” menu,
“Normalization” option) can be used to set the intensity of a
(reference) band in the spectrum to a constant value (see Note
41). This will scale the spectra in the absorbance axis so that
the minimum and maximum values in the frequency range will
be constant (see Note 42).
6. Standardized (baseline corrected and normalized) spectra can
be compared directly in OPUS to detect substantial changes.
However, to take full advantage of the data, multivariate
analysis is highly recommended. This requires that the spectra
are exported to a format which can be read in by the multivariate
software. For this task, OPUS offers the possibility of exporting
in Jcamp (.dx) file format as well as in standard tab or space
delimited ASCII data point table (.dpt) files (see Notes 43
and 44) (“File” menu, “Save File As” option, “Mode” tab).
Unfortunately, each file needs to be saved and exported
individually (see Note 45).
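The standardization steps above (baseline correction, offset correction, and normalization) can be sketched outside OPUS as follows. This is an illustration of the general operations under simple assumptions, not a reproduction of the OPUS implementations: the function names are hypothetical, the vector normalization shown (mean-centre, then scale to unit length) is one common definition, and the four-point spectrum is invented.

```python
import math


def two_point_baseline(x, y):
    """Subtract the straight line through the first and last data points."""
    slope = (y[-1] - y[0]) / (x[-1] - x[0])
    return [yi - (y[0] + slope * (xi - x[0])) for xi, yi in zip(x, y)]


def offset_correct(x, y, low, high):
    """Subtract the mean absorbance of a band-free region [low, high]."""
    flat = [yi for xi, yi in zip(x, y) if low <= xi <= high]
    offset = sum(flat) / len(flat)
    return [yi - offset for yi in y]


def trapezoid_area(x, y):
    """Absolute area under the curve (x may be ascending or descending)."""
    return abs(sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
                   for i in range(len(x) - 1)))


def area_normalize(x, y, total=100.0):
    """Scale the spectrum so that the area under all bands equals total."""
    area = trapezoid_area(x, y)
    return [yi * total / area for yi in y]


def vector_normalize(y):
    """Mean-centre the spectrum and scale it to unit Euclidean length."""
    mean = sum(y) / len(y)
    centered = [yi - mean for yi in y]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]


def min_max_normalize(y):
    """Scale the spectrum so that its minimum is 0 and its maximum is 1."""
    lo, hi = min(y), max(y)
    return [(yi - lo) / (hi - lo) for yi in y]


# Invented four-point spectrum on a descending wavenumber axis.
x = [1800.0, 1700.0, 1600.0, 1500.0]
y = [0.10, 0.50, 0.30, 0.16]

corrected = two_point_baseline(x, y)
by_area = area_normalize(x, corrected)
by_vector = vector_normalize(corrected)
by_min_max = min_max_normalize(corrected)
print([round(v, 3) for v in corrected])
print(round(trapezoid_area(x, by_area), 6))
```

Whichever normalization is chosen, it must be applied identically to all spectra in a comparison, and the resulting intensities describe proportions rather than absolute amounts.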
3.2 Attenuated Total Reflectance (ATR)
3.2.1 Preparation
ATR measurements require virtually no sample preparation:
1. Select the appropriate IRE (see Note 46).
2. Make sure there is good contact between the sample and the
crystal. For solids, apply the maximum feasible pressure that
will not damage the sample or the IRE.
3.2.2 Measurement
Most of the measurement parameters are identical to those
described for DRIFTS (Subheading 3.1.2) and will not be detailed
here. However, the signal in ATR measurements is often weaker,
which necessitates longer measurement times (higher number of
scans) than DRIFTS. Another difference is that the background
measurement is done on either the empty IRE, or in case of solutions, using the solvent as background (see Note 47).
1. Start OPUS and click on the “Advanced Measurement” icon.
In the “Basic” or “Advanced” tab of the dialog window, click
on “Load” and choose an already existing .XPM file for attenuated total reflectance measurements (for the first time, use a
generic XPM file that comes bundled with the instrument).
2. Record the background, with the same parameters as for the
sample.
3. Record the sample, using the same initial parameters as in the
case of DRIFTS measurements (Subheading 3.1.2). Save the
sample and background interferograms by checking the
respective boxes on the “Advanced” tab during measurement
setup (see Note 35).
4. The only difference from DRIFTS setup should be under the
“Optic and Instrument Parameters” tab, which should be set
by default to the correct ATR settings matching the type of
instrument in use.
5. If the signal-to-noise ratio is low, increase the number of scans
(see Notes 31 and 32). When overall absorbance values are
too low (i.e., most bands having intensity values below 0.3
Abs unit), the sample should be “concentrated.” For solutions,
this can be achieved by either increasing the concentration or
by repeatedly depositing the sample on the IRE and evaporating
the solvent (see Note 48). For solids, the only way to improve
signal strength is by a better contact with the IRE, which is
achieved by applying higher pressure.
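When deciding how far to increase the number of scans, a useful rule of thumb from signal averaging is that, for random uncorrelated noise, the signal-to-noise ratio grows with the square root of the number of co-added scans. The sketch below applies that rule and scales the measurement time linearly from the DRIFTS reference point given earlier (128 scans in about 2 min at 4 cm−1 resolution). The function names are hypothetical, and real instruments may deviate from both assumptions.

```python
import math


def scans_for_snr_gain(current_scans, gain):
    """Scans needed to improve signal-to-noise by the factor `gain`.

    Assumes random, uncorrelated noise, so that S/N scales with sqrt(scans).
    """
    if current_scans < 1 or gain <= 0.0:
        raise ValueError("current_scans must be >= 1 and gain must be > 0")
    return math.ceil(current_scans * gain ** 2)


def estimated_minutes(scans, ref_scans=128, ref_minutes=2.0):
    """Measurement time, scaled linearly from a reference measurement."""
    return scans * ref_minutes / ref_scans


# Doubling and tripling the signal-to-noise ratio of a 128-scan spectrum.
for gain in (2.0, 3.0):
    needed = scans_for_snr_gain(128, gain)
    print(f"S/N x{gain:.0f}: {needed} scans, "
          f"about {estimated_minutes(needed):.0f} min per spectrum")
```

Because the cost in time grows with the square of the desired improvement, improving the contact with the IRE or concentrating the sample is usually worth trying before resorting to very long acquisitions.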
3.2.3 Data Analysis
Data analysis is done in the same way as in the case of DRIFTS;
follow steps 1–6 in Subheading 3.1.3.
3.3 Microspectroscopy Using a Single Element (SE) Detector
3.3.1 Preparation
When plant sections are analyzed, care must be taken to produce
sections that are non-scattering (staying flat on the carrier) and
thin enough for the infrared light to pass through but thick enough
for the anatomical features to remain intact (no collapsing, folding,
tearing, or cracking). The highest-quality data are obtained in
transmission mode [16], but this mode also requires the most challenging sample preparation, as Arabidopsis sections are very fragile.
1. Sections should be about 10–20 μm thick (see Note 49). They
can be obtained from frozen material with a cryotome
(see Note 50), from fresh material with a vibratome (see Note
51), or from paraffin-embedded material with a microtome
(see Note 52). The optimal thickness value depends on the
material at hand (cell density, wall thickness, and other physical and optical properties). From the spectroscopic point of
view, sections are ideal when the spectra recorded contain
absorbance values between 0.3 and 0.8 and high signal-to-noise ratios (see Note 53).
2. Mount the sections onto carriers (see Note 54): standard
microscopy glass slides (see Note 23), infrared transparent
windows (see Note 24), or mirrors (see Note 25). To keep the
sections flat, they can be sandwiched between two carriers.
3. Place the mounted sections into desiccators for drying for at
least 48 h (see Note 55).
4. Sections dried on standard microscopy glass slides need to be
transferred to infrared transparent windows (see Note 24) or
mirrors (see Note 25) prior to measurements. Transfer is done
by scraping the sections off the glass slide with a razor
blade and gently placing them onto the new carrier.
3.3.2 Measurement
The numerical values listed below are suggested initial values that
normally provide good results for most types of Arabidopsis sections. However, fine-tuning for individual samples and instruments
is always necessary. We provide two different methods for measurements. Method 1 allows for the definition of four sample positions
at a time (see Note 56) and records each as individual files.
Alternatively, Method 2 allows many measurement positions to be
defined and recorded consecutively within the same measurement.
In this case, all positions will be recorded and kept in the same file,
as individual blocks (see Note 57). The major advantage of Method
2 is speed and automation, as the user does not have to manually
set up and start each measurement individually. The drawbacks are
as follows: (a) the same background will be used for all positions.
In case of problems with that background (water vapor, vibrations,
etc.), all measurements will be affected, and this will only be obvious after all measurements are done; (b) all positions will be
recorded with the same parameters, and there is no possibility of
FT-IR Spectroscopic Techniques
fine-tuning parameters for each sample position separately; (c) all
positions must be in the same focus; and (d) the same apertures
will be used for all positions.
Method 1:
1. Cool the detector by filling it with liquid N2 until the red indicator light is switched off.
2. Place the sample mounted on the appropriate carrier
(Subheading 3.3.1, step 4) onto the sample tray (see Note 58).
3. Select transmission or reflection mode on the foot of the
microscope accessory. Make sure you are using visible and not
infrared light, and adjust brightness and focus if needed.
4. In OPUS, change detector (select beam path) to single
element.
5. In the “Measure” menu, select “Video Assisted Measurement.”
6. In the “Basic” or “Advanced” tab of the dialog window, click
on “Load” and choose an already existing .XPM file for single
element transmission or reflectance measurements (for the
first time, use a generic XPM file that comes bundled with the
instrument).
7. In the “Advanced” tab, specify filename and path. Set the
number of scans between 32 and 512 for both background
and sample and spectral resolution to 4 cm−1 (see Notes 31
and 59). For the “save data” parameters, give an upper limit
of 4,000 cm−1 (see Note 34) and a lower limit depending on
the cutoff edge of the carrier (400 cm−1—no cutoff—for
mirrors in reflectance mode, 550 cm−1 for ZnSe, 850 cm−1 for
BaF2, and 1,050 cm−1 for CaF2). Save all data blocks (see Note
35) and choose the resulting spectrum as Absorbance.
8. In the “Optic” and “Acquisition” tabs, keep the default
parameters, as these are instrument specific (gains and scanner
velocities, high- and low-pass filters, etc.). The “Double-sided,
forward–backward” acquisition mode should be selected by
default.
9. In the “Fourier transform” tab, select the “Blackman–Harris
3-term” apodization function, the “Power/No Peak Search”
phase correction mode, and a zero filling factor of 2. Phase
resolution should be left at default (16 or 32).
10. In the “XY stage” tab, make sure the joystick is activated.
Calibration of the stage may be necessary if the stage does not
move to positions during measurements.
11. In the “Check Signal” tab, make sure there is an interferogram
with acceptable counts, depending on the instrument condition and apertures applied (at least 8,000 counts on an empty
spot on the carrier, i.e., no sample).
12. Save the XPM file under a descriptive name (“Advanced” tab,
“Save” button).
13. In the “Basic” tab, select “Start Video Assisted Measurement.”
This will close the dialog window and open the measurement
workspace, dominated by the Live Video Pane in the center.
In the Live Video Pane, the red rectangle outlines the CCD
area, while the green square with the crosshair shows the
actual measurement area (see Note 60).
14. Move the tray with the joystick to an empty spot on the carrier.
Right-click anywhere inside the green square in the Live Video
Pane and select “Defining Positions…” then “Background
Position” in the opening contextual menu (see Note 61).
15. Move the tray with the joystick to the first sample position.
Right-click anywhere inside the green square in the Live Video
Pane and select “Defining Positions…” then “Load Position
1” in the opening contextual menu (see Note 57).
16. Set the apertures if necessary. Adjust focus and light intensity
if necessary. Save the visible image by right-clicking anywhere
inside the green square in the Live Video Pane and selecting
“Video Image…” then “Snapshot” in the opening contextual
menu (see Notes 62–64).
17. Move to the Background Position (see Note 65), adjust
focus if necessary, and change from visible to infrared light
(see Note 66).
18. Right-click anywhere inside the green square in the Live Video
Pane and select “Starting Measurement…” then “Collect
Background at Current Position” in the opening contextual
menu (see Note 67).
19. When the background measurement is finished, change to
visible light and move to the predefined sample position (Load
Position 1). Adjust focus if necessary and change to infrared
light (see Notes 66 and 67).
20. Right-click anywhere inside the green square in the Live Video
Pane and select “Starting Measurement…” then “Measure
Current Position” in the opening contextual menu.
21. When the measurement is finished, change the light to visible
(see Note 68) and close the measurement workspace to save
the file (see Note 69).
22. If more than one sample position has been defined in step 15
(see Note 57), start a new “Video Assisted Measurement”
(in the “Measure” menu of OPUS), use the same XPM file as
before (only changing the file name in the “Advanced” tab),
start the measurement workspace (step 13), move to the next
sample position to be measured (see Note 66), and repeat
from step 16 (see Note 30).
Method 2:
Steps 1–14 are identical to Method 1.
23. Adjust focus and light intensity. Create an overview image by
right-clicking anywhere inside the green square in the Live
Video Pane and selecting “Video Image…” then “Set + Scan
Overview Image Area” in the opening contextual menu
(see Note 65).
24. Right-click in the Live Video Pane and select “Measurement
Spots/Grid…” and “Mark Measurement Positions.” The cursor changes, and left-clicking will mark a position by placing a
“+M” sign on the image.
25. Move around the sample using the joystick and mark all
positions to be measured (see Notes 70 and 71).
26. Move to the Background Position (see Note 66) and change
from visible to infrared light (see Note 67).
27. Right-click anywhere inside the green square in the Live Video
Pane and select “Starting Measurement…” then “Collect
Background at Current Position” in the opening contextual
menu (see Note 68).
28. When the background measurement is finished, right-click
anywhere inside the green square in the Live Video Pane and
select “Starting Measurement…” then “Measure Marked
Positions” in the opening contextual menu. This will measure
all positions, in the order they were marked. This order will be
the order of the data blocks as well (see Note 58). The already
measured positions have a checkmark sign to differentiate
them from those yet to be measured.
29. When the measurements are finished, change the light to
visible (see Note 69) and close the measurement workspace to
save the file (see Note 70).
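The “save data” limits in step 7 of Method 1 depend only on the carrier, so they can be kept in a small lookup table when planning measurements. The following is a minimal Python sketch, not part of OPUS: the names `LOWER_LIMIT_CM1`, `save_data_range`, and `scans_in_suggested_range` are hypothetical helpers, and the numerical values are those listed in step 7.

```python
# Lower "save data" limits (in cm-1) per carrier, as listed in step 7 of Method 1.
# All names here are illustrative assumptions; they are not OPUS parameters.
UPPER_LIMIT_CM1 = 4000

LOWER_LIMIT_CM1 = {
    "mirror": 400,   # reflectance mode, no cutoff
    "ZnSe": 550,
    "BaF2": 850,
    "CaF2": 1050,
}


def save_data_range(carrier):
    """Return the (upper, lower) wavenumber limits in cm-1 for a carrier."""
    if carrier not in LOWER_LIMIT_CM1:
        raise ValueError(f"unknown carrier: {carrier!r}")
    return UPPER_LIMIT_CM1, LOWER_LIMIT_CM1[carrier]


def scans_in_suggested_range(n_scans):
    """Check a scan count against the suggested starting range of 32-512."""
    return 32 <= n_scans <= 512
```

For example, `save_data_range("BaF2")` gives the limits to enter for a BaF2 window. The values are starting points only and still need fine-tuning for the individual sample and instrument.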
3.3.3 Data Analysis
Measurements using the single element detector result in 3D files,
which contain one (Method 1 in Subheading 3.3.2) or more spectra (Method 2 in Subheading 3.3.2). To extract these spectra, open
the 3D file in OPUS and follow the steps below (see Note 72):
1. Select the “AB” block of the 3D file in OPUS.
2. In the “Measure” menu, select the “Extract data” option.
3. In the dialog window that opens, specify a filename and path
for the spectrum to be extracted (see Note 73) in the “Select
Files” tab.
4. In the “Extension Range” tab, select from “Beginning of file”
to “End of file.”
5. In the “Extraction Mode” tab, select “Series of single blocks”
(see Note 74) to be stored, the “Increment name” option
under “If file already exists,” and the “Load” option under
“Extracted files” (see Note 75).
Finally, the extracted spectra are analyzed exactly the same way
as spectra recorded by DRIFTS or ATR methods: follow steps 1–6
in Subheading 3.1.3.
3.4 Microspectroscopy Using a Focal Plane Array (FPA) Detector
3.4.1 Preparation
Samples are prepared in the same way as for microspectroscopy
using a single element detector (Subheading 3.3.1).
3.4.2 Measurement
The parameters listed below are initial values that will provide
good results for most types of Arabidopsis sections. However, it is
necessary to fine-tune them for individual samples and instruments
to achieve the best possible spectrum quality:
1. Cool the detector by filling it with liquid N2 (see Note 76).
2. Switch on the FPA detector.
3. Place the sample mounted on the appropriate carrier
(Subheading 3.3.1, step 4) onto the sample tray (see Note 59).
4. Select transmission or reflection mode on the foot of the
microscope accessory. Make sure you are using visible and not
infrared light, and adjust brightness and focus if needed.
5. Start OPUS and change detector (select beam path) to FPA.
6. In the “Measure” menu, select “Continuous Scan FPA
Measurement.”
7. In the “Basic” or “Advanced” tab of the dialog window, click
on “Load” and choose an already existing .XPM file for FPA
transmission or reflectance measurements (for the first time, use
a generic XPM file that comes bundled with the instrument).
8. In the “Advanced” tab, specify filename and path. Set the
number of scans to 32 for both background and sample, and
set the spectral resolution to 4 cm−1 (see Notes 31 and 60).
For the “save data” parameters, give the upper limit of
4,000 cm−1 (see Note 34) and the lower limit of 900 cm−1
(or 1,050 cm−1 if a CaF2 window is used as the carrier,
see Notes 24 and 77). Save all data blocks (see Note 78) and
choose the resulting spectrum as Absorbance.
9. In the “Optic” and “Acquisition” tabs, keep the default
parameters, as these are instrument specific (gains and scanner
velocities (see Note 79), high- and low-pass filters, etc.). The
“Double-sided, forward–backward” acquisition mode should
be selected by default.
10. In the “Fourier transform” tab, select the “Blackman–Harris
3-term” apodization function, the “Power/No Peak Search”
phase correction mode, and a zero filling factor of 2. The
phase resolution should be left at the default value (16 or 32).
11. In the “XY stage” tab, make sure the joystick is activated.
Calibration of the stage may be necessary if the stage does not
move to positions during measurements. However, this cannot
be done once the measurement has been started (step 14).
12. The “Check Signal” tab has a unique display that is specific for
the FPA detector. Instead of an interferogram, it contains a
scatter plot. The dots represent the maximum intensity count
of the interferogram at each pixel. In addition, several parameters for the FPA setup can be accessed here. Keep the default
values for frame rate and integration. Click on the “Diagnostics”
button for an exact readout of the “FPA temperature” (listed
in the bottom row of parameters).
13. Save the XPM file under a descriptive name (“Advanced” tab,
“Save” button).
14. In the “Basic” tab, select “Start Video Assisted Continuous
Scan FPA Measurement.” This will close the dialog window and
open the measurement workspace, which is dominated by the
Live Video Pane in the upper part and the Live FPA Image Pane in
the lower part. In the Live Video Pane, the red rectangle outlines the CCD area, while the green square with the crosshair
shows the actual FPA measurement area (see Note 61).
15. Move the tray with the joystick to an empty spot on the carrier.
Right-click anywhere inside the green square in the Live
Video Pane and select “Defining Positions…” then select
“Background Position” in the opening contextual menu
(see Note 62).
16. Move the tray with the joystick to the first sample position.
Right-click anywhere inside the green square in the Live Video
Pane and select “Defining Positions…” then “Load Position
1” in the opening contextual menu (see Note 57).
17. Adjust focus and light intensity if necessary (see Note 80).
Save the visible image by right-clicking anywhere inside the
green square in the Live Video Pane and selecting “Video
Image…” then “Snapshot” in the opening contextual menu
(see Notes 63–65).
18. While still at the sample position, change from visible to infrared light (see Note 67). Note that the Live FPA Image updates.
Ideally, anatomical sample features should be recognizable.
19. Right-click on the Live FPA Image and select “Setup FPA
Detector…” then “Show Control Panel” in the opening contextual menu (see Note 81). Set the gain to 1 or 2 (see Note
82). Adjust the value for the offset so that all dots in the
scatter plot fall between ca. 4,000 and 12,000 counts, i.e.,
between ¼ and ¾ of the total intensity range scale. Ideally, no
pixels should have 0 (minimum) or 16,383 (maximum) readouts (see Note 83). Close the dialog window by clicking “OK.”
20. If using transmission mode, adjust the condenser (i.e., focus
the infrared light; see Note 84) so that the Live FPA Image
shows maximum homogeneous illumination for as much of
the entire FPA area as possible (i.e., no side should be darker
than the others).
21. While still in infrared mode, move to the Background Position
(see Note 66), adjust focus and condenser (see Note 85) if
necessary. Check the Live FPA Image to make sure that the
gain and offset are correctly set, and adjust if necessary as
described in step 19.
22. Right-click on the Live FPA Image, select “Measurement”
then “Start Measurement” and “Background.” Click on the
“Start Measurement” button of the new window that appears
(see Note 85).
23. Wait until the measurement is done (see Note 86), then move
to a predefined sample position (see Note 66), and readjust
the focus and condenser (see Note 85) if necessary. Check the
Live FPA Image to make sure that gain and offset are correct,
and adjust if necessary as described in step 19.
24. Right-click on the Live FPA Image, select “Measurement” then
“Start Measurement” and “Sample.” Click on the “Start
Measurement” button of the window that appears (see Note 86).
25. When the measurement is done (see Note 87) change to
visible light (see Note 69) and close the measurement workspace to save the file (see Note 70).
26. If more than one sample position has been defined in step 16
(see Note 57), start a new “Continuous Scan FPA
Measurement” (in the “Measure” menu of OPUS),
use the same XPM file as before (only changing the file name
in the “Advanced” tab) to start the next measurement, move
to the next sample position to be measured (see Note 66), and
repeat from step 23 (see Note 30).
Marking many different positions and measuring them all as in
Method 2 for the SE detector (Subheading 3.3.2) is not possible
when using the FPA detector.
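The offset check in step 19 amounts to a simple range test on the per-pixel maximum interferogram counts. The sketch below illustrates that test in Python under stated assumptions: `counts` is a hypothetical list of per-pixel values (the protocol only describes reading them from the scatter plot, not exporting them), and the function name is illustrative. The thresholds are the ones quoted in step 19.

```python
# Thresholds quoted in step 19; the input list and function name are assumptions.
ADC_MIN = 0
ADC_MAX = 16383        # maximum readout
TARGET_LOW = 4000      # ca. 1/4 of the intensity range
TARGET_HIGH = 12000    # ca. 3/4 of the intensity range


def check_fpa_counts(counts):
    """Summarize per-pixel maximum counts against the step 19 targets."""
    # Pixels stuck at the minimum or maximum readout.
    saturated = sum(1 for c in counts if c <= ADC_MIN or c >= ADC_MAX)
    # Pixels outside the recommended window (includes the saturated ones).
    outside = sum(1 for c in counts if not TARGET_LOW <= c <= TARGET_HIGH)
    return {
        "pixels": len(counts),
        "saturated": saturated,
        "outside_target": outside,
        "ok": outside == 0,
    }
```

If `ok` is false, the gain and offset would be readjusted as described in step 19 before collecting the background.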
3.4.3 Data Analysis
Measurements using the FPA detector result in 3D files, which
contain all spectra in the order of their pixel number (called
“blocks”) in the image (see Note 87). To extract these spectra,
open the 3D file in OPUS and follow the steps below (see Note
73) for Method 1 (extracting all spectra from an image) or Method
2 (extracting spectra from selected pixels only):
Method 1, extracting all spectra
1. Open the 3D file in OPUS and double-click on the “AB”
block. This will bring up the 3D window view.
2. In the “Evaluate” menu, choose “Integration.” If you have a
predefined integration method (e.g., “Arabidopsis Overview”;
see step 3), load it by clicking on the “Load Integration
Method,” click “Integrate,” and proceed to step 4. If not,
click on “Setup Method,” to define an integration method.
3. In the dialog window, choose the integral type “B,” for “Left
edge” set the value to 1,614 and for “Right edge” to 1,572,
and give the following Label: “lignin1595.” Click on the “>>”
button to define the next band: Type B, Left edge 1,520,
Right edge 1,485, Label “lignin 1510.” Click on the “>>”
button once again to define the next band: Type B, Left edge
1,755, Right edge 1,733, Label “-C = O 1740.” Click on the
“>>” button for the last time and define the next band: Type
B, Left edge 1,180, Right edge 950, Label “carbohydrates.”
Click on the “Store Method” button and save the integration
method under the name “Arabidopsis Overview.” Click “Exit”
to close the dialog window, return to the integration window
and click “Integrate.”
4. The integration produces a “TRC” data block, which serves
only as a means for visualization. No quantitative analysis should
be based on the produced heat map (see Note 88).
5. In the “Window” menu, select “New Registered Window…”
and choose “Map + Vid + Spec” in the drop-down menu.
Clicking “OK” brings up a new 3D view split into three panes;
the two upper panes can be used to show visible images and
infrared maps (Image Panes), while the bottom pane shows
infrared spectra at the selected pixels (Spectrum Pane). Drag
the “TRC” block into this new 3D view (see Note 89).
6. Right-click on the right Image pane (see Note 90), select “XZ
Plot” and then choose “Properties.” In the dialog window
that opens, select “Show surface” and “Video Image” on the
“3D Properties” tab. On the “Mapping” tab, choose “-C = O
1740” (see Note 91) from the “Select trace” drop-down menu
and choose the correct visible image (if more than one snapshot was taken in step 17 in Subheading 3.4.2) in the “Select
image” drop-down menu. In the “Selection” tab, choose “X”
and “Z” from the “Show” drop-down lists. In the “Contour”
tab, select “Rainbow,” uncheck the “No colour splitting for
negative values” box, choose the lowest possible number from
the “Contours” drop-down menu (see Note 92) and choose
“Contour lines and colors” from the “Method” drop-down
menu. Click “Apply” and “OK” (see Notes 93 and 94).
7. Repeat the entire step 6 on the left Image pane, but choose
“No contours” instead of “Contour lines and colors” from the
“Method” drop-down menu. Click “Apply” and “OK” (see
Note 95).
8. The red and green crosshair marks the position from which
the spectrum is shown in the bottom Spectrum pane. The
Spectrum pane also lists the pixel number (after “Index”) of
the spectrum (see Note 88) and has controls for moving the
crosshair (see Note 96). Move around the image to make sure
spectra look reasonable and note the positions of bad pixels, if
any. These should be excluded from further data analysis.
9. Right-click on any of the Image panes and select “Extract
Spectra.”
10. In the dialog window that opens, specify a filename and path
for the spectrum to be extracted (see Note 74) in the “Select
Files” tab.
11. In the “Extension Range” tab, select from “Beginning of file”
to “End of file.”
12. In the “Extraction Mode” tab, select “Series of single blocks”
(see Note 97) to be stored, the “Increment name” option under
“If file already exists,” and the “Do not load” option
under “Extracted files” (see Note 98).
Finally, the extracted spectra are treated the same way as spectra recorded by DRIFTS or ATR methods: follow steps 1–6 in
Subheading 3.1.3.
Method 2, extracting only selected spectra
13. Follow steps 1–7 of Method 1 above.
14. Move to the pixel from which the spectrum should be extracted
(see Notes 97 and 99).
15. Right-click on any of the Image panes and select “Extract
Spectra.”
16. In the dialog window that opens, specify a filename and path
for the spectrum to be extracted (see Note 74) in the “Select
Files” tab. Use the pixel number (“Index” value in the
Spectrum pane, see Note 88) in the filename for easy identification of the origin of the spectra.
17. In the “Extension Range” tab, select from “Block” and give
the pixel number of the spectrum to be extracted (the default
value is the pixel of the spectrum shown in the Spectrum pane,
i.e., the one at the crosshair position).
18. In the “Extraction Mode” tab, select “First block only”
(see Note 100) to be stored, the “Increment name” option
under “If file already exists,” and the “Load” option under
“Extracted files” (see Note 76).
19. Repeat steps 14–18 for extracting additional spectra from the
same image.
20. Repeat steps 13–19 for additional spectra from another
image.
Finally, the extracted spectra are treated the same way as spectra recorded by DRIFTS or ATR methods: follow steps 1–6 in
Subheading 3.1.3.
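For extracted spectra, a band integral between two edges (such as the bands defined in step 3 of Method 1) can also be approximated outside OPUS. The Python sketch below computes the area of a band above a straight baseline drawn between its two edge points. This is one common way to integrate a band; that it matches the OPUS integral type “B” exactly is an assumption here, and the function name and input lists are illustrative.

```python
def band_area(wavenumbers, absorbances, edge_a, edge_b):
    """Area of a band above a straight baseline joining its two edges.

    Assumptions: wavenumbers (cm-1) and absorbances are equally long lists,
    and the edges may be given in either order.
    """
    low, high = sorted((edge_a, edge_b))
    # Keep only the points inside the band, ordered by wavenumber.
    pts = sorted((w, a) for w, a in zip(wavenumbers, absorbances)
                 if low <= w <= high)
    if len(pts) < 2:
        raise ValueError("need at least two points inside the band")
    (w0, a0), (w1, a1) = pts[0], pts[-1]
    slope = (a1 - a0) / (w1 - w0)
    area = 0.0
    for (wl, al), (wr, ar) in zip(pts, pts[1:]):
        # Heights above the baseline at both ends of each trapezoid.
        hl = al - (a0 + slope * (wl - w0))
        hr = ar - (a0 + slope * (wr - w0))
        area += 0.5 * (hl + hr) * (wr - wl)
    return area
```

For instance, `band_area(w, a, 1614, 1572)` would cover the “lignin1595” band. As with the heat map, such areas are best used for comparison between spectra treated in the same way.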
4 Notes
1. Spectra can be plotted as transmittance (T %) or absorbance
(Abs). T = I/I0, while Abs = log10(1/T). While Abs spectra are
more common, early works used T % more frequently
(particularly before the Fourier transform revolution). In
addition, older publications use wavelength on the x-axis
(and µm as units), and not wavenumbers (with cm−1 units)
as is common nowadays. The conversion can be easily done
by the following formula: ν = 10,000/λ, where ν is the
wavenumber (in cm−1) and λ is the wavelength (in µm).
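The two conversions in Note 1 can be written as short Python functions, which is convenient when comparing values from older publications. This is a minimal sketch; the function names are illustrative, and transmittance is assumed to be given as a fraction (T = I/I0), not in percent.

```python
import math


def absorbance_from_transmittance(t):
    """Abs = log10(1/T), with T as a fraction (0 < T <= 1)."""
    if t <= 0:
        raise ValueError("transmittance must be positive")
    return math.log10(1.0 / t)


def wavenumber_from_wavelength(wavelength_um):
    """Convert wavelength in micrometers to wavenumber in cm-1."""
    return 10000.0 / wavelength_um
```

A transmittance of 10 % (T = 0.1) corresponds to 1 Abs unit, and a wavelength of 10 µm corresponds to 1,000 cm−1.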
2. Often these charts do not only display the wavenumber range
in which a functional group can produce a band, but also
show information about band shape (narrow, broad, shoulder) and intensity (weak, medium, strong).
3. The effect of bond strength on the characteristic frequencies
can be illustrated by the positional change of the intensive
–C = O stretching vibration in formaldehyde (H2C = O, ca.
1,746 cm−1) and acetone ((CH3)2C = O, ca. 1,731 cm−1). The
effect of the change in atomic mass on characteristic frequencies can be even larger (several hundreds of cm−1 when
exchanging hydrogen with deuterium, for instance).
Therefore, isotope exchange can be used to identify the origin of a band or to shift a band to a different position to
avoid overlaps. If a sample is repeatedly washed with D2O,
the accessible Hs will be exchanged to Ds. Thus, if a –C–O
band originates from an alcohol, it will shift considerably
upon deuteration (–C–OH to –C–OD change), as opposed
to a –C–O band of an ether or ester.
4. Only molecules with a dipole moment (permanent or
induced) produce infrared active vibrations. Thus, pure
diatomic gases (N2, O2, etc.) are infrared silent, and the major
atmospheric disturbance in infrared spectra is caused by H2O
and CO2.
5. Normally, water would also be present, but it produces very
intense bands in FT-IR spectroscopy that can obscure
important parts of the spectrum (e.g., around 1,600 cm−1
where characteristic lignin and protein bands are situated).
Thus, water must be removed prior to analysis by freeze-drying or desiccation. The only general exception to this
rule is the ATR technique, which can handle wet samples
(see Subheading 1.2).
6. The most notable exception is perhaps the aromatic –C = C–
functionality present in lignins and monolignols, giving rise
to bands around 1,510 and 1,595 cm−1 [7, 17]. These positions are seldom obscured by other bands (although absorbed
water and proteins that give rise to bands at around
1,650 cm−1 can mask part of the 1,595 cm−1 band), and lignins/monolignols are often the only aromatic compounds in
large enough quantities to produce significant –C = C– bands
in the spectrum.
7. The Bouguer–Beer–Lambert law can be written as follows:
$T = I/I_0 = 10^{-\varepsilon l c}$, where ε is the molar absorptivity coefficient,
l is the path length of the light in the sample, and c is the
concentration of the absorbing material. Since
$\mathrm{Abs} = \log_{10}(1/T) = \varepsilon l c$, it means that the absorbance (and not
the transmittance) is linearly correlated to the concentration.
In addition, due to the differences in ε, direct comparison
between band intensities is difficult. For instance, an intensity
of 0.25 Abs unit for the –C = C– and 0.5 Abs unit for the
–C–O–C– band does not mean that there are twice as many
–C–O–C– as –C = C– functionalities in the sample. When
monitoring the same band, however, the intensity change
can be used for determining concentration changes after
calibration.
8. For a simple theoretical example consider the following
1.5 mg samples: Sample A: 0.5 mg cellulose, 0.5 mg lignin,
0.5 mg other; Sample B: 0.25 mg cellulose, 0.5 mg lignin,
0.75 mg other; Sample C: 0.5 mg cellulose, 1.0 mg lignin;
Sample D: 0.3 mg cellulose, 0.6 mg lignin, 0.6 mg other.
After normalization, samples B, C, and D will all show a substantially increased lignin to cellulose ratio when compared
to Sample A. However, from the normalized spectra alone, it
is impossible to determine whether the lignin to cellulose
ratio increased because the cellulose content decreased
(Sample B) or because the lignin content increased (Sample
C), or both (Sample D).
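The arithmetic behind this example can be checked with a few lines of Python. The sketch below uses only the masses given in Note 8 (with 0 mg “other” for Sample C); the variable and function names are illustrative.

```python
# Composition of the four theoretical 1.5 mg samples from Note 8 (in mg).
samples = {
    "A": {"cellulose": 0.5, "lignin": 0.5, "other": 0.5},
    "B": {"cellulose": 0.25, "lignin": 0.5, "other": 0.75},
    "C": {"cellulose": 0.5, "lignin": 1.0, "other": 0.0},
    "D": {"cellulose": 0.3, "lignin": 0.6, "other": 0.6},
}


def lignin_to_cellulose(sample):
    """Lignin to cellulose mass ratio of one sample."""
    return sample["lignin"] / sample["cellulose"]


# Total mass per sample and the ratio that survives normalization.
totals = {name: sum(s.values()) for name, s in samples.items()}
ratios = {name: lignin_to_cellulose(s) for name, s in samples.items()}
```

Samples B, C, and D all have a ratio of 2 (compared to 1 for Sample A), although their absolute lignin and cellulose contents differ. This is the information that is lost after normalization.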
9. To include sample variation during the direct comparison of
average spectra, standard deviations would have to be shown
for all intensities at each wavenumber. Otherwise the significance of the differences between average spectra is impossible
to estimate.
10. In transmission mode the infrared light passes through the
sample, much like the visible light does in standard microscopy. Therefore, this mode is most similar to visible microscopy.
However, standard microscopy glass slides cannot be used for
infrared microscopy as they absorb infrared light. Instead,
samples should be mounted on infrared transparent windows
(e.g., BaF2, CaF2, ZnSe, NaCl) (Subheading 3.3.1). In addition, the refraction caused by the window results in a focal
shift of the infrared light [16]. This focus shift should be
compensated for by a condenser (Subheading 3.3.2).
11. In reflection mode, the sample is mounted on a carrier with
highly reflective surface (most commonly gold or aluminum
mirrors) (Subheading 3.3.1). The infrared light first passes
through the sample before it is reflected by the mirror and
passes through the sample in the reverse direction before
reaching the detector. Since the light passes through the
sample twice, the effective sample thickness (path length) is doubled.
12. In ATR mode the sample is seen and measured via an ATR
objective, instead of the standard objective of the infrared
microscope. As such, the measurements are essentially
surface-specific ATR measurements (see Subheading 1.2) of
selected sample areas.
13. Too small an aperture size results in spectral distortions
and decreased signal-to-noise ratio due to diffraction and
limitation in light intensity. The 50 µm × 50 µm area is given
as a safe limit using an average infrared microscope setup
with a conventional source and a 10–20 µm thick sample section of average quality. However, the exact size of the smallest applicable aperture is determined by the physical,
chemical, and optical properties of the sample as well as by
the infrared source. With a synchrotron source, high-intensity
infrared radiation can be focused on very small areas, providing the best available spatial resolution of all infrared microspectroscopic techniques (in the µm range).
14. Arrays can be square (16 × 16, 32 × 32, 64 × 64 or 128 × 128)
or linear (1 × 16, 1 × 32, etc.). The size of the array determines the number of spectra recorded simultaneously (4,096
for a 64 × 64 array) and also the area that can be recorded in
a single image, since the size of an individual detector element is fixed.
15. Older generation FPA detectors had larger detector element
sizes and consequently larger view fields. Typically these
detectors had an individual element size of ca. 4.5 µm × 4.5 µm,
resulting in a view field of 285 µm × 285 µm for a 64 × 64
FPA. However, even for this detector element size the spatial
resolution is diffraction limited (see Note 16) and not detector element size limited.
16. The spatial resolution (Δx) is diffraction limited [8] and can
be calculated by the formula: Δx ≥ 0.61λ/NA, where λ means
the wavelength of the light and NA is the numerical aperture
of the objective. The useful spectral range using an FPA
detector on plant samples is ca. 2,000–950 cm−1. This gives a
spatial resolution of ca. 10–20 µm, assuming NA = 0.3 for a
20× Cassegrain-type objective. In practical terms, it means
that the spatial resolution is ca. 13 µm for the aromatic
–C = C– vibration (lignins, monolignols) located at
1,600 cm−1, but it is only ca. 20 µm for the carbohydrate
bands (from cellulose, hemicelluloses, pectins, starch, etc.)
located between 1,000 and 1,100 cm−1.
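The values quoted in Note 16 follow directly from the formula once the wavenumber is converted to wavelength (λ in µm = 10,000/ν). The following Python sketch reproduces them; the function name is illustrative, and the default NA = 0.3 is the value assumed in the note.

```python
def diffraction_limit_um(wavenumber_cm1, numerical_aperture=0.3):
    """Diffraction-limited spatial resolution (in micrometers).

    Uses delta_x = 0.61 * wavelength / NA, with the wavelength in
    micrometers obtained as 10,000 / wavenumber (cm-1).
    """
    wavelength_um = 10000.0 / wavenumber_cm1
    return 0.61 * wavelength_um / numerical_aperture


# Band positions discussed in Note 16.
for position in (2000, 1600, 1100, 1000, 950):
    print(position, round(diffraction_limit_um(position), 1))
```

At 1,600 cm−1 the function gives about 12.7 µm, and at 1,000 cm−1 about 20.3 µm, in line with the ca. 13 µm and ca. 20 µm stated in the note.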
17. Originally SE detectors had the advantage of being considerably faster than FPA detectors. However, with the arrival of
the latest generation of FPA detectors with updated electronics (including data communication channels), this is no longer the case.
18. The scanner velocity for FPA measurements lies in the range
where the frequency of everyday vibrations (such as from
walking, printing, computers) can easily disturb and create
resonance patterns (fringes) in the spectra. Thus, it is important to provide a vibration-free environment for FPA measurements by using a vibration-proof table in a nonresonant
room (e.g., basement instead of high floors).
19. Software problems can mean any process that takes priority
on the PC, ranging from automatic updates of the operating
system to triggered scans of antivirus software. These take
resources from the PC, which interrupts data streaming from
the FPA detector. In addition, due to the complexity of the
FPA detector and the extremely high flow of data, freezes and
bugs are more frequent than for SE measurements.
20. A generic FT-IR spectrometer with low space and maintenance requirements can be purchased at a relatively low cost.
For high-quality data, however, a vacuum-bench in a
thermostated and vibration-proof environment is recommended. The basic setup for microspectroscopy includes a
microscopy accessory equipped with a single element (SE)
detector and knife-edge apertures. This generally allows for a
single spectrum to be collected from an area of at least
50 × 50 µm with traditional infrared sources. A focal plane
array detector (FPA) increases the cost (purchase, running,
maintenance, and service) and the computing power required
but will provide the highest (diffraction limited, see Note 16)
spatial resolution available with traditional sources at the highest speed by recording thousands of spectra simultaneously.
21. The ball mill should be able to operate at 30 Hz and contain
a minimum of 50 mg sample at a time. If milling is done with
KBr (Subheading 3.1.1), then 2.5 g capacity is needed.
FT-IR Spectroscopic Techniques
22. Must be made of agate and not ceramic. Pure agate is
nonabsorbing in the infrared region, resistant to wear and
tear, and nonreactive.
23. Standard microscopy slides can be used to store a large number of sections. However, sections cannot be measured on
glass slides and therefore they will have to be transferred after
drying to infrared transparent windows (see Note 24) or mirrors (see Note 25) for measurements. The transfer often
damages the sections and can make them difficult to flatten.
24. Infrared transparent windows are available in different sizes,
shapes, thicknesses, and materials. NaCl windows are cheap but
very sensitive to moisture and are not recommended. For
Arabidopsis sections, the best materials are BaF2, CaF2, and
ZnSe. BaF2 and CaF2 are colorless and only very weakly
soluble in cold water, whereas ZnSe is orange colored and
practically insoluble (except in acids). They all are brittle and
easy to scratch and therefore must be handled with care.
They have different infrared transparencies with the following cutoff edges at 50 % transmittance (i.e., they only let
through less than half of the infrared light below these limits
and therefore should not be used in that spectral range):
BaF2 ca. 850 cm−1, CaF2 ca. 1,050 cm−1, ZnSe ca. 550 cm−1.
These windows are reusable but may be too expensive for
sample storage.
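The cutoff edges in Note 24 can be turned into a small lookup when planning a measurement. The following is a minimal sketch, not part of the original protocol: the function name `usable_windows` is hypothetical, and the only data used are the approximate 50 % transmittance cutoffs quoted in the note.

```python
# Approximate cutoff edges (50 % transmittance) quoted in Note 24, in cm^-1.
WINDOW_CUTOFF_CM1 = {"BaF2": 850.0, "CaF2": 1050.0, "ZnSe": 550.0}


def usable_windows(lowest_wavenumber_cm1: float) -> list[str]:
    """Return the window materials whose cutoff edge lies at or below
    the lowest wavenumber that the analysis needs (hypothetical helper)."""
    return sorted(
        material
        for material, cutoff in WINDOW_CUTOFF_CM1.items()
        if cutoff <= lowest_wavenumber_cm1
    )


# Example: an analysis that needs data down to 900 cm^-1.
print(usable_windows(900.0))
```

A spectral range that extends below a window's cutoff simply excludes that material from the returned list.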
25. Much like infrared transparent windows (see Note 24), carrier mirrors are available in different sizes, shapes, and materials. The best options for Arabidopsis sections are gold,
silver, and aluminum mirrors, which all provide practically
100 % reflectance in the entire spectral range. They are reusable, but care must be taken to clean them with only soft
cotton pads as they are easily scratched. Gold is the least reactive but also the most expensive mirror.
26. It is critical that drying and powderizing are performed in
the same way for all samples, otherwise spectral differences
between samples will occur as a consequence of different
sample preparation and not necessarily originate from chemical differences. Ball milling affects particle size, which in turn
affects optical and thus spectral properties. In addition, it can
also affect the degree of polymerization [14, 15]. Ball milling
also generates heat and thus may burn the sample. Thus, the
optimal time and frequency for ball milling may need to be
fine-tuned for different sample types. However, when the
samples have widely different physical and chemical properties, even standardized ball milling is unable to neutralize all
differences.
András Gorzsás and Björn Sundberg
27. Other nonreactive and nontoxic IR transparent diluents can
also be used. KBr is the most common because the optical
components in the spectrometers are often also made of KBr.
Thus, KBr mixed with the sample will not impose any additional restrictions. However, only KBr that is specifically
labelled as “Infrared spectroscopy grade” should be used.
Other types may contain very small quantities of IR active
contamination (e.g., nitrate). Note that dry KBr is hygroscopic, so care must be taken to avoid humidity and to keep
all equipment dry.
28. Dilution with KBr can vary depending on the amount and IR
properties of the sample, but it should be very similar for all
samples within an experiment. Ideally the resulting mixtures
should have band intensities between 0.3 and 0.8 Abs units.
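The intensity window in Note 28 can be checked programmatically once a test spectrum has been exported. This is only a sketch under one assumption that the note does not state explicitly: that the check is applied to the most intense band of the spectrum. The function name `dilution_advice` is hypothetical.

```python
def dilution_advice(absorbances, low=0.3, high=0.8):
    """Compare the most intense band with the 0.3-0.8 absorbance window
    of Note 28 (applying it to the maximum is an assumption)."""
    peak = max(absorbances)
    if peak > high:
        return "too concentrated: add more KBr"
    if peak < low:
        return "too dilute: use less KBr"
    return "ok"


print(dilution_advice([0.05, 0.42, 0.61, 0.20]))
```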
29. If measurement conditions are stable, background measurements are not required before every sample. Usually it is sufficient to record the background once or twice per day,
typically at start and after longer breaks. OPUS automatically
uses the last recorded background unless parameters (like
spectral range, spectral resolution) have been changed, making the background and sample measurements incompatible.
30. On newer systems, there is no real advantage in giving the
number of scans as a power of two, but there is no
harm in doing so. The number of scans can be increased to
gain higher signal-to-noise ratios. However, the signal-to-noise ratio only increases with the square root of the number
of scans.
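The square-root relationship in Note 30 makes the cost of extra scans easy to estimate. The sketch below is illustrative only; the function names and the reference value of 16 scans are assumptions chosen for the example.

```python
import math


def relative_snr(n_scans: int, reference_scans: int = 16) -> float:
    """Signal-to-noise ratio relative to a reference measurement,
    assuming noise averages down with the square root of the scan count."""
    return math.sqrt(n_scans / reference_scans)


def scans_for_snr_gain(gain: float, reference_scans: int = 16) -> int:
    """Number of scans needed to improve the signal-to-noise ratio
    by the given factor relative to the reference measurement."""
    return math.ceil(reference_scans * gain ** 2)


print(relative_snr(64))         # four times the scans
print(scans_for_snr_gain(3.0))  # scans needed for a threefold gain
```

Doubling the signal-to-noise ratio therefore costs four times the measurement time, which is why Note 32 cautions against excessively long measurements.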
31. Increasing the spectral resolution (to 2 or 1 cm−1, instead of
4 cm−1) is only important when narrow bands or small positional shifts need to be determined precisely. This is rarely the
case for plant materials because band widths are in the region
of tens of cm−1. Increasing spectral resolution results in longer measurement times.
32. Excessively long measurement times should be avoided
because background fluctuations and other disturbances can
occur during measurement. This is particularly important
when less stable purge benches are used.
33. Typically only the 400–2,000 cm−1 region of the spectrum is
used, as the broad OH band obscures most features around
3,000 cm−1 and makes standardization difficult by introducing a large integral value with high uncertainties in the total
sum. The region around 3,000 cm−1 can, however, contain valuable information, and it can therefore be advantageous to record the spectra in this range too.
34. The interferogram data blocks are small and thus do not
increase the spectrum file size considerably. However, they
are valuable if Fourier transformation with different
parameters (or manual phasing) is required [18].
35. When a zero filling factor of 2 is applied with a 4 cm−1 spectral resolution, the resulting spectrum will list absorbance
values at every 2 cm−1 (4 cm−1/2). However, this is only the
result of the zero filling factor and in reality the spectral resolution remains 4 cm−1.
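The arithmetic of Note 35 can be sketched as follows. The helper names are hypothetical, and the calculation follows the note's simplified relation (data point spacing = spectral resolution / zero filling factor); the exact spacing reported by a given instrument may differ slightly.

```python
def data_point_spacing(resolution_cm1: float, zero_filling_factor: int) -> float:
    """Spacing between listed absorbance values, per the relation in Note 35."""
    if zero_filling_factor < 1:
        raise ValueError("zero filling factor must be at least 1")
    return resolution_cm1 / zero_filling_factor


def wavenumber_axis(start_cm1, stop_cm1, resolution_cm1, zero_filling_factor):
    """List the wavenumbers at which absorbance values would be reported."""
    step = data_point_spacing(resolution_cm1, zero_filling_factor)
    n_points = int((stop_cm1 - start_cm1) / step) + 1
    return [start_cm1 + i * step for i in range(n_points)]


# 4 cm^-1 resolution with a zero filling factor of 2: points every 2 cm^-1.
print(wavenumber_axis(400.0, 420.0, 4.0, 2))
```

The denser axis is an interpolation: the number of listed points doubles, but the spectral resolution remains 4 cm−1.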
36. This protocol describes the measurement of individual samples. However, there are DRIFTS accessories available that
enable sample automation for virtually all spectrometer types.
These can be in the form of carousels or well plates and often
come with their own bundled software. For automated measurements of samples, refer to the software manual of your
sample automation accessory, and provide the measurement
parameters for each sample as outlined in Subheading 3.1.2.
37. It is necessary to simultaneously standardize all spectra to
ensure that they are treated in the exact same way.
38. For non-OPUS users, a high-order polynomial baseline correction should be applied whenever a linear baseline cannot
be used.
39. The only way to obtain quantitative information is by using a
nonnative internal standard, with precisely known concentrations. This means that a nonreactive compound is added to
each sample in a precise quantity. This compound should
produce a distinct and well-resolved band that is used for
normalization and calibration. Currently, there is no compound available that would meet the criteria of a general
internal standard for plant samples.
40. Vector normalization holds the sum of squared intensities constant, while area
normalization holds the sum of intensities constant. This means that in
vector normalization larger bands will have a higher weight.
This is ideal for suppressing the contribution of noise but
also disfavors small bands.
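The difference between the two normalizations in Note 40 can be illustrated with a simplified sketch. It assumes baseline-corrected spectra given as plain lists of absorbance values, and it implements only the scaling described in the note; the OPUS implementation may include additional steps (such as mean centering) that are not reproduced here.

```python
import math


def area_normalize(spectrum):
    """Scale a spectrum so that the sum of its intensities equals 1."""
    total = sum(spectrum)
    if total == 0:
        raise ValueError("spectrum sums to zero")
    return [y / total for y in spectrum]


def vector_normalize(spectrum):
    """Scale a spectrum so that the sum of its squared intensities equals 1."""
    norm = math.sqrt(sum(y * y for y in spectrum))
    if norm == 0:
        raise ValueError("spectrum is all zeros")
    return [y / norm for y in spectrum]


spectrum = [0.1, 0.2, 0.9, 0.2, 0.1]  # made-up values with one large band
print(area_normalize(spectrum))
print(vector_normalize(spectrum))
```

Because the vector norm is built from squared values, the large band dominates the scaling factor far more than it does in area normalization.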
41. The reference band is often a distinct band of a compound to
which everything will be related, i.e., the observed changes
will be relative to this band. Min–Max normalization is not
disturbed by band position shifts as long as the shifted band
and the baseline region still remain in the frequency range
used for the normalization (see Note 42). However, changes
of band widths can introduce errors, since the normalization
is based on band height instead of band area.
42. The frequency range should be chosen so that it contains the
peak of the band to which the referencing is done and a baseline point where there are no bands. It is crucial that the
frequency range does not contain any bands that are of higher
intensity than the reference band.
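Notes 41 and 42 can be illustrated with a minimal Min–Max sketch. The function name is hypothetical, and scaling the reference band to a height of 1 is an assumption made for the example; the target value used by a particular software package may differ.

```python
def min_max_normalize(wavenumbers, absorbances, low_cm1, high_cm1):
    """Scale a spectrum so that, within [low_cm1, high_cm1], the minimum
    becomes 0 and the maximum (the reference band) becomes 1."""
    window = [a for w, a in zip(wavenumbers, absorbances)
              if low_cm1 <= w <= high_cm1]
    if not window:
        raise ValueError("no data points in the chosen frequency range")
    lowest, highest = min(window), max(window)
    if highest == lowest:
        raise ValueError("flat spectrum in the chosen frequency range")
    return [(a - lowest) / (highest - lowest) for a in absorbances]


# Made-up spectrum; the reference band is the one at 1,740 cm^-1.
wn = [1800, 1740, 1700, 1600, 1510]
ab = [0.1, 0.5, 0.2, 0.9, 0.3]
print(min_max_normalize(wn, ab, 1690, 1810))
```

In this example the band at 1,600 cm−1 lies outside the chosen range and ends up above 1, which is harmless. Had it been inside the range, it would have replaced the intended reference band as the maximum, which is the situation Note 42 warns against.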
43. Data point table (.dpt) files are more convenient than Jcamp
(.dx) files, because they can be opened in any standard text
editing software (Notepad on Windows or TextEdit on Mac)
and copied and pasted from there. However, .dpt files are more
sensitive to international settings, most notably to decimal
dots vs decimal commas. Files exported to Jcamp (.dx) format are less prone to such errors.
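The decimal separator problem described in Note 43 can be handled when reading exported files outside OPUS. The sketch below is an assumption-laden illustration: it presumes that each line of the .dpt file holds two columns (wavenumber and intensity), and the function name `parse_dpt_line` is hypothetical. Check the actual layout of your exported files before relying on it.

```python
import re


def parse_dpt_line(line: str) -> tuple[float, float]:
    """Parse one two-column line, tolerating decimal commas.

    Columns separated by tabs, spaces, or semicolons may use either
    decimal dots or decimal commas. If the only separator is a comma,
    the values are assumed to use decimal dots."""
    fields = re.split(r"[;\t ]+", line.strip())
    if len(fields) == 1:
        # Comma used as the column separator, decimal dots inside values.
        fields = fields[0].split(",")
    if len(fields) != 2:
        raise ValueError(f"cannot interpret line unambiguously: {line!r}")
    x, y = (float(field.replace(",", ".")) for field in fields)
    return x, y


print(parse_dpt_line("3999,5\t0,0123"))  # decimal commas, tab separated
print(parse_dpt_line("3999.5,0.0123"))   # decimal dots, comma separated
```

Lines that mix comma separators with decimal commas cannot be interpreted reliably, so the sketch rejects them instead of guessing.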
44. In version 7, OPUS also offers the option to save files in
Matlab (.mat) format.
45. OPUS allows the creation of macros to automate tasks, such
as multiple exporting. However, the creation of macros is
beyond the scope of the presented protocols.
46. The size and shape of the IRE depend on the ATR accessory
used. The most common materials are ZnSe, Ge, and diamond. The diamond crystal is the hardest and allows for the
highest applied pressure. It is also the most resistant to
mechanical wear and chemicals and allows the entire spectral
range to be used. Ge and ZnSe IREs impose cutoff edges,
but these are usually outside the spectral region used for the
analysis of plants.
47. Since it is possible to specify different files for background
subtraction in OPUS, it is often a good strategy to record
both the empty IRE as well as the sample solvent.
48. Depositing a solution/suspension onto the IRE and evaporating the solvent is also a good strategy when bands of the
solvent interfere. If that is the case, a change of solvent (e.g.,
H2O to D2O) is also an option. Although evaporation under
certain conditions can induce changes in protein structure, it
has been demonstrated that proteins normally retain enough
solvent molecules to keep their solution structures relatively
intact [19].
49. Section thickness should be halved for reflection mode measurements as compared to transmission mode measurements,
because the infrared light passes through the sample twice in
reflection mode.
50. Using a cryomicrotome has the advantage that the material
can be stored, and the method is easy and does not require
tedious embedding. The plant material is attached directly
on the sample holder with O.C.T. compound and the sample is trimmed with a razorblade to remove as much of the
mounting media as possible. Sectioning is preferably
done at −20 °C with well-sharpened steel knives. During sectioning, the sections can be directly collected on an object
glass with the help of a brush. To remove excess mounting
media, the sections can be carefully rinsed with water directly
on the object glass.
51. For vibratome sectioning, samples are molded in agarose
(3–8 %) in Eppendorf tubes. The agarose plug is removed
from the tube and glued on the sample holder with cyanoacrylate glue. To collect the section from the water bath to
the object glass, a plastic Pasteur pipette can be used, where
the tip has been cut to make the opening appropriate to
sample size.
52. Samples embedded in paraffin and sectioned in a microtome
may provide well-preserved sections, but this procedure is
more time-consuming. Moreover, care must be taken not to
smear paraffin over the sample during sectioning. In addition, the spectrum of the paraffin used for embedding should
be recorded separately, and the recorded sample spectra
always compared to this reference to make sure no traces of
paraffin are interfering with the analysis. Ideally, no
embedding should be used.
53. Creating a section that is thin enough for spectroscopy can
be challenging in the case of Arabidopsis, as those sections
become very fragile and tear, fold, or otherwise lose shape,
making anatomical features unrecognizable. To limit such
damage, paraffin embedding can be used. Care must be
taken, however, not to smear paraffin over the sample during
sectioning. In addition, the spectrum of the paraffin used for
embedding should be recorded separately, and the recorded
sample spectra always compared to this to make sure no
traces of paraffin are interfering with the analysis. Ideally, no
embedding should be used.
54. Always mount several sections on the same carrier to be able
to select the best one for measurements. In addition, consecutive sections can be saved for staining and inspection
under light microscopy. However, never stain the sections
that are to be measured by FT-IR microspectroscopy, as the
dye(s) will appear in the spectrum.
55. The exact time required for drying depends on the sample,
water content, and desiccator capacity, but there should be
no water vapor detected in the spectrum at measurement.
56. In addition to the background position, OPUS has four
sample positions with predefined names: Load Position 1 and
2 and Special Position 1 and 2. All four are equivalent, meaning that four different sample positions can be predefined,
and OPUS keeps them until they are overwritten or until
OPUS is shut down. This is very useful as positions can be
defined in one measurement and found again in subsequent
measurements (see Note 65).
57. This is similar to the way FPA data files are built up
(Subheading 3.4.3). However, SE data blocks are numbered
in the order they were marked and measured, while FPA data
blocks are numbered by the pixel number in the FPA image.
58. For optimal results, the sample tray should be boxed in,
although by default it might be an open design. A boxed
sample tray limits fluctuations in H2O and CO2 levels and
allows purging with dried instrument air or N2.
59. Start with a low number of scans and only increase it if necessary (i.e., low signal-to-noise ratio). Usually the problem in
microscopy is too high sample intensities (because of too
thick sections), and not the opposite. Increasing the spectral
resolution (to 2 or 1 cm−1, instead of 4 cm−1) is only important when narrow bands or small positional shifts need to be
determined precisely. This is usually not the case for plant
materials where band widths are in the range of tens of cm−1.
Increasing spectral resolution results in longer measurement
times.
60. If the visible image is solid black with no features, make sure
that (a) the correct mode (transmittance or reflectance) is set
on the microscope accessory, (b) the light intensity is sufficiently high, and (c) the light is directed towards the video
camera and not towards the front LCD display.
61. OPUS keeps the defined background positions until a new
one is defined or until OPUS is shut down.
62. Snapshots can be taken independent of the measurements
and afterwards attached to a file by selecting “Attach Video
Image” in the “Edit” menu. This is not always straightforward, however, and taking snapshots during the measurements is much preferred.
63. Several snapshots can be taken and kept in a single file (i.e.,
an overview image, see Note 64, or “before” and “after”
images). All images are numbered consecutively and can later
be numerically accessed.
64. An overview of a larger area can be created by stitching
together several images. To do that, right-click anywhere
inside the green square in the Live Video Pane and select
“Video Image…” then “Set + Scan Overview Image Area” in
the opening contextual menu. Move to the bottom left corner of the area to be overviewed, click on the “Set Area”
button in the open dialog window. Move to the top right
corner of the area to be overviewed, click on the “Set Area”
then on the “Overview Area Now Defined” buttons. The
tray will start moving from the bottom left corner to the top
right corner (the area can only be a rectangle), taking a snapshot at each position and stitching these individual pictures
together into one large overview image. The overview image
is displayed on the right “Still Image Pane” in the measurement workspace and can be used for quickly moving to
positions: right-click on it, select “Mouse mode…” and
“Move to position.” The cursor changes and left-clicking
anywhere on the overview image will move the tray to that
position. To stop quick movement, right-click on the overview image, select “Mouse mode…” and “No action” in the
contextual menu. Similarly, distances can be measured in the
overview image: right-click, select “Mouse mode…” and
“Measure distances.” As the cursor changes, left-click at one
position, hold down the left mouse button and move to
another position. A straight line will be created from the initial position to the new position, with the distance displayed
in µm. To exit the distance measurement mode, right-click
on the overview image, select “Mouse mode…” and “No
action” in the contextual menu.
65. Moving to any of the defined positions can be done by right-clicking anywhere inside the green square in the Live Video
Pane and selecting “Moving to Defined Positions…” and
then the position name (Background Position, Load Position
1 and 2, Special Position 1 and 2).
66. The software control window for the microscopy accessory
in theory allows changing between visible and infrared light,
moving the tray, adjusting light intensity, and changing
between transmission and reflectance modes. However, it is
less reliable than the direct hardware control and often reacts
sluggishly. Therefore, it is recommended to change these
parameters on the direct hardware control of the microscopy
accessory.
67. There is a direct “Collect Background at Background
Position” command, which also involves the automatic
movement of the tray to the predefined background position
and the background measurement. While this is convenient,
sometimes the xy stage control fails and the tray does not
move. It is therefore advisable to first move to the desired
position while using visible light to make sure that the tray
reacts, then change to infrared light and finally start the
measurement.
68. Changing to visible light after the measurement is not critical
but good practice. It enables the user to determine if there
have been changes to the sample during measurement (shifting out of focus, curling, etc.) and makes it more convenient
to start up the next measurement.
69. The file is only saved when the measurement workspace is
closed. If OPUS itself is closed first, the file is lost.
70. Instead of manually marking individual positions, linear,
rectangular, and elliptical grids can also be created automatically
(right-click: “Measurement Spots/Grid…” and “Define
Linear/Rectangular/Elliptical Grid” options).
71. To stop marking positions, right-click in the Live Video Pane
and select “Mouse mode…” then “No action.”
72. The term “extraction” is somewhat misleading because the
extracted spectra are not removed from the original 3D file
but copied and saved in a new file.
73. Several spectra can be extracted at the same time with incrementing filenames (i.e., spectrum1.0, spectrum2.0, spectrum3.0) or extensions (spectrum1.0, spectrum1.1,
spectrum1.2) by selecting the appropriate option in the dialog window. Generally, it is best to use incremented filenames
and not extensions.
74. “The series of single blocks” option ensures that each
extracted spectrum is stored as an individual single spectrum
file. Another option for 3D files containing several spectra is
the “Average block,” which means that all extracted spectra
will be first averaged, and only the resulting average spectrum is stored in a file, not the individual spectra. There is no
good reason to choose this averaging option, and it is not
recommended even if average spectra are needed. Averages
are best created afterwards, via the “Manipulate” menu and
“Averaging” option, to make sure that no outlier and/or bad
quality spectrum is included. Moreover, it is always recommended to keep the individual spectra for statistical reasons
and multivariate analysis.
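The recommendation in Note 74 (keep the individual spectra and average afterwards, once poor spectra have been excluded) can be sketched outside OPUS as follows. The function name and the idea of passing the indices of rejected spectra are assumptions made for the illustration; how outliers are identified is left to the analyst.

```python
def average_spectra(spectra, exclude=()):
    """Average spectra point by point, skipping those whose indices
    are listed in `exclude` (e.g., outliers found on inspection)."""
    rejected = set(exclude)
    kept = [s for i, s in enumerate(spectra) if i not in rejected]
    if not kept:
        raise ValueError("no spectra left to average")
    return [sum(values) / len(kept) for values in zip(*kept)]


# Three made-up spectra; the third is an obvious outlier.
spectra = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
print(average_spectra(spectra, exclude=[2]))
```

Because the individual spectra are retained, the average can be recomputed at any time with a different exclusion list.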
75. Loading the spectra is recommended to make sure they are all
of acceptable quality before proceeding with standardization.
76. As opposed to the SE detector, the FPA detector has no
direct temperature indicator light. This is an unfortunate
oversight, as any attempts to start an FPA measurement while
the detector is not cold enough result in an error message,
and OPUS is likely to freeze and require a complete PC shutdown and restart. Generally, after the detector is slightly
overfilled (i.e., a small amount of liquid N2 spills over), it
requires 10–20 min before it reaches operational temperatures (usually below 87 K). For an exact temperature readout, see step 12 in Subheading 3.4.2.
77. Even though some carriers would allow for lower wavenumber limits (see step 7 in Subheading 3.3.2), the FPA detector
itself has a cutoff at ca. 900 cm−1.
78. FPA files can easily exceed 200 MB, depending on spectral
range, resolution, number of scans, and data blocks saved.
If data storage is not a limitation, it is recommended to save
all data blocks in case troubleshooting or error backtracking
is required.
79. We do not list guideline numbers here, since they can vary
enormously. An older generation FPA could use a scanner
velocity of 168 Hz, whereas a new generation FPA can use
several kHz scanner velocities.
80. In contrast to SE measurements (Subheading 3.3.2), there is
no point using apertures at all.
81. The window that opens is the same as the one accessed from
the “Check Signal” tab of the measurement setup (step 12 in
Subheading 3.4.2).
82. The FPA manufacturers suggest keeping the gain as low as
possible. On the other hand, in our experience with a new
generation FPA in a Bruker Hyperion 3000 system, high
gains resulted in better quality spectra. Although we cannot
confirm whether this is a general rule or an exception, it
makes it worthwhile to test several gain settings on the same
sample and compare the results in order to determine the
optimal gain value.
83. Each pixel is individual and has its own readout,
some more intense than others. However, bad pixels are the
ones that produce erroneous readouts and differ significantly
from the rest. They can be detected in the Live FPA Image
Pane as the pixels that do not change color (i.e., not changing intensity) in response to changes, such as moving the
sample and changing the condenser. These should be marked
and compensated for by right-clicking on the Live FPA
Image Pane and selecting the “Bad Pixel…” option for marking, saving the list of bad pixels and choosing “Correct Bad
Pixels.” Correction is made by automatically replacing the
readout of the bad pixel by the average of the readouts of the
pixels immediately surrounding it.
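The correction principle described in Note 83 (replacing a bad pixel by the average of its immediate neighbors) can be sketched as follows. This is an illustration, not the OPUS implementation: the use of the eight surrounding pixels and the exclusion of neighbors that are themselves marked as bad are assumptions.

```python
def correct_bad_pixels(image, bad_pixels):
    """Return a copy of `image` (a list of rows) in which each bad pixel,
    given as (row, column), is replaced by the mean of its good neighbors."""
    n_rows, n_cols = len(image), len(image[0])
    bad = set(bad_pixels)
    corrected = [row[:] for row in image]
    for r, c in bad:
        neighbours = [
            image[rr][cc]
            for rr in range(max(0, r - 1), min(n_rows, r + 2))
            for cc in range(max(0, c - 1), min(n_cols, c + 2))
            if (rr, cc) != (r, c) and (rr, cc) not in bad
        ]
        if neighbours:  # leave the pixel untouched if no good neighbor exists
            corrected[r][c] = sum(neighbours) / len(neighbours)
    return corrected


readout = [[1.0, 1.0, 1.0],
           [1.0, 100.0, 1.0],
           [1.0, 1.0, 1.0]]
print(correct_bad_pixels(readout, {(1, 1)}))
```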
84. The condenser has no function in reflection mode.
85. There is an “Optimize” button in this window, which should
initiate an automated process to find the best offset and gain
settings. However, it does not always work, and manually
setting offset and gain values (and keeping a record of the
settings) is recommended.
86. Due to the extreme flow of data from the FPA, the computer
is unable to show the progress of the measurement (i.e., no
scan number counts). Unfortunately, even when the measurement is finished, OPUS still displays a green status bar as
if the measurement was still in progress. The best indicator of
status is therefore the Live Video Image Pane. While it is
blank dark blue, the measurement is still ongoing. When it
has turned back to black with a green square and crosshair
marking the FPA size and position, the measurement is finished. It is important not to do anything on the PC during
measurements (no copying of files, etc.), as any activity can
interrupt data transfer from the FPA to the PC.
87. Pixels are numbered consecutively, from left to right and
bottom to top, row-wise, starting from number 0, not from
number 1.
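The numbering scheme in Note 87 translates into a simple conversion between pixel number and position. The sketch below assumes a 64 × 64 array purely for illustration (the array size is not stated in the note), and the function names are hypothetical.

```python
def pixel_to_row_col(pixel_number: int, n_cols: int = 64) -> tuple[int, int]:
    """Convert a zero-based pixel number into (row from bottom, column from left)."""
    row_from_bottom, col_from_left = divmod(pixel_number, n_cols)
    return row_from_bottom, col_from_left


def row_col_to_pixel(row_from_bottom: int, col_from_left: int, n_cols: int = 64) -> int:
    """Convert (row from bottom, column from left) back into a pixel number."""
    return row_from_bottom * n_cols + col_from_left


print(pixel_to_row_col(0))     # bottom left corner
print(pixel_to_row_col(64))    # first pixel of the second row from the bottom
print(row_col_to_pixel(63, 63))
```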
88. The defined integration method produces heat maps that are
crude and generic and as such should not be considered as
accurate chemical images. Although the right and left edges
can be fine-tuned for each defined integral to produce more
accurate maps, it is not the purpose here, and this is why
neither baseline correction nor normalization had to be performed prior to this integration.
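The kind of crude band integration referred to in Note 88 can be sketched as a trapezoidal sum between a left and a right edge, computed for each pixel spectrum to give one heat map value. This is a simplified illustration with a hypothetical function name: it applies no baseline, and the integration methods offered by OPUS are not reproduced here.

```python
def band_integral(wavenumbers, absorbances, left_cm1, right_cm1):
    """Trapezoidal area under the spectrum between two edges (no baseline)."""
    low, high = sorted((left_cm1, right_cm1))
    points = sorted(
        (w, a) for w, a in zip(wavenumbers, absorbances) if low <= w <= high
    )
    return sum(
        0.5 * (a0 + a1) * (w1 - w0)
        for (w0, a0), (w1, a1) in zip(points, points[1:])
    )


# Made-up pixel spectrum with a band centered at 1,740 cm^-1.
wn = [1780, 1760, 1740, 1720, 1700]
ab = [0.0, 1.0, 2.0, 1.0, 0.0]
print(band_integral(wn, ab, 1760, 1720))
```

Repeating the calculation for every pixel and arranging the results by pixel position gives the heat map; narrowing or widening the edges changes which spectral features contribute to it.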
89. From version 7, OPUS automatically performs a dummy
integration and opens up a new type of view called “Chemical
Imaging” when a 3D file is opened. Therefore, steps 1–5 are
not needed for OPUS 7 users.
90. When using OPUS version 7, the error message “An invalid
argument was encountered” can pop up. It is a harmless bug
that can safely be ignored.
91. The “Select trace” drop-down list in the Mapping tab should
contain all the Labels given to bands in step 3 in
Subheading 3.4.3, i.e., “lignin 1595,” “lignin 1510,” “-C=O
1740,” and “carbohydrates.” Choose the label that gives the
chemical image (heat map) with the most details and features. Often, this will be the –C=O band at 1,740 cm−1
because it is usually intense, nonoverlapping, and present
in virtually all tissues. In addition, the –C=O band is mapped
with the highest spatial resolution because it is in the higher
wavenumber end of the spectral region (see Note 16).
92. The actual numbers in the “Contours” drop-down menu
may not be up to date and may refer to the levels of the
intensity of a different band. Therefore, this setting may need
to be revisited after clicking “Apply” and changing to a different tab within the same dialog window, which should
update its values.
93. There are many different view options for chemical imaging
(heat maps), which can be used according to personal preference. It is most important to create a heat map that is detailed
and can easily be correlated to visible features in the section
to allow exact positioning and orientation within the image.
94. OPUS remembers these view settings for the following
“TRC” blocks, except if it quits unexpectedly. Therefore, to
avoid resetting these parameters, do not close the
Map + Vid + Spec 3D view. Instead, just unload the file (right-click on the file name and select “Unload File”).
95. Microsoft Windows and video card driver settings can cause
the “No contour” option to return a blank white image, with
no visible snapshot shown. Make sure the “Video Image”
option is selected in the “3D Properties” tab, and the correct
snapshot number is selected in the “Select image” drop-down list in the “Mapping” tab. If a blank Image pane
remains, try to swap the left and right Image panes: displaying “No contours” in the right pane and the chemical image
(heat map) “Contour lines and colors” in the left pane. If still
unsuccessful, update the video card driver.
96. Unfortunately, “click and point” moving of the cursor is only
available in the Chemical Imaging view of OPUS 7. For earlier OPUS versions, only stepwise moving is possible, and
even that can cause frequent crashes that result in a loss of all
view settings in OPUS (see Note 94). For stepwise moving,
use the X and Z controls in the Spectrum pane. However, the
X and Z controls can only have values that are below the X′
and Z′ control values. Therefore, if the crosshair does not
move further in the X or Z direction, change the X′ or Z′
values to their maxima.
97. “The series of single blocks” is the only meaningful choice
here (see Note 74).
98. As opposed to the single element detector (see Note 75,
Subheading 3.3.3), and to Method 2 of Subheading 3.4.3,
loading of all spectra is not recommended because of their
large number in an FPA image. This is why the quality of the
spectra must be checked and bad pixels excluded in step 8 of
Method 1 in Subheading 3.4.3.
99. Version 7 of OPUS contains another major convenience
factor in addition to its “click and point” feature for moving
(see Note 96): the pixels marked during “click and point” are
all loaded in the “Spectra” tab of the Spectrum pane and
listed in the “List” tab. To extract these spectra, select all of
them, right-click and choose “Extract Spectrum….” In the
dialog box that opens, the names of the spectra can be constructed from placeholder blocks, such as filename and index
(pixel number). This way, spectra are automatically named
containing their pixel numbers without the need to manually
type the names. Thus, steps 15–19 of Method 2 in
Subheading 3.4.3 are not needed for OPUS 7 users.
100. Since only the first block is extracted, it does not matter
whether “Block” or “End of file” is specified in the
“Extraction Range” because those values are ignored in this
case.
Acknowledgements
The authors thank Dr. John Loring and Dr. Janice Kenney for
comments and discussions and Kjell Olofsson for assistance in
sectioning. The protocols were developed and tested using the
instruments of the Vibrational Spectroscopy Platform of the
Chemical Biological Centre, Umeå University and Swedish
University of Agricultural Sciences, Umeå, Sweden.
References
1. Zhou GW, Taylor G, Polle A (2011) FTIR-ATR
based prediction and modelling of lignin and
energy contents reveals independent intraspecific variation of these traits in bioenergy
poplars. Plant Methods 7:9
2. Fackler K et al (2011) FT-IR imaging microscopy to localise and characterise simultaneous
and selective white-rot decay within spruce
wood cells. Holzforschung 65:411–420
3. Stevanic JS, Salmén L (2009) Orientation of
the wood polymers in the cell wall of spruce
wood fibres. Holzforschung 63:497–503
4. Rana R et al (2008) FTIR spectroscopy in
combination with principal component analysis or cluster analysis as a tool to distinguish
beech (Fagus sylvatica L.) trees grown at
different sites. Holzforschung 62:530–538
5. Dokken KM, Davis LC, Marinkovic NS (2005)
Use of infrared microspectroscopy in plant
growth and development. Appl Spectrosc Rev
40:301–326
6. Wetzel DL (2009) FT-IR microspectroscopic
imaging of plant material. In: Salzer R, Siesler
HW (eds) Infrared and Raman spectroscopic
imaging. Wiley-VCH Verlag GmbH & Co.
KGaA, Weinheim, pp 225–258
7. Gorzsás A et al (2011) Cell-specific chemotyping and multivariate imaging by combined
FT-IR microspectroscopy and orthogonal
projections to latent structures (OPLS) analysis reveals the chemical landscape of secondary
xylem. Plant J 66:903–914
8. Lasch P, Naumann D (2006) Spatial resolution
in infrared microspectroscopic imaging of tissues.
Biochim Biophys Acta 1758:814–829
9. Åkerholm M, Hinterstoisser B, Salmén L (2004)
Characterization of the crystalline structure of
cellulose using static and dynamic FT-IR spectroscopy. Carbohyd Res 339:569–578
10. Noda I, Ozaki Y (2004) Two-dimensional
correlation spectroscopy. Applications in
vibrational and optical spectroscopy. Wiley,
Chichester
11. Socrates G (2001) Infrared and Raman characteristic group frequencies. Tables and charts,
3rd edn. Wiley, Chichester
12. Trygg J et al (2006) Chemometrics in metabolomics. Springer, Berlin
13. Trygg J, Wold S (2002) Orthogonal projections to latent structures (O-PLS). J Chemometr
16:119–128
14. Chalmers JM (2001) Mid-infrared spectroscopy of the condensed phase. In: Chalmers
JM, Griffiths PR (eds) Theory and instrumentation, vol 1, Handbook of vibrational spectroscopy. Wiley, Chichester
15. Schwanninger M et al (2004) Effects of short-time vibratory ball milling on the shape of FT-IR
spectra of wood and cellulose. Vib Spectrosc
36:23–40
16. Sommer AJ (2001) Mid-infrared transmission
microspectroscopy. In: Chalmers JM, Griffiths
PR (eds) Sampling techniques for vibrational
spectroscopy, vol 2, Handbook of vibrational
spectroscopy. Wiley, Chichester
17. Faix O (1991) Classification of lignins from
different botanical origins by FT-IR spectroscopy. Holzforschung 45:21–28
18. Romeo M, Diem M (2005) Correction of
dispersive line shape artifact observed in diffuse
reflection infrared spectroscopy and absorption/
reflection (transflection) infrared microspectroscopy. Vib Spectrosc 38:129–132
19. Oberg KA, Fink AL (1998) A new attenuated total reflectance Fourier transform
infrared spectroscopy method for the study
of proteins in solution. Anal Biochem
256:92–106
Chapter 19
A Pipeline for 15N Metabolic Labeling
and Phosphoproteome Analysis in Arabidopsis thaliana
Benjamin B. Minkoff, Heather L. Burch, and Michael R. Sussman
Abstract
Within the past two decades, the biological application of mass spectrometric technology has seen great
advances in terms of innovations in hardware, software, and reagents. Concurrently, the burgeoning field of
proteomics has followed closely (Yates et al., Annu Rev Biomed Eng 11:49–79, 2009)—and with it, importantly, the ability to globally assay altered levels of posttranslational modifications in response to a variety of
stimuli. Though many posttranslational modifications have been described, a major focus of these efforts
has been protein-level phosphorylation of serine, threonine, and tyrosine residues (Schreiber et al.,
Proteomics 8:4416–4432, 2008). The desire to examine changes across signal transduction cascades and
networks in their entirety using a single mass spectrometric analysis accounts for this push—namely, preservation and enrichment of the transient yet informative phosphoryl side group. Analyzing global changes in
phosphorylation allows inferences surrounding cascades/networks as a whole to be made. Towards this
same end, much work has explored ways to permit quantitation and combine experimental samples such
that more than one replicate or experimental condition can be identically processed and analyzed, cutting
down on experimental and instrument variability, in addition to instrument run time. One such technique
that has emerged is metabolic labeling (Gouw et al., Mol Cell Proteomics 9:11–24, 2010), wherein biological samples are labeled in living cells with nonradioactive heavy isotopes such as 15N or 13C. Since metabolic
labeling in living organisms allows one to combine the material to be processed at the earliest possible step,
before the tissue is homogenized, it provides a unique and excellent method for comparing experimental
samples in a high-throughput, reproducible fashion with minimal technical variability. This chapter describes
a pipeline used for labeling living Arabidopsis thaliana plants with nitrogen-15 (15N) and how this can be
used, in conjunction with a technique for enrichment of phosphorylated peptides (phosphopeptides), to
determine changes in A. thaliana’s phosphoproteome on an untargeted, global scale.
Key words Phosphorylation, Metabolic labeling, Stable isotope labeling, Phosphoproteomics, Mass
spectrometry
1
Introduction
Two important methods introduced in this protocol necessary for
quantitative phosphoproteomics are (1) 15N-labeling of Arabidopsis
thaliana and (2) titanium dioxide (TiO2)-based phosphopeptide
enrichment [2]. We assume that the reader is already familiar with
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_19, © Springer Science+Business Media New York 2014
the basic theory and use of high-resolution mass spectrometers [1]
(e.g., the Thermo Fisher LTQ Orbitrap or similar instruments)
and the software needed to systematically analyze their output.
The reader is also referred to recent, more general reviews on this
subject [4].
The concept of metabolic labeling to a very high enrichment
level (to correct for variability in homogenization and protein
extraction) was first coupled with mass spectrometry in 2002 using
deuterated leucine in myoblast and fibroblast cell lines [5]. Since
that time, many forms of metabolic labeling have been described
[3]. Generally, the question asked and model system used dictate
the isotope used for labeling. In A. thaliana, it is clear that the
most cost-effective, logical means of labeling is 15N. When two sets
of plants are grown, one supplied solely with 15N as a nitrogen source and
the other with 14N, the result is one set of plants with >98 % 15N
incorporation [6] and one with a natural isotopic distribution (predominantly 14N) throughout. From a proteomic standpoint, this is key: 15N is incorporated
into not only the nitrogenous amino acid residue side chains but
into every amide bond within the peptide backbone. The same
phosphopeptides, from both sets, prepared and processed identically, will have a mass/charge difference directly proportional to
the number of nitrogen atoms they contain. Because of the large
number of nitrogen atoms in each peptide, a given peptide’s isotopic envelope is very complex, and abundance of the peptide is hard
to quantify when the degree of enrichment is below 90–95 %. As
long as this degree of labeling is obtainable, there is readily available software that can separately identify and quantify the amount
of any one peptide that is labeled in vivo with either 15N or 14N. By
shifting the m/z of the 15N-labeled peptide population away from
its non-labeled counterpart, the samples can be combined and the
two peptide populations detected in a single analysis. It is this concept that drives the field of metabolic labeling associated with mass
spectrometry.
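The proportionality between nitrogen count and mass shift can be illustrated with a short calculation. The following is a minimal sketch, not part of the pipeline described here: it assumes complete 15N incorporation, uses standard isotope masses and the standard nitrogen content of the 20 common amino acid residues, and the peptide sequence is a made-up example.

```python
# Monoisotopic mass difference between 15N and 14N, in daltons.
DELTA_15N = 15.0001088982 - 14.0030740048

# Nitrogen atoms per amino acid residue: one backbone amide nitrogen
# plus any side-chain nitrogens. A phosphoryl group adds no nitrogen.
NITROGENS = {
    "A": 1, "R": 4, "N": 2, "D": 1, "C": 1, "E": 1, "Q": 2, "G": 1,
    "H": 3, "I": 1, "L": 1, "K": 2, "M": 1, "F": 1, "P": 1, "S": 1,
    "T": 1, "W": 2, "Y": 1, "V": 1,
}


def nitrogen_count(sequence):
    """Number of nitrogen atoms in an unmodified peptide sequence."""
    return sum(NITROGENS[residue] for residue in sequence.upper())


def mz_shift(sequence, charge):
    """m/z difference between the fully 15N-labeled and 14N forms."""
    return nitrogen_count(sequence) * DELTA_15N / charge


# Hypothetical peptide, doubly charged: 9 nitrogens in total.
example_shift = mz_shift("PEPTIDEK", 2)
print(f"{nitrogen_count('PEPTIDEK')} N atoms, shift = {example_shift:.4f} m/z")
```

Because the shift depends on sequence composition, each 14N/15N pair has its own spacing, which is why the search and quantitation software must know the nitrogen count of every candidate peptide.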
Furthermore, from this concept, it follows that changes in
global phosphorylation as a result of treating one of the two sets of
plants (14N or 15N) and mock-treating the other can be assayed
using a high-resolution mass spectrometer. Specifically, comparing
the signal obtained from a 14N-containing phosphopeptide to the
signal from the 15N-containing one gives the observer the relative
degree to which the treatment changes the amount of phosphorylated peptide in A. thaliana (see Fig. 1). Experiments such as these
provide meaningful data that can be used to probe for biological
relevance of phosphorylation events to a specific response. It
should be noted that the reciprocal experiment is done concurrently, i.e., if in the original set of plants, the 14N-containing plants
were treated and the 15N-containing plants were mock-treated, a
concurrent experiment should be done in which 15N-containing
plants are treated and 14N-containing plants are mock-treated. This
Fig. 1 Experiment one, on the left, shows the treatment given to the 14N-containing Arabidopsis. A change in a
phosphopeptide’s abundance as a result of treatment, reflected in the isotopic envelopes and extracted ion
chromatograms after combination, homogenization, digestion, enrichment, and analysis, is shown. As seen on
the right side, a reciprocal change is reflected in the reciprocal experiment
is an important experimental control—a relative change in the first
described experiment should be reflected as a reciprocal relative
change in the second (see Fig. 1) [7]. By performing the reciprocal
experiment, artifacts that might arise from the labeling itself can be
detected and excluded. It should also be noted that the relatively
small size of the Arabidopsis seed makes it uniquely suitable for in
vivo labeling. Were the seed larger, the contribution of 14N from
storage protein in the seed would prevent the growing tissue from
becoming labeled to the 90–95 % 15N content needed for accurate
quantitation. Because the seed is so small (c. 20 μg dry weight),
the endogenous 14N is negligible, and after a week’s growth, the
tissue harvested from the seedling is well within the necessary
range to make the experiments possible. For example, the model
legume, Medicago truncatula, has seeds only 30–50 times more
massive than Arabidopsis, but endogenous unlabeled protein complicates the use of metabolic labeling in these plants. There are
ways around this issue, including software-based approaches [6]
and growing tissue culture cells rather than seedlings, but these are
outside the scope of this chapter.
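The reciprocal-labeling control described above lends itself to a simple numerical check. The sketch below is an illustration only: the function names and the tolerance of 0.5 log2 units are assumptions made for this example, not values used by the software in this chapter.

```python
import math


def log2_ratio(light_area, heavy_area):
    """log2 of the 14N/15N peak-area ratio for one phosphopeptide."""
    return math.log2(light_area / heavy_area)


def is_reciprocal(forward_log2, reverse_log2, tolerance=0.5):
    """Return True when the two experiments mirror each other.

    forward_log2: log2(14N/15N) when the 14N plants were treated.
    reverse_log2: log2(14N/15N) when the 15N plants were treated.
    A genuine treatment effect changes sign between the two experiments,
    so the two log2 ratios should sum to roughly zero. The tolerance is
    an arbitrary, assumed threshold.
    """
    return abs(forward_log2 + reverse_log2) <= tolerance


# Hypothetical peak areas: fourfold increase on treatment in both experiments.
forward = log2_ratio(4000.0, 1000.0)   # treated plants carry 14N
reverse = log2_ratio(1000.0, 4000.0)   # treated plants carry 15N
print(is_reciprocal(forward, reverse))
```

A peptide whose ratio points in the same direction in both experiments tracks the label rather than the treatment and would fail this check.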
Another problem that limits phosphoproteomic analysis is
obtaining a sample concentrated enough in phosphopeptides that
instrument analysis time is used optimally for detecting and assaying
levels of phosphopeptides, as opposed to unmodified peptides. An
unenriched sample yields a vanishingly small proportion of phosphopeptides (they are eclipsed in multiple ways by highly abundant
peptides, correlated with proteins that always maintain a disproportionately high level of expression)—thus much work has been done
to design and optimize methods for enriching phosphopeptides
from a sample of total protein extract [8]. Multiple techniques and
varieties of such have been described; however, the preferred technique is usually laboratory dependent. The method described
herein uses spherical TiO2 particles packed as a chromatographic
column over which a trypsin-digested protein extract is run. Under
highly acidic conditions (pH ≤ 3), acidic amino acid side-chain
groups are protonated, whereas a phosphoserine or phosphothreonine (pKa < 1.7) remains negatively charged [9]. This remaining
negative charge binds to the associated TiO2, whereas peptides lacking a phosphoryl group wash through the column. Elution of phosphopeptides is performed using a highly basic ammonium hydroxide
solution—the excess OH− ions outcompete bound phosphopeptides, and the TiO2 (pKa/pKb = 4.4/7.7) becomes negatively
charged [10]. The reader should be aware that there are many available means of enriching complex biological samples for phosphopeptides, but TiO2 has been implemented in many laboratories and
is the most prevalent way of doing so currently.
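The charge argument behind acidic loading can be made concrete with the Henderson-Hasselbalch relation. This is a simplified single-site sketch: the phosphate pKa of 1.7 is taken as the upper bound quoted above, while the carboxyl pKa of 4.0 is an assumed, textbook-style approximation for Asp/Glu side chains and is not a value from this chapter.

```python
def fraction_deprotonated(ph, pka):
    """Fraction of an acidic group carrying its negative charge at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PHOSPHATE_PKA = 1.7   # upper bound quoted in the text
CARBOXYL_PKA = 4.0    # assumed approximate side-chain value

for ph in (2.0, 3.0):
    phospho = fraction_deprotonated(ph, PHOSPHATE_PKA)
    carboxyl = fraction_deprotonated(ph, CARBOXYL_PKA)
    print(f"pH {ph}: phosphate {phospho:.2f} charged, carboxyl {carboxyl:.2f} charged")
```

Under these assumptions the phosphate group stays largely ionized at loading pH while carboxyl groups are mostly neutral, which is the selectivity the TiO2 column relies on.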
This chapter thus describes the process of metabolically labeling A. thaliana with 15N-containing salts, processing the tissue,
enriching extracted, digested protein for phosphoryl-containing
peptides, analyzing concentrated samples on an Orbitrap-based
mass spectrometer, and post-analysis data processing, guiding the
reader from seeds and media to a file output that is easy to work
with and has had standard QA techniques performed.
2
Materials
2.1 Growing Plant
Material
1. Magenta boxes, GA 7, or 250 mL Erlenmeyer flasks (see Note 1).
2. Wild-type or mutant seeds of A. thaliana, 12 mg of seeds
(about 20 μg/seed, or 600 seeds) per box.
3. 10× Murashige and Skoog (MS) Micronutrient Solution.
Store at 4 °C (see Note 2).
4. 1 M calcium chloride (CaCl2).
5. 1 M magnesium sulfate (MgSO4).
6. Monobasic potassium phosphate (KH2PO4).
7. 2-(N-morpholino)ethanesulfonic acid (MES).
8. Sucrose.
9. Ammonium nitrate (NH4NO3), 15N-NH4NO3, 98+ %
(Cambridge Isotope Laboratories) and natural abundance,
14N-NH4NO3 (Sigma). No special care is needed in handling
15N stable (nonradioactive) isotopes.
10. Potassium nitrate (KNO3), 15N-KNO3, 99 % (Cambridge Isotope
Laboratories) and natural abundance, 14N-KNO3 (Sigma).
11. 1 M potassium hydroxide (KOH).
12. 18 MΩ ultrapure deionized H2O (see Note 3).
13. 2 L graduated cylinder.
2.2 Sterilizing Seeds:
Liquid Method (See
Note 4)
1. 1.7 mL microcentrifuge tubes.
2. 95 % v/v ethanol.
3. 70 % ethanol, 0.1 % Triton X-100, 2 % bleach solution (all in
H2O, v/v).
4. Whatman filter paper.
2.3 Sterilizing Seeds:
Vapor Method (See
Note 4)
1. 250 mL glass beaker.
2. Sealable desiccator.
3. 1.7 mL microcentrifuge tubes.
4. Bleach, 5–10 % sodium hypochlorite (NaOCl).
5. Concentrated hydrochloric acid.
2.4
Plant Growth
1. Orbital shaker with platform.
2. Fluorescent lights above shaker platform, 2,600–3,200 lx light
intensity.
2.5
Tissue Harvest
1. 3.5 in. porcelain mortar and pestle (CoorsTek).
2. Liquid nitrogen.
3. Dry ice.
4. 50 mL disposable centrifuge tubes.
5. Paper towels.
6. Salad spinner (see Note 5).
7. Disposable spatula.
2.6 Full
Homogenization
1. 4 g fresh weight combined, ground tissue: 2 g 14N-containing
tissue, 2 g 15N-containing tissue. Store at −80 °C.
2. 50 mL disposable centrifuge tubes.
3. 50 mL Oak Ridge tubes.
4. Centrifuge capable of housing rotor for 50 mL Oak Ridge tubes
and of spins at 1,500 × g and 4 °C.
5. 100 mL tricornered polypropylene beakers.
6. Homogenization buffer: 290 mM sucrose, 250 mM Tris–HCl
(pH 8), 25 mM EDTA, 25 mM sodium fluoride, 50 mM
sodium pyrophosphate, 1 mM ammonium molybdate, 0.5 %
w/v polyvinyl pyrrolidone, in H2O. Store at 4 °C.
7. 5 μg/μL α/β (alpha/beta) casein stock in H2O (see Note 6).
8. 500 mM dithiothreitol in H2O (DTT). Store at −20 °C.
9. Saturated stock of phenylmethylsulfonyl fluoride (PMSF) in
isopropanol. Store at 4 °C.
10. 21.5 mM leupeptin in H2O. Store at −20 °C.
11. 1.5 mM pepstatin in ethanol. Store at −20 °C.
12. 2 mM bestatin in H2O. Store at −20 °C.
13. 50 mM 1,10-phenanthroline in 95 % ethanol. Store at 4 °C.
14. 100 mM vanadate. Store at −20 °C (see Note 7).
15. 2.8 mM E64 in H2O (Sigma). Store at −20 °C.
16. Sonicator with 1 cm probe (see Note 8).
17. Miracloth (Calbiochem).
2.7 Methanol/
Chloroform Protein
Extraction
1. Methanol.
2. Chloroform.
3. 18 MΩ ultrapure deionized H2O (see Note 3).
4. 50 mL polypropylene disposable centrifuge tubes.
5. 8 M urea in 50 mM ammonium bicarbonate. Store at −20 °C.
6. PhosSTOP tablets (Roche). Store at 4 °C.
7. 80 % v/v acetone.
8. 1.5 mL LoBind microcentrifuge tubes.
9. Tabletop sonicator with 3 mm probe, capable of an output
wattage of ≥3.
2.8 In-Solution
Trypsin Digestion
1. 50 mM ammonium bicarbonate.
2. PhosSTOP tablets (Roche). Store at 4 °C.
3. BCA (bicinchoninic acid) assay kit.
4. 500 mM DTT. Store at −20 °C.
5. 500 mM iodoacetamide (IAA) in H2O. Store at −20 °C.
Fig. 2 Packaging, pre- and post-modification syringes, and the Dremel tool used,
from left to right. Also shown is the adapter (1, 3, and 6 mL)
6. Trypsin, lyophilized. Store at −80 °C.
7. 15 mL disposable centrifuge tubes.
2.9 Solid Phase
Extraction/Peptide
Concentration (All
Liquid Solutions
Are v/v)
1. tC18 3 cc Sep-Pak cartridges (Waters). See Note 9.
2. Acetonitrile.
3. 0.1 % trifluoroacetic acid in 18 MΩ ultrapure deionized H2O.
4. 80 % acetonitrile in 18 MΩ ultrapure deionized H2O/0.5 %
formic acid (see Note 10).
5. 1.5 mL LoBind microcentrifuge tubes.
6. Vacuum centrifuge.
7. Glass syringe pipettes (Hamilton).
8. Ring stand/tube clamps.
9. 20 mL Luer-Lok tip syringes with manual modification (see
Fig. 2 and Note 11).
10. Syringe adapter, 1, 3, and 6 mL (Varian) (see Fig. 2 and Note 11).
2.10 Titanium
Dioxide-Based
Phosphopeptide
Enrichment (All Liquid
Solutions Are v/v)
1. 19-gauge machined metal sheath with internal, sliding wire.
2. Empore high-performance extraction discs, C-8, 47 mm
(3M).
3. GeneMate P200 yellow pipette tips (see Note 12).
4. 10 μm titansphere titanium dioxide resin (GL Sciences Inc,
Japan).
5. Methanol.
6. 18 MΩ ultrapure deionized H2O (see Note 3).
Fig. 3 Apparatus and materials are shown in (a), and a mock-enrichment setup
is shown in (b). During enrichment, pressure must be applied downward to the
syringe and column together as well as the syringe plunger itself, in order to
prevent large amounts of pressure from forcing syringe and tubing from tip
7. 300 mg/mL lactic acid in 80 % acetonitrile/0.1 % trifluoroacetic acid, pH ≤ 3.
8. 80 % acetonitrile/0.1 % trifluoroacetic acid.
9. 30 % acetonitrile/0.1 % formic acid.
10. 1 % ammonium hydroxide in 18 MΩ ultrapure deionized
H2O, pH ≥ 10.
11. 1.5 mL LoBind microcentrifuge tubes.
12. Formic acid.
13. Vacuum centrifuge.
14. In-house-constructed apparatus for procedure (see Fig. 3 and
Note 13).
15. 5 mL Luer-Lok tip syringe.
16. 5 mm outer diameter latex tubing.
17. Clamp from ring stand.
18. 500 fmol/μL solution angiotensin-II phosphate peptide
(Calbiochem).
2.11 Mass
Spectrometric
Analysis
1. 0.1 % v/v formic acid in 18 MΩ ultrapure deionized H2O.
2. Orbitrap-based mass spectrometer/online liquid chromatography system (see Note 14).
2.12 Data
Processing: From
Direct Output to
Database Search
(See Note 15)
1. Trans-Proteomic Pipeline [11].
2. Mascot Daemon (Matrix Sciences). See Note 16.
3. TAIR Arabidopsis Proteomic Forward/Reverse Database.
4. PC running Windows 2000 or newer.
5. .raw file(s) from Orbitrap.
2.13 Data Processing:
From Database Search
Output to Census
Processing
(See Note 15)
1. Trans-Proteomic Pipeline [11].
2. In-house-developed false discovery rate script (see Notes 15
and 17).
3. Microsoft WordPad or equivalent text editor.
4. Census processing script and viewer, freely available [12]. See
Note 18.
5. Census config file.
6. PC running Windows 2000 or newer.
7. .dat file(s) output from Mascot database search.
2.14 Post-census
Processing
1. In-house-developed TAIR Area Ratio Script (see Notes 15
and 17).
2. Microsoft Excel (or spreadsheet program capable of viewing
and editing .tsv files).
3. census_chro.xml output file(s) from Census analysis.
3
Methods
3.1 Media Preparation
(See Note 19)
1. Prepare 2 L modified MS media: combine 200 mL 10× MS
Micronutrient Solution, 3 mL 1 M CaCl2, 1.5 mL 1 M
MgSO4, 170 mg KH2PO4, 1 g MES, in 2,000 mL graduated
cylinder. Bring volume to 1.8 L using H2O.
2. Split into two 900 mL aliquots (for 1L-14N and 1L-15N, final
volumes).
3. To 14N solution, add 0.825 g natural abundance (14N)
NH4NO3 and 0.96 g natural abundance (14N) KNO3.
4. To 15N solution, add 0.825 g 15N-NH4NO3 and 0.96 g
15N-KNO3.
5. Mix thoroughly.
6. Adjust pH to 5.7 using 1 M KOH while stirring.
7. Add 10 g sucrose and stir until fully dissolved.
8. Bring both solutions to 1 L using H2O.
9. Aliquot 75 mL/magenta box.
10. Replace lids on magenta boxes and apply autoclave tape to lid.
11. Autoclave on liquid setting for 45 min.
12. Post-sterilization, remove from autoclave and let cool to room
temperature.
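As a check on the recipe above, the total nitrogen supplied per liter can be estimated from the salt masses. This is a rough sketch assuming natural-abundance molar masses; the 15N salts are slightly heavier per mole, so the same weighed mass delivers marginally fewer moles, a difference ignored here.

```python
# Natural-abundance molar masses in g/mol (standard values).
MOLAR_MASS = {"NH4NO3": 80.043, "KNO3": 101.103}
# Nitrogen atoms per formula unit.
N_PER_FORMULA = {"NH4NO3": 2, "KNO3": 1}

# Grams added per 1 L of final medium (Subheading 3.1, steps 3 and 4).
RECIPE_G_PER_L = {"NH4NO3": 0.825, "KNO3": 0.96}


def nitrogen_mM(grams_per_litre):
    """Total nitrogen concentration in mM contributed by the listed salts."""
    total = 0.0
    for salt, grams in grams_per_litre.items():
        moles = grams / MOLAR_MASS[salt]
        total += moles * N_PER_FORMULA[salt] * 1000.0
    return total


print(f"Total N supplied: {nitrogen_mM(RECIPE_G_PER_L):.1f} mM")
```

Roughly two thirds of this nitrogen comes from ammonium nitrate, so both salts need to be isotopically enriched for the seedlings to reach a high degree of labeling.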
3.2 Seed
Sterilization: Liquid
Method (See Note 4)
1. All steps performed in sterile laminar flow hood, using aseptic
technique.
2. Wet sections of filter paper (1 section/magenta cube to be
grown) using 1 mL 95 % ethanol.
3. Add 12 mg of seeds to a 1.7 mL microcentrifuge tube per
magenta cube to be grown (see Note 20).
4. Add 1 mL 70 % ethanol, 0.1 % Triton X-100, 2 % bleach solution to seeds. Mix by inverting.
5. Soak for 5 min, shaking tube(s) at roughly 1 min intervals.
6. Allow seeds to sink to bottom of tube. Remove as much liquid
as possible.
7. Rinse with 1 mL 95 % ethanol, shake, and allow seeds to settle
to bottom. Remove as much liquid as possible.
8. Add an additional 1 mL 95 % ethanol. Suspend seeds in liquid.
9. Pipette seeds and ethanol onto filter paper (see Note 21).
10. Allow to dry in sterile hood.
3.3 Seed
Sterilization: Vapor
Method (See Note 4)
1. Place seeds (no more than 0.5 mL) into 1.7 mL microcentrifuge tube. Many tubes can be done at once to increase
throughput.
2. In a sealable desiccator under a hood, place the rack containing the tubes and a 250 mL beaker containing 100 mL bleach.
3. Attach a piece of tape with sharpie writing onto the test tube
rack (see Note 22).
4. Add 3 mL concentrated HCl to bleach and immediately seal
container.
5. Allow fumes to sterilize seeds 4–16 h.
6. Open desiccator, seal microfuge tubes, and dispose of bleach/
HCl appropriately.
Fig. 4 Plants growing on shaker. Plant mass is right for processing—plants were
frozen and ground shortly after picture was taken
3.4
Plant Growth
1. When magenta boxes and media have cooled to room temperature and seeds and filter paper have dried, label boxes and
transfer seeds to magenta boxes (see Note 23).
2. Place boxes on orbital shaker under the following conditions:
approximately 30 rpm, 24 h light, room temperature (~23 °C).
See Note 24 and Fig. 4.
3.5 Treatment
and Tissue Harvest
1. Allow plants to grow sufficiently (10–12 days). See Fig. 5.
2. Administer treatment in reciprocal fashion (see Note 25).
3. Label and prechill 50 mL disposable centrifuge tubes on dry
ice (one for each magenta cube).
4. Carefully remove tissue from box, rinse with H2O, and dry
using preferred method (see Note 5).
5. Transfer plant tissue to prechilled mortar (see Note 26).
6. Pour additional liquid nitrogen into mortar, covering plant
tissue.
7. When most of the liquid nitrogen has evaporated, grind plant
tissue quickly and completely into a fine powder.
8. Transfer plant powder into prechilled, pre-labeled 50 mL disposable centrifuge tube using disposable spatula to scrape
powder from mortar. Keep on dry ice or place immediately in
−80 °C freezer until processing further.
3.6 Full
Homogenization
1. Combine 2 g 14N-labeled tissue with 2 g 15N-labeled tissue
corresponding to 1 reciprocal experiment into 15 mL disposable
Fig. 5 Close-up of plant at the mass ready for processing (~12 days of growth)
centrifuge tube. Keep submerged in dry ice as much as possible
(see Note 27).
2. In 100 mL tricornered polypropylene beakers, aliquot 40 mL
homogenization buffer. Keep on ice throughout duration of
procedure (see Note 28).
3. Prior to addition of tissue, add the following from stock
solutions:
(a) 80 μL 500 mM DTT.
(b) 40 μL saturated PMSF stock.
(c) 40 μL 21.5 mM leupeptin stock.
(d) 40 μL 1.5 mM pepstatin stock.
(e) 20 μL 2 mM bestatin stock.
(f) 80 μL 50 mM 1,10-phenanthroline stock.
(g) 800 μL 100 mM vanadate stock.
(h) 40 μL 2.8 mM E64 stock.
(i) 20 μL 5 μg/μL α/β casein stock.
4. Add pre-weighed, combined tissue to beaker. Stir with pipette
tip to allow minor thawing and homogenous distribution of
powdered tissue.
5. On ice, sonicate using 1 cm probe 10 s, five times.
6. Pour resulting homogenate through one or two layers of miracloth into a polycarbonate Oak Ridge tube (see Note 29).
7. Centrifuge filtrate 15 min at 1,500 × g and 4 °C to remove
debris.
8. Collect supernatant in 50 mL disposable centrifuge tube.
Discard pellet.
9. Supernatant can be stored at −80 °C or can be immediately
aliquoted and further processed via methanol/chloroform
extraction.
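The working concentrations produced by the additions in step 3 follow from a simple dilution calculation. The sketch below takes the stock concentrations and volumes from the list above; the only assumption is that volumes are additive, so the final volume is the 40 mL of buffer plus everything pipetted into it.

```python
BUFFER_UL = 40_000  # 40 mL homogenization buffer, in microliters

# name: (stock concentration in mM, volume added in microliters)
ADDITIONS = {
    "DTT": (500.0, 80),
    "leupeptin": (21.5, 40),
    "pepstatin": (1.5, 40),
    "bestatin": (2.0, 20),
    "1,10-phenanthroline": (50.0, 80),
    "vanadate": (100.0, 800),
    "E64": (2.8, 40),
}
# PMSF (saturated stock, 40 uL) and casein (20 uL) add volume but have
# no molar stock concentration given in the protocol.
OTHER_UL = 40 + 20

TOTAL_UL = BUFFER_UL + sum(volume for _, volume in ADDITIONS.values()) + OTHER_UL


def final_mM(stock_mM, added_ul, total_ul=TOTAL_UL):
    """Final concentration after diluting a stock into the total volume."""
    return stock_mM * added_ul / total_ul


final_concentrations = {
    name: final_mM(stock, volume) for name, (stock, volume) in ADDITIONS.items()
}
for name, conc in final_concentrations.items():
    print(f"{name}: {conc * 1000:.1f} uM")
```

The same function can be reused if the buffer volume is scaled for more or less tissue.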
3.7 Methanol/
Chloroform Protein
Extraction (See Note 30)
1. Using the ~40 mL supernatant from chemical homogenization, separate into 5 mL aliquots in 50 mL polypropylene disposable centrifuge tubes.
2. To each 5 mL aliquot, add in the following order:
(a) 3 parts methanol (15 mL).
(b) 1 part chloroform (5 mL).
(c) 4 parts H2O (20 mL).
3. Centrifuge 10 min at room temperature and 5,000 × g.
4. Remove and discard upper aqueous phase from each tube, taking care not to disturb the protein precipitate located at the
phase interface (see Note 31).
5. Add 4 parts (20 mL) methanol onto interface and lower phase
remaining in each tube.
6. Centrifuge 5 min at room temperature and 1,500 × g.
7. Discard supernatant and wash using 5 mL 80 % acetone.
8. Centrifuge 5 min at room temperature and 1,500 × g (see Note 32).
9. Discard supernatant.
10. Per conical tube, solubilize/denature protein pellet with
300 μL 8 M urea + 1× PhosSTOP cocktail (one tablet/10 mL).
As each pellet is solubilized, it can be added to a combined
portion in a 15 mL disposable centrifuge tube.
11. Sonicate on ice using a 3 mm desktop probe, pulsing the mixture lightly until a uniform color is reached.
12. Sample(s) can be frozen and stored at −80 °C or can be immediately digested with trypsin.
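The solvent volumes in step 2 scale with the aliquot size. The following sketch simply encodes the 3:1:4 parts given above and checks that the mixture fits the tube; the function names and the 50 mL capacity check are illustrative assumptions.

```python
import math

# Parts added per 1 part of supernatant, in the order given in step 2.
PARTS = (("methanol", 3), ("chloroform", 1), ("water", 4))


def extraction_volumes(aliquot_ml):
    """Volumes (mL) of each solvent to add to one aliquot, in order."""
    return [(name, parts * aliquot_ml) for name, parts in PARTS]


def total_volume(aliquot_ml):
    """Aliquot plus all added solvents, in mL."""
    return aliquot_ml + sum(volume for _, volume in extraction_volumes(aliquot_ml))


def tubes_needed(supernatant_ml, aliquot_ml=5.0):
    """Number of tubes required to process the whole supernatant."""
    return math.ceil(supernatant_ml / aliquot_ml)


TUBE_CAPACITY_ML = 50.0
assert total_volume(5.0) <= TUBE_CAPACITY_ML
print(extraction_volumes(5.0), tubes_needed(40.0))
```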
3.8 In-Solution
Trypsin Digestion
1. Dilute samples to 1 M urea using 50 mM ammonium bicarbonate + 1× PhosSTOP.
2. Perform BCA assay to determine protein concentration.
3. Aliquot 5 mg protein to a 15 mL disposable centrifuge tube.
4. Using 500 mM DTT, bring solution to 5 mM DTT.
5. Place at 50 °C for 45 min.
6. Using 500 mM IAA, bring solution to 15 mM IAA.
7. Place in dark at room temperature for 45 min.
8. Add trypsin at a ratio of 1:100 trypsin to protein.
9. Place, shaking, between 4 h and overnight at 37 °C.
10. Remove digested mixture from 37 °C incubation.
11. To arrest digestion, bring solution to 0.3 % formic acid. Check
that pH ≤ 3 using indicator strips (see Note 33).
12. Sample(s) can be frozen and stored at −80 °C or can be immediately concentrated.
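The volumes needed for the digestion steps above can be worked out ahead of time. This is a minimal sketch with hypothetical helper names; it assumes additive volumes and uses an example sample volume of 1,000 μL, which is not a value from the protocol.

```python
def diluent_volumes(start_conc, final_conc):
    """Volumes of diluent to add per volume of sample (e.g., 8 M to 1 M urea)."""
    return start_conc / final_conc - 1.0


def stock_volume_ul(sample_ul, stock_mM, final_mM):
    """Stock volume that brings the mixture to final_mM.

    Accounts for the volume added by the stock itself.
    """
    return sample_ul * final_mM / (stock_mM - final_mM)


def trypsin_ug(protein_mg, protein_per_trypsin=100.0):
    """Mass of trypsin (ug) for a 1:100 trypsin-to-protein ratio."""
    return protein_mg * 1000.0 / protein_per_trypsin


sample_ul = 1000.0  # assumed example volume containing 5 mg protein
dtt_ul = stock_volume_ul(sample_ul, 500.0, 5.0)
iaa_ul = stock_volume_ul(sample_ul + dtt_ul, 500.0, 15.0)
print(f"DTT: {dtt_ul:.1f} uL, IAA: {iaa_ul:.1f} uL, trypsin: {trypsin_ug(5.0):.0f} ug")
```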
3.9 Solid Phase
Extraction/Peptide
Concentration (See
Note 34)
1. Set up 3 cc tC18 Waters’ Sep-Pak in a tube clamp attached to
ring stand, using a 50 mL disposable centrifuge tube for waste
collection.
2. Equilibrate cartridge using 4 mL acetonitrile.
3. Wash column with 4 mL 80 % acetonitrile/0.5 % formic acid.
4. Wash column with 6 mL H2O/0.1 % trifluoroacetic acid.
5. Saving the flow through, load sample onto column and, using
the modified syringes (see Note 11), push through at a rate no
faster than ~1 drop/s.
6. Reload flow through onto column and push through a second
time, no faster than ~1 drop/s.
7. Wash column with 6 mL H2O/0.1 % trifluoroacetic acid.
8. Elute slowly (≤1 drop/s) into a 1.5 mL LoBind microcentrifuge
tube using 1 mL 80 % acetonitrile/0.5 % formic acid.
9. Collect a second elution in a second LoBind microcentrifuge
tube using 500 μL 80 % acetonitrile/0.5 % formic acid and
500 μL acetonitrile.
10. Dry down elutions in vacuum centrifuge until liquid is gone
(see Note 35).
11. Dried pellet/powder can be frozen at −80 °C or immediately
solubilized and enriched for phosphopeptides.
3.10 Titanium
Dioxide-Based
Phosphopeptide
Enrichment
1. Using the 19-gauge machined tubing with internal sliding
wire, punch a single circle of C-8 material from a 3M Empore
extraction disc (see Fig. 6).
2. Using the sliding wire, gently push the material out of the
shaft into the bottom of a GeneMate P200 pipette tip, forming a tight plug.
3. Weigh out titanium dioxide and suspend in 100 μL H2O (see
Note 36).
4. Set up apparatus as pictured (Fig. 3) and pipette suspended
titanium dioxide into tip. Push liquid through, forming a
tight, dry column of material.
5. Resuspend dried pellet/powder from solid phase extraction
(SPE) elutions in 100 μL lactic acid solution (see Note 37).
6. Add 2 μL (1 pmol) phosphorylated angiotensin-II peptide
into resuspended protein solution (see Note 38).
Fig. 6 Materials used to make TiO2 column
7. Wash titanium dioxide column with 60 μL methanol.
8. Pass 100 μL lactic acid solution through column. Repeat a
second time.
9. Load sample onto column, using a 1.5 mL LoBind microcentrifuge tube to collect “flow through” fraction.
10. Wash column twice with 100 μL lactic acid solution, collecting
in the same tube as the flow through.
11. Wash column twice with 100 μL 80 % acetonitrile/0.1 % trifluoroacetic acid. Collect in separate tube as “wash.”
12. Elute peptides from column into a third tube (“elution”) by
washing twice with 50 μL 1.0 % ammonium hydroxide
solution.
13. Perform a second elution into the elution tube with 50 μL
30 % acetonitrile/0.1 % formic acid.
14. Add 3.5 μL neat formic acid directly into eluate to acidify
solution.
15. Using vacuum centrifuge, dry down total volume of elution to
~2–3 μL.
16. Dried phosphopeptide pellet/solution can be stored at −80 °C
or immediately solubilized/diluted for Orbitrap analysis.
3.11 Mass
Spectrometric Analysis
1. Solubilize phosphopeptide solution/pellet in 0.1 % formic acid
(prepared in 18 MΩ ultrapure deionized water or in LC/MS (liquid chromatography/mass spectrometry)-grade water). See Note 39.
2. Analyze using Orbitrap mass spectrometer. For details of our
separation/data collection conditions, see Notes 40 and 41.
Fig. 7 A folder demonstrating a single experimental replicate throughout data processing named “2x1x.” (a)
shows contents prior to Mascot database searching, (b) folder contents prior to Census processing, and (c)
folder contents following both Census and Area Script processing
3.12 Data
Processing: From
Direct Output to
Database Search
(see Fig. 7)
1. Create a folder (in Windows Explorer) named for the experiment/reciprocal treatment performed. Avoid spaces and characters other than letters, numbers, or underscores. This applies
to all directories leading to the newly created folder as well.
2. Copy database .fasta file to folder.
3. Move all .raw files (untouched Orbitrap output) into folder.
4. Using the Trans-Proteomic Pipeline, convert .raw files into
.mzXML files:
(a) Log in to the Trans-Proteomic Pipeline and specify the
analysis pipeline to be used as “Mascot.”
(b) Under the header “mzXML Utils,” navigate to “mzXML”
button.
(c) Specify the files to be converted and click “Convert to
mzXML.”
5. Using Trans-Proteomic Pipeline, convert .mzXML files into
.mgf files:
(a) Log in to the Trans-Proteomic Pipeline and specify the
analysis pipeline to be used as “Mascot.”
Fig. 8 Parameters used in database search. Variable N/Q deamidation, under “Variable modifications,” is not
shown but also selected
(b) Under the header “mzXML Utils,” navigate to “convert
mzXML files.”
(c) Ensure that mascot generic format, “.mgf,” is the output
file format. Do not modify settings.
(d) Specify files to convert and click “Convert files.”
6. Using Mascot Daemon, perform database search using the
processed file(s). For a suggested set of search parameters, see
Note 42 and Fig. 8.
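The folder-naming rule in step 1 can be checked before any files are converted. The sketch below is a convenience that is not part of the published pipeline: it flags every directory name in a Windows path that contains anything other than letters, numbers, or underscores. The example paths are hypothetical.

```python
import re
from pathlib import PureWindowsPath

# Letters, numbers, and underscores only, as required in step 1.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def unsafe_components(path):
    """Return the directory names in a Windows path that break the naming rule.

    The drive/root (e.g., 'C:\\') is skipped; every other component must
    consist solely of letters, digits, or underscores.
    """
    windows_path = PureWindowsPath(path)
    return [
        part
        for part in windows_path.parts
        if part != windows_path.anchor and not _SAFE_NAME.fullmatch(part)
    ]


print(unsafe_components(r"C:\proteomics\2x1x"))   # hypothetical, compliant
print(unsafe_components(r"C:\My Data\run-1"))     # hypothetical, two offenders
```

An empty list means the whole path, including parent directories, satisfies the rule.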
3.13 Data
Processing: From
Database Search
Output to Census
Processing (see Fig. 7)
1. To access completed searches, navigate via Windows
Explorer to Mascot’s “data” folder.
2. The output files (.dat) will have been given arbitrary names by
the software but will all be contained within a folder specifying
the date on which they were run.
3. Accessing the log file contained within Mascot Daemon’s
interface will show which .dat file corresponds to which database search. Copy the .dat files to the folder that now contains the
.raw, .mzXML, and .mgf files. Rename each using the respective file
names, keeping the .dat file extension.
4. Using Trans-Proteomic Pipeline, convert .dat files into
.pep.xml files:
(a) Log in to the Trans-Proteomic Pipeline and specify the
analysis pipeline to be used as “Mascot.”
(b) Under the header “Analysis Pipeline (Mascot),” navigate
to “pepXML.”
(c) Add all .dat files to convert.
(d) Add database used in the Mascot search.
(e) Begin file conversion.
(f) Ensure the .pep.xml output file(s) has identical names to
all previous file types associated with the original .raw file.
5. Create subfolders named for each experimental replicate (.raw
file) within the originally created folder.
6. Move respective .raw, .mzXML, .dat, .mgf, and .pep.xml files
into corresponding subfolders.
7. Copy “config” .xml file, runCensus.bat script, and database
.fasta file into each subfolder individually.
8. Open FDR v5 and set the parameter “Processing Mode” to
either “single file” or “batch mode,” for analyzing one or
more than one .pep.xml file, respectively.
9. Add files to be analyzed and specify the database used in the search.
10. Run FDR v5. For each replicate/subfolder, ensure output files
include a …peptable.tsv file, a …chargesep.tsv file, a …filtered_bycharge.pep.xml file, and a …filtered_bycharge_reformattedmods.pep.xml file. The “…” refers to each file given to
the FDR script for analysis and will reflect the name of the file
submitted. This remains true for each step hereafter.
11. For each of your experimental replicates (each .raw file and
associated downstream files, now all in individual folders), the
following steps must be done.
12. Open the …filtered_bycharge_reformattedmods.pep.xml file
using Microsoft WordPad.
13. This file must be manually modified in the following ways (see
Fig. 9):
(a) In the second line, where the file reads summary_xml=“c:/…,”
delete everything within the quotations
except for the file name and .pep.xml extension.
(b) In the third line, where the file reads msms_run_summary
base_name=“c:/…,” delete the entire path, leaving solely
the name of the file. There should be no file extension
associated with the name.
(c) In the seventh line, where the file reads search_summary
base_name=“c:/…,” delete the entire path, leaving solely
the name of the file and .pep.xml file extension.
(d) Using the find and replace function (Edit -> Replace or
Ctrl + H), find every instance of the word “ionscore” and
replace it with the word “xcorr.” (no period).
Fig. 9 Reformatting of the “…_filtered_bycharge_reformattedmods.pep.xml” file from FDR v5 script. Relevant
portions are in bold. (a) is prior to manual changes, (b) is following changes. Not shown is the change replacing
every instance of “ionscore” with “xcorr.” (no period—Subheading 3.13, step 13d)
Fig. 10 RunCensus.bat, opened with WordPad. Relevant portions are in bold. Line breaks have been inserted
after each input command for easier visualization
14. Save this modified file as …filtered_bycharge_reformattedmods_xcorr.pep.xml to the same folder as the unmodified file.
15. Using Microsoft WordPad, open the runCensus.bat file you
copied into each subfolder. Three input options must be
altered in this script (see Fig. 10):
(a) The first input, a file path, refers to the location of the
“…_filtered_bycharge_reformattedmods_xcorr.pep.xml” file
modified in step 13 and saved in step 14 and should be
changed to reflect the location and specific name of the file.
(b) The second input, a directory, refers to the directory containing information necessary for processing files using
Census. If the preceding directions have been followed,
simply change the directory to the specific subfolder containing the reformatted mods file specified above.
(c) The third input refers to the config file used by Census
during analysis. This file was copied into each subfolder in step 7;
specify the directory that contains it, followed by the filename and extension. The directory should be the same as the one specified in step 15b.
16. Overwrite the copied runCensus.bat file with this newly modified, replicate-specific file.
17. Begin Census analysis by navigating to the specific subfolder
and double-clicking on the runCensus.bat.
18. Ensure that Census analysis yields a census_chro.xml file and a
“…”.tgz file.
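The manual edits in steps 13 and 14 can also be scripted, which may reduce typing errors when many replicates are processed. The following is a minimal Python sketch and is not part of the published workflow. The function names and the example file name are hypothetical. It assumes that the attributes are written with plain double quotes, as is usual in XML, and that the search_summary value already ends in .pep.xml. The output should still be checked against Fig. 9.

```python
import re
from pathlib import PureWindowsPath

EXT = ".pep.xml"


def strip_path(value: str, keep_extension: bool = True) -> str:
    """Return only the file name from a Windows-style path such as c:/dir/name.pep.xml."""
    name = PureWindowsPath(value).name
    if not keep_extension and name.endswith(EXT):
        name = name[: -len(EXT)]
    return name


def _rewrite(pattern: str, text: str, keep_extension: bool) -> str:
    """Replace the first quoted value matched by `pattern` with its bare file name."""
    return re.sub(
        pattern,
        lambda m: m.group(1) + strip_path(m.group(2), keep_extension) + m.group(3),
        text,
        count=1,
    )


def reformat_pepxml(text: str) -> str:
    """Apply the edits of step 13 (a)-(d) to the text of a .pep.xml file."""
    # (a) summary_xml: keep the file name and the .pep.xml extension.
    text = _rewrite(r'(\bsummary_xml=")([^"]*)(")', text, True)
    # (b) msms_run_summary base_name: keep the bare name, without extension.
    text = _rewrite(r'(<msms_run_summary\b[^>]*?\bbase_name=")([^"]*)(")', text, False)
    # (c) search_summary base_name: keep the name plus the .pep.xml extension.
    text = _rewrite(r'(<search_summary\b[^>]*?\bbase_name=")([^"]*)(")', text, True)
    # (d) replace every instance of "ionscore" with "xcorr".
    return text.replace("ionscore", "xcorr")


def xcorr_name(filename: str) -> str:
    """File name used in step 14: insert _xcorr before the .pep.xml extension."""
    if not filename.endswith(EXT):
        raise ValueError("expected a file name ending in .pep.xml")
    return filename[: -len(EXT)] + "_xcorr" + EXT


if __name__ == "__main__":
    # Hypothetical three-line excerpt standing in for a real FDR v5 output file.
    demo = (
        '<msms_pipeline_analysis summary_xml="c:/data/rep1/demo.pep.xml">\n'
        '<msms_run_summary base_name="c:/data/rep1/demo" raw_data=".raw">\n'
        '<search_score name="ionscore" value="42"/>\n'
    )
    print(reformat_pepxml(demo))
```

To process a real file, read it as text, pass the contents to reformat_pepxml, and write the result under the name returned by xcorr_name, leaving the unmodified file in place because it is needed again in Subheading 3.14.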
3.14 Post-census Processing
1. Open Census software.
2. Open census_chro.xml file using Census’ interface.
3. Export the report as an editable file with File → Export Report.
Ensure “No Filters” is checked. Click Export.
4. Verify that census-out.txt and census-out_singleton.txt files
now exist in the subfolder.
5. Open in-house-developed “Census 1.72 Area Script (TAIR).”
6. Using the interface, specify the “census-out.txt” as the
“Census Outfile,” “census-out_singleton.txt” as the “Census
Singleton” file, “…_filtered_bycharge_reformattedmods.pep.
xml” (prior to manual modification) as the “reformatted mods
pep xml file,” and the .fasta database as the database file.
7. Click run.
8. Ensure the software yields a “histogram_all_peptides.gif,”
“histograms_unique_peptides.gif,”
“peptideSummary.tsv,”
and a “peptideSummary_withScores.tsv.” See Note 43.
9. Open and visualize “peptideSummary_withScores.tsv” using
Microsoft Excel or other appropriate spreadsheet software.
4 Notes
1. Plants can be grown in either 250 mL Erlenmeyer flasks
(capped with aluminum foil) or GA7 Magenta boxes—there
are pros and cons to both. Erlenmeyer flasks have a lower incidence of contamination, but during removal, the plant must
be compressed through the top of the flask, potentially inducing mechanosensitive responses.
2. A modified MS solution can be made from scratch [13, 14],
eliminating the need for the 10× Micronutrient Solution;
however, using the 10× solution potentially reduces both variability and preparation time. The modified MS salt solution
contains, per liter of water: 6.2 mg boric acid, 166.1 mg
CaCl2, 0.025 mg CoCl2·6H2O, 0.025 mg cupric sulfate·5H2O,
37.26 mg disodium ethylenediaminetetraacetic acid, 27.8 mg
ferrous sulfate·7H2O, 90.35 mg MgSO4, 16.9 mg MnSO4,
0.25 mg Na2MoO4, 0.83 mg KI, 85 mg KH2PO4 (monobasic),
8.6 mg ZnSO4·7H2O, 0.825 g NH4NO3, 0.96 g KNO3, 0.5 g
MES, and 10 g d-sucrose. If using the modified MS solution
routinely, making stock solutions of the components increases
consistency and efficiency.
3. Though specified as 18 MΩ ultrapure deionized H2O throughout Subheading 2, it is simply referred to as H2O in
Subheading 3.
4. Both seed sterilization methods are used routinely with comparable results. The liquid method can be done quickly and
without use of a fume hood, whereas the vapor method,
though less hands-on, requires a fume hood and longer sterilization time.
5. There are two methods for drying plant material prior to freezing. Removing excess water and media is necessary prior to
freezing and grinding: both will freeze into ice that not only
makes efficient homogenization nearly impossible but can also confound accurate weighing of the sample later in the pipeline. One
method is a quick spin in a kitchen salad spinner. Place the plant
material in spinner and spin quickly for roughly 3 s, allowing the
spinner to continue for ~5 s. Stop the spinner and immediately
freeze plant. The second method is to gently blot the plant
material dry on paper towels. Place plant material on several layers of paper towels and cover with one layer. Gently blot, move
the plant mass to a dry spot on the towel, and repeat. Both of
these methods potentially induce mechanosensitive responses;
however, no better methods have been described at the time of
this publication. The most important aspect is to handle all samples that will be compared in similar fashions.
6. Equal amounts of α- and β-casein should be mixed to attain this concentration. Casein is added at this step as an experimental control
for every step hereafter and should be observed in every sample
analysis post-processing. Casein phosphopeptides have been
added to the provided database (see Note 15) for searching.
7. The recipe used to make 100 mM vanadate:
(a) Make 200 mM solution using sodium orthovanadate in
1 M Tris–HCl.
(b) Boil until colorless.
(c) Mix 1:1 with 10 mM H2O2.
(d) Bring solution to pH 9.5.
(e) Store at −20 °C.
8. The sonicator used in this laboratory is a Heat Systems-Ultrasonics, Inc. Sonicator/Cell Disruptor, Model W-375. It
is used at 50 % duty cycle.
9. The Sep-Pak cartridge described here has a peptide capacity of
5 mg. The procedure(s) described in this chapter can be scaled
down in reasonable fashion. For example, Waters sells Sep-Pak
cartridges labeled “1 cc,” which have a peptide capacity of 1 mg.
10. The acid used will vary based on individual laboratory procedures. Formic acid is used in this case—this is because when
running liquid chromatography/mass spectrometry (LC/
MS), the buffers used contain formic acid. This can be substituted with other acids, such as acetic acid. Importantly, the
acid-containing solution with which peptides are eluted and in
which the phospho-enriched pellet is solubilized should be
made using an acid consistent with that used in LC buffers.
11. In order to modify Luer-Lok tip syringes to fit into the specified adapters (which in turn fit into Waters’ Sep-Pak cartridges), a handheld Dremel tool with a cutting bit is used.
Carefully, the threaded portion of the tip is cut off, leaving
solely the ~40 mm internal tip (see Fig. 2). The adapter fits
onto the exposed internal tip.
12. GeneMate brand tips must be used—they taper at the tip and
are thus the only brand that can be plugged sufficiently with
the C-18 disc.
13. The apparatus used for TiO2 enrichment was constructed as
follows (see Fig. 3): The two “legs” are made of three blue,
stacked, 96-well PCR (polymerase chain reaction) tube racks,
and the “bridge” portion is an orange, 96-well PCR tube rack.
At the four corners of the bridge, a 1 mL syringe plunger and
a P200 tip are used to hold the bridge to the legs. Generally,
any apparatus that allows space for a test tube rack and 1.7 mL
LoBind microcentrifuge tubes, while holding up under a
decent amount of pressure, is sufficient.
14. The mass spectrometer used in this protocol is a Thermo LTQ
Orbitrap XL. The LC system is composed of all Agilent 1100
hardware. It contains a Nanopump, an isocratic pump, a column-switching valve, a micro-well plate autosampler (held at 4 °C),
and a degasser. The trapping column is 5 mm × 300 μm inner
diameter, packed with Agilent stable bond C18. The analytical
column used is 360 μm × 75 μm (outer × inner diameter) and is
pulled and packed in-house using a Sutter P-2000 laser puller
and pressure bombs with Magic 200 Å C18 material, respectively. The analytical column is packed to between 10 and 15 cm.
15. All in-house-built scripts and configured database can be
found on the laboratory’s website for download. Additionally,
links are provided to all the relevant pieces of software (http://
www.biotech.wisc.edu/sussmanlab/home).
16. Mascot Daemon requires any version of Windows 2000 or
newer.
17. In-house-built scripts were coded in Perl. In order to run Perl
scripts on a Windows-based system, ActivePerl must be
downloaded.
18. Census is built in Java and thus operating system independent;
however, it requires Java V6 or newer.
19. 1 L of liquid medium will provide for approximately 13
Magenta boxes or flasks. Because fungal and/or
bacterial contamination occurs, it is recommended to start
approximately 25 % more boxes/flasks than are needed to
obtain the desired number of replicates.
20. A small seed scoop can be made by melting a 0.2 mL PCR
tube to a metal E. coli loop. Trim the tube to just above the
loop with a razor blade. Scoop volumes will vary; take several
measurements to find the average amount of seeds held. Adjust
number of scoops added per box/flask accordingly to achieve
approximately 12 mg/container.
21. Suspend the seeds fully in ethanol. The suspension can then be
spread evenly across sterile filter paper for quick and efficient
drying.
22. The sharpie is used as a crude metric for the effectiveness of
the fumes used for sterilization—within roughly 3 h, the
sharpie should begin to fade due to the corrosive nature of
the fumes produced. Overnight (15–16 h), fine and ultrafine
sharpie markers will fade almost entirely, and thicker sharpies
will fade ~50–60 %. Though crude, it has pointed to ineffectiveness of the procedure in the past due to poor desiccator sealing.
23. Take care to pour seeds directly into media to avoid getting
seeds stuck on the wall of the box/flask. Handle the boxes/
flasks gently also to avoid the seeds adhering to the sides.
Seeds are very difficult to return to the media once adhered.
Some will adhere anyway, usually, while moving cubes and
while shaking during growth.
24. Orbital shaker is set up with fluorescent light fixtures placed
approximately 12 in. above shaking platform. A fan is used to
circulate air and counteract potential heating from light (see
Fig. 4).
25. Each experiment contains two treatment sets: set one,
containing 14N-treated and 15N-control flasks, and set two,
containing 14N-control and 15N-treated flasks. See Fig. 1 [7].
26. Prechill mortar and pestle with liquid nitrogen immediately
prior to adding and grinding tissue.
27. The easiest way to do this is to keep the previously collected
50 mL disposable centrifuge tubes on dry ice, quickly weigh
out material and combine in a 15 mL disposable centrifuge
tube, also on dry ice. As the material begins to thaw, it becomes
much harder to work with, sticking to any surface due to
moisture.
28. It is important to perform this as quickly and coldly as possible.
Use a cold room if available.
29. Because polycarbonate Oak Ridge tubes have relatively small
openings, it is recommended to filter into a 50 mL disposable
centrifuge tube (which is much easier) and then pour the filtrate
into an Oak Ridge tube.
30. The scale of the methanol/chloroform extraction is limited
solely by each laboratory’s capacity for growing and processing tissue. It can be scaled up or down while maintaining the
described ratios, provided the capacity exists for centrifugation at room temperature and appropriately sized polypropylene tubes (for use with chloroform) can be acquired.
31. The phases will be clearly separated. Protein will appear as a
white layer between the clear upper aqueous phase and the
green lower organic phase. Some protein can be lost; however,
it is best to err on the side of leaving part of the aqueous
phase, rather than removing some of the protein layer.
32. The wash step involving acetone can be done in duplicate or
triplicate if desired.
33. Be sure to use glass syringes with neat formic acid. If, after
addition of formic acid, pH > 3, add up to 0.5 % formic acid
and retest pH. A pH ≤ 3 is necessary to arrest digestion.
34. Due to the nature of the setup described in this protocol, variability will exist in the amounts of pressure manually applied to
push solution/peptides through column.
35. The protein may pellet as either a white powder or a gelatinous pellet, potentially lightly brown or yellow. The powder is
much more easily solubilized than the pellet, and both should
be vortexed to resuspend fully. This variability in appearance
has had no observable effect on further processing and
analysis.
36. The amount of resin to use will vary based on amount of
protein chosen to digest. For a 5 mg digestion (the maximum
capacity of Waters’ 3CC tC18 Sep-Pak columns), it is advised
to use less than 5 mg TiO2 resin. Phosphorylated peptides
exist at such low abundance that not much resin is needed to
capture a sufficient percentage for analysis; additionally, as the
amount of resin is increased, the capacity for unphosphorylated peptides to bind is increased as well.
37. The ammonium hydroxide solution should be made fresh
on each day that TiO2 enrichment is performed. It is suggested that
50 μL of lactic acid solution be used to solubilize the first
elution and then transferred to the second elution, where a
further 50 μL of lactic acid solution is added and vortexed.
38. The phosphorylated angiotensin-II peptide is used as a
control for the TiO2 enrichment. If the appropriately modified
database was downloaded from the website given, all
experimental controls have been added.
39. The amount that the pellet should be diluted is variable. Note
that very little peptide is returned from an enriched sample, so
as little 0.1 % formic acid as possible should be used. Generally,
4–6 μL of solubilized pellet are injected onto LC column for
analysis, and ideally, three injections of the same sample should
be done to assay instrument reproducibility, ensure maximum
phosphopeptide identification, and provide statistics on
observed ratio measurements for each reciprocal experiment.
40. The following are a recommended set of buffers, LC gradient/flow conditions, and data collection methods. Buffer A
consists of Fisher 0.1 % formic acid in water. Buffer B consists
of 95 % acetonitrile/5 % Fisher 0.1 % formic acid in water.
Buffer for isocratic pump consists of 1 % acetonitrile/0.1 %
formic acid in water. Samples are loaded onto trapping column for 20 min at 15 μL/min using the isocratic pump, with
the Nanopump flowing 1.0 % B at 200 nL/min onto the analytical column. Column switching then occurs, and while the
isocratic pump flows under the same conditions directly to
waste, the Nanopump flows through the trapping column and
onto the analytical column at 200 nL/min from 1.0 to 40.0 %
B over 195 min, from 40.0 to 60.0 % B over 5 min, and then
from 60.0 to 100.0 % B over 3 min, where it flows at 100.0 %
B for 2 min. Following this, the Nanopump flows 100.0–1.0 %
B over 1 min, after which it flows 1.0 % B for 15 min.
41. A total of 240 min analysis time is used per sample, in conjunction with the LC system. MS scans are taken using a
resolving power of 100,000, and FTMS preview mode is
enabled. The top five ions, excluding a charge state of 1 and
unassigned charge states, are selected for MS/MS. Dynamic
exclusion is used for 40 s with a repeat count of 1 and a list
size/capacity of 500. Precursor ions are fragmented via collision-induced dissociation using a normalized collision
energy of 35.0, activation Q and time of 0.25 and 30 ms,
respectively, and an isolation width of 2.5.
42. The search conditions (Mascot Daemon v2.2) are as follows.
Using the provided database and Trypsin, the “AUTO”
option under “top hits” is used, allowing one missed cleavage
and a peptide tolerance of ±30 ppm. The monoisotopic peaks
are used, as well as a 13C count of 2. 2+ and 3+ peptides are
specified. Phosphorylated S/T/Y and deamidated N/Q are
set as variable modifications, and carbamidomethylation of
cysteine is set as a fixed modification. For MS/MS ion search
conditions, a tolerance of ±0.6 Da is set.
43. The most useful file is the peptide summary with scores.
Examining the histograms is useful as well; they should center
around 1, with the important changes falling towards the outskirts of what should be an approximately Gaussian distribution. If mixing of the 14N/15N tissue was skewed from a perfect
1:1 ratio, this can be reflected by a skewed average ratio.
Normalization is done on a per replicate basis prior to combining replicates and producing a larger dataset, based on the
histogram after examination.
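The Nanopump program in Note 40 is easier to check when laid out as a timetable. The following Python sketch tabulates the segments exactly as listed in that note and interpolates the programmed % B at a given time. The names SEGMENTS, timetable, and percent_b_at are illustrative only, and linear ramps between the stated endpoints are an assumption. As listed, the segments sum to 241 min, which is close to the 240 min of analysis time given in Note 41.

```python
# Each segment: (duration in min, starting % B, ending % B, description).
# Values are taken from Note 40; the first segment is the 20 min trap loading,
# during which the Nanopump flows 1.0 % B onto the analytical column.
SEGMENTS = [
    (20, 1.0, 1.0, "load sample onto trapping column"),
    (195, 1.0, 40.0, "main gradient"),
    (5, 40.0, 60.0, "steeper gradient"),
    (3, 60.0, 100.0, "ramp to wash"),
    (2, 100.0, 100.0, "hold at 100 % B"),
    (1, 100.0, 1.0, "return to starting conditions"),
    (15, 1.0, 1.0, "re-equilibrate"),
]


def timetable(segments=SEGMENTS):
    """Return rows of (start_min, end_min, start_%B, end_%B, description)."""
    rows, elapsed = [], 0.0
    for duration, b_start, b_end, label in segments:
        rows.append((elapsed, elapsed + duration, b_start, b_end, label))
        elapsed += duration
    return rows


def total_minutes(segments=SEGMENTS):
    """Total programmed time in minutes."""
    return sum(duration for duration, _, _, _ in segments)


def percent_b_at(time_min, segments=SEGMENTS):
    """Programmed % B at a given time, assuming linear ramps within a segment."""
    for start, end, b_start, b_end, _ in timetable(segments):
        if start <= time_min <= end:
            fraction = (time_min - start) / (end - start)
            return b_start + fraction * (b_end - b_start)
    raise ValueError("time is outside the programmed method")


if __name__ == "__main__":
    for start, end, b_start, b_end, label in timetable():
        print(f"{start:6.1f}-{end:6.1f} min  {b_start:5.1f} -> {b_end:5.1f} % B  {label}")
    print("total:", total_minutes(), "min")
```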
Acknowledgements
The authors would like to acknowledge Greg Barrett-Wilt and
Kelli Kline for work involved with development and implementation of the methods described in this chapter, as well as the
University of Wisconsin-Madison Biotechnology Center Mass
Spectrometry/Proteomics facility for instrument time, various
reagents, lab space, and advice throughout this process.
References
1. Yates JR, Ruse CI, Nakorchevsky A (2009)
Proteomics by mass spectrometry: approaches,
advances, and applications. Annu Rev Biomed
Eng 11:49–79
2. Schreiber TB, Mausbacher N, Breitkopf SB,
Grundner-Culemann K, Daub H (2008)
Quantitative phosphoproteomics—an emerging key technology in signal-transduction
research. Proteomics 8:4416–4432
3. Gouw JW, Krijgsveld J, Heck AJ (2010)
Quantitative proteomics by metabolic labeling of
model organisms. Mol Cell Proteomics 9:11–24
4. Kline KG, Sussman MR (2010) Protein quantitation using isotope-assisted mass spectrometry. Annu Rev Biophys 39:291–308
5. Ong SE, Blagoev B, Kratchmarova I,
Kristensen DB, Steen H, Pandey A, Mann M
(2002) Stable isotope labeling by amino acids
in cell culture, SILAC, as a simple and accurate
approach to expression proteomics. Mol Cell
Proteomics 1:376–386
6. Huttlin EL, Hegeman AD, Harms AC,
Sussman MR (2007) Comparison of full versus
partial metabolic labeling for quantitative proteomics analysis in Arabidopsis thaliana. Mol
Cell Proteomics 6:860–881
7. Kline KG, Barrett-Wilt GA, Sussman MR (2010)
In planta changes in protein phosphorylation
induced by the plant hormone abscisic acid. Proc
Natl Acad Sci U S A 107:15986–15991
8. Dunn JD, Reid GE, Bruening ML (2010)
Techniques for phosphopeptide enrichment
prior to analysis by mass spectrometry. Mass
Spectrom Rev 29:29–54
9. Vogel HJ (1989) Phosphorus-31 nuclear magnetic resonance of phosphoproteins. Methods
Enzymol 177:263–282
10. Koizumi Y, Taya M (2002) Kinetic evaluation of biocidal activity of titanium dioxide
against phage MS2 considering interaction
between the phage and photocatalyst particles.
Biochem Eng J 12:107–116
11. Keller A, Eng J, Zhang N, Li XJ, Aebersold R
(2005) A uniform proteomics MS/MS analysis platform utilizing open XML file formats.
Mol Syst Biol 1(2005):0017
12. Park SK, Venable JD, Xu T, Yates JR III
(2008) A quantitative analysis software tool for
mass spectrometry-based proteomics. Nat
Methods 5:319–322
13. Nelson CJ, Huttlin EL, Hegeman AD, Harms
AC, Sussman MR (2007) Implications of
15N-metabolic labeling for automated peptide
identification in Arabidopsis thaliana.
Proteomics 7:1279–1292
14. Hegeman AD, Schulte CF, Cui Q, Lewis IA,
Huttlin EL, Eghbalnia H, Harms AC, Ulrich
EL, Markley JL, Sussman MR (2007) Stable
isotope assisted assignment of elemental
compositions for metabolomics. Anal Chem
79:6912–6921
Chapter 20
Gene Expression Profiling Using DNA Microarrays
Kyonoshin Maruyama, Kazuko Yamaguchi-Shinozaki,
and Kazuo Shinozaki
Abstract
In Arabidopsis research, microarrays have typically been employed for the measurement of gene expression
under different conditions. Microarray analysis is often used to analyze the effects of the expression of
wild-type genes (control) versus mutants, the effects of varying environmental conditions, and the effects
of hormones. In addition, microarray analysis is used to analyze differences in gene expression between
growth stages and tissues. Other array applications include comparative genomic hybridization, chromatin
immunoprecipitation, mutation detection, and genotyping. This chapter focuses on gene expression profiling, which is typically performed by the competitive hybridization of two samples, each labeled with a
fluorescent dye such as cyanine 3-CTP or cyanine 5-CTP. We describe the steps, from RNA purification to
data analysis, that are involved in obtaining data from DNA microarrays.
Key words RNA purification, DNA microarray, Expression profiling, Microarray data analysis
1 Introduction
DNA microarray technology is a powerful research tool that enables
global measurement of the differences between paired nucleic acid
samples. Nearly two decades have passed since the first microarrays
were created, and various applications, including gene expression
profiling, CGH, SNP, ChIP-on-chip, and DNA methylation, have
been developed. This chapter focuses on gene expression profiling,
which may be considered as a five-step process: (1) RNA purification, (2) labeling of the samples, (3) hybridization and washing of
the slides, (4) signal detection, and (5) data analysis.
The RNA purification protocol we describe here is valid for
Arabidopsis, in addition to soybean and rice plants. It is extremely
important for the purification of the total RNA that the plant materials be kept frozen during the grinding process by repeatedly adding excess liquid N2. The low temperature is needed to inactivate
the cellular RNases. RNAiso Plus and TRIzol Reagent are ready-touse, monophasic solutions of phenol and guanidine isothiocyanate
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_20, © Springer Science+Business Media New York 2014
that are suitable for the purification of total RNA [1–5]. Moreover,
RNA sample quantitation is an essential step in microarray analyses,
as it is necessary to use intact total RNA to obtain reliable results.
We recommend using the Agilent 2100 Bioanalyzer to determine
the quality of the total RNA. This bioanalyzer, with its RNA kit, is
the industry standard for RNA quality control.
The number of companies that produce microarray platforms,
including Affymetrix, Agilent Technologies, Illumina, Applied
Biosystems, and GE Healthcare, and the variety of protocols available to researchers have increased in recent years. DNA microarray analysis typically uses either a one-color or two-color platform
to measure the transcription products. Microarrays are currently
affordable and have acceptable reproducibility and accuracy for
many applications. The MicroArray Quality Control (MAQC) project demonstrated that six representative microarray platforms provided high reproducibility, and the data quality was essentially
equivalent between the one- and two-color approaches [6, 7]. In
this chapter, the Agilent Technologies’ platform is recommended
for gene expression profiling. This platform is the most sensitive,
and the results generated are highly reproducible [8]. Agilent’s
Low Input Quick Amp Labeling Kit generates fluorescent complementary RNA (cRNA) using a sample containing between 10 and
200 ng of total RNA. This method uses T7 RNA polymerase, which
simultaneously amplifies the target material and incorporates cyanine 3- or cyanine 5-labeled CTP. Using this kit, the amplification
is typically at least 100-fold from the total RNA to cRNA.
Because there is no standard method for microarray data analysis, the data analysis step is the most important and difficult.
Indeed, many articles regarding analytical methods for microarray
data have been published [5, 9–14], and depositing microarray
data and statistical analyses have become conditions for publication
in most journals. Nonetheless, it is difficult to choose the appropriate statistical methods for microarray data analyses, as the choice often
depends on the design of the microarray experiment. In some cases, the
GeneSpring software is recommended. This software can be used
even by biologists with limited experience in microarray analysis.
2 Materials
2.1 RNA Analysis
1. Latex gloves.
2. Mortar and pestle (Grinding equipment).
3. Spatula.
4. 2 ml Eppendorf tubes.
5. Vortex mixer.
6. Microtube mixer.
7. High-speed microcentrifuge.
8. Centrifuge desiccator.
9. NanoDrop ND-1000 UV–VIS Spectrophotometer (Thermo
Fisher Scientific Inc.).
10. Liquid N2.
11. Ultrapure water.
12. 75 % (v/v) ethanol.
13. 99.5 % (v/v) ethanol.
14. Isopropanol.
15. RNAiso Plus (Takara) or TRIzol Reagent (Invitrogen).
16. 3 M sodium acetate, pH 5.2.
17. High-salt buffer (0.8 M sodium citrate and 1.2 M sodium
chloride).
18. RNA 6000 Nano Kit (Agilent Technologies).
19. Agilent 2100 Bioanalyzer (Agilent Technologies).
20. IKA vortex mixer (Agilent Technologies).
2.2 Microarray Analysis
1. Low Input Quick Amp Labeling Kit, Two-Color (Agilent
Technologies).
2. RNA Spike-In Kit, Two-Color (Agilent Technologies).
3. Gene Expression Hybridization Kit (Agilent Technologies).
4. Gene Expression Wash Buffer Kit (Agilent Technologies).
5. DNase/RNase-free distilled water (Agilent Technologies).
6. RNeasy Mini Kits (Qiagen).
7. 99.5 % (v/v) ethanol.
8. Microarray Scanner (Agilent Technologies).
9. Hybridization Chamber, stainless (Agilent Technologies).
10. Hybridization Chamber gasket slides (Agilent Technologies).
11. Hybridization oven (Agilent Technologies).
12. Hybridization oven rotator (Agilent Technologies).
13. Nuclease-free 1.5 ml tubes.
14. Magnetic stir bar (×2).
15. Microcentrifuge.
16. NanoDrop ND-1000 UV–VIS Spectrophotometer (Thermo
Fisher Scientific Inc.).
17. Slide-staining dish, with slide rack (×3).
18. Thermal cycler.
19. Clean forceps.
20. Powder-free gloves.
21. Vortex mixer.
3 Methods
3.1 RNA Purification
3.1.1 Purification of Total RNA
1. Harvest the plants, and place in liquid N2 as soon as possible
(within 10 s).
2. Transfer the frozen plants (150–300 mg) to a mortar containing liquid N2, and grind to a very fine powder using the pestle.
The plants should be kept frozen during the grinding by adding liquid N2 (see Note 1).
3. Transfer the powdered material (~100 mg) to a precooled (in
liquid N2) 2 ml Eppendorf tube using a precooled spatula, and
place each tube in liquid N2.
4. When N2 evaporates, add 1 ml RNAiso Plus (or TRIzol
Reagent) to each tube, and mix well using a microtube mixer
for 5–10 min (see Note 2).
5. Centrifuge at 12,000 × g for 15 min at 4 °C, and transfer 800 μl
of the supernatant to a new tube (see Note 3).
6. Add chloroform (200–400 μl) to each sample, and mix well
using a microtube mixer for 5 min at room temperature.
7. Centrifuge at 12,000 × g for 10 min at 4 °C, and transfer 400 μl
of the upper layer to a new tube (see Note 4).
8. Add 250 μl of high-salt buffer and 250 μl of isopropanol to
each sample tube, and mix well using a microtube mixer for
5 min at room temperature.
9. Centrifuge at 12,000 × g for 10 min at 4 °C, and after the careful removal of the supernatant, dissolve the pellet in 100 μl of
ultrapure water.
10. Add 10 μl of sodium acetate and 250 μl of 99.5 % (v/v) ethanol to each sample tube, and mix using a microtube mixer for
60 s at room temperature.
11. Centrifuge at 12,000 × g for 10 min at 4 °C, and after removing the supernatant, add 400 μl of 75 % ethanol to each
sample.
12. Centrifuge at 12,000 × g for 10 min at 4 °C, and discard the
supernatant, retaining the RNA pellet.
13. Dry the RNA pellet using the centrifuge desiccator and resuspend it in 30 μl of ultrapure water.
14. Quantitate the total RNA using the NanoDrop 1000
Spectrophotometer, and prepare a solution of 200 ng/μl of
total RNA (see Note 5).
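The dilution in step 14 follows from C1 × V1 = C2 × V2. The following Python sketch computes how much RNA stock and ultrapure water to combine for a chosen final volume at 200 ng/μl. The function name and the example values are hypothetical; only the 200 ng/μl target comes from the protocol.

```python
def dilution_volumes(stock_ng_per_ul, final_ul, target_ng_per_ul=200.0):
    """Return (RNA stock volume, water volume) in microliters.

    Solves C1 * V1 = C2 * V2 for V1, where C1 is the measured stock
    concentration, C2 the target concentration, and V2 the final volume.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    rna_ul = target_ng_per_ul * final_ul / stock_ng_per_ul
    return rna_ul, final_ul - rna_ul


if __name__ == "__main__":
    # Hypothetical reading: 800 ng/ul measured, 10 ul of diluted RNA wanted.
    rna, water = dilution_volumes(800.0, 10.0)
    print(f"RNA stock: {rna:.2f} ul, ultrapure water: {water:.2f} ul")
```

A stock that reads below 200 ng/μl cannot be brought to the target by dilution, so the sketch raises an error in that case rather than returning a negative water volume.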
3.1.2 Quality Control of Total RNA
1. Prepare 550 μl of RNA 6000 Nano gel matrix in a spin filter,
and centrifuge the matrix at 1,500 × g for 10 min at room
temperature.
2. Transfer 65 μl of the filtered gel into 0.5 ml RNase-free
microfuge tubes, add 1 μl of RNA 6000 Nano dye solution,
and mix well using a vortex mixer. Then centrifuge at 12,000 × g
for 10 min at room temperature. Before use, allow the RNA
6000 Nano dye solution to equilibrate to room temperature
for 30 min, and mix well using a vortex mixer.
3. Prepare a new RNA 6000 Nano chip on the chip-priming station, and load 9.0 μl of the gel-dye mixture into the well
marked G. Make sure that the plunger is positioned at 1 ml,
and then close the chip-priming station. Press the plunger until
it is held by the clip. Wait for exactly 30 s, and then release the
clip. After another 10 s, slowly pull back the plunger to the
1 ml position, and open the chip-priming station.
4. Transfer 9.0 μl of the gel-dye mix into the wells marked G.
Load 5 μl of the RNA 6000 Nano marker into each of the 12
sample wells and into the well marked “ladder.” Load 1 μl of the
prepared ladder into the well marked “ladder.” Load 1 μl of the
RNA sample into each of the 12 sample wells. Transfer 1 μl of
the RNA 6000 Nano Marker into each of the unused sample
wells.
5. Place the chip horizontally in the adapter for the IKA vortex
mixer, and mix well for 60 s at 14.5 × g.
6. Process the chip in the Agilent 2100 Bioanalyzer within 5 min.
3.2 Preparation of Labeled Samples
3.2.1 Preparation of Cyanine 3-CTP or Cyanine 5-CTP Labeling Reactions
1. Place 200 ng/1.5 μl of diluted total RNA, 2 μl of final diluted
Spike mixture and 1.8 μl of diluted T7 promoter primer mixture in a 0.2 ml microcentrifuge tube. Each tube should now
contain a total volume of 5.3 μl (see Notes 6 and 7).
2. Incubate the reactions in a thermal cycler for 10 min at 65 °C
to denature the primer and the RNA sample.
3. Place the reactions on ice, and incubate them for 5 min; centrifuge each sample briefly to collect the content at the bottom
of the tubes.
4. Add 4.7 μl of cDNA master mixture to each sample tube, and
mix by pipetting up and down; incubate the reactions at 40 °C
in a thermal cycler for 2 h. Each tube should now contain a
total volume of 10 μl (see Note 8).
5. Incubate the reactions in a thermal cycler for 15 min at 70 °C
to inactivate the AffinityScript enzyme. Place the reactions on
ice, and incubate for 5 min; centrifuge each sample briefly to
collect the content at the bottom of the tubes.
6. Add 6 μl of transcription master mixture to each sample tube.
Gently mix by pipetting, and incubate the samples in a thermal
cycler for 2 h at 40 °C. Each tube should now contain a total
volume of 16 μl (see Note 9).
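The running volumes quoted in steps 1, 4, and 6 can be checked with a simple ledger, which is useful when the reaction is scaled or a component volume is changed. The following Python sketch uses only the volumes stated in this subheading; the names ADDITIONS and running_totals are illustrative.

```python
# Volumes (in microliters) added per tube in steps 1, 4, and 6 above.
ADDITIONS = [
    ("diluted total RNA (200 ng)", 1.5),
    ("diluted Spike mixture", 2.0),
    ("diluted T7 promoter primer mixture", 1.8),
    ("cDNA master mixture", 4.7),
    ("transcription master mixture", 6.0),
]


def running_totals(additions):
    """Return the cumulative tube volume after each addition, rounded to 0.1 ul."""
    totals, total = [], 0.0
    for _, volume in additions:
        total += volume
        totals.append(round(total, 1))
    return totals


if __name__ == "__main__":
    for (name, volume), total in zip(ADDITIONS, running_totals(ADDITIONS)):
        print(f"{name:36s} +{volume:4.1f} ul -> {total:5.1f} ul")
```

The totals after the third, fourth, and fifth additions correspond to the 5.3 μl, 10 μl, and 16 μl checkpoints given in the steps.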
3.2.2 Purification of Labeled/Amplified cRNA
1. Transfer the cRNA sample to a 1.5 ml tube, and add 84 μl of
nuclease-free water for a total volume of 100 μl.
2. Add 350 μl of Buffer RLT and 250 μl of 99.5 % ethanol, and
mix by pipetting up and down. Centrifuge each sample briefly
to collect the content at the bottom of the tubes. Each tube
should now contain a total volume of 700 μl.
3. Transfer the 700 μl of the cRNA sample to an RNeasy mini
column in a 2 ml collection tube. Centrifuge the sample at
12,000 × g for 60 s at 4 °C. Discard the flow-through and collection tube.
4. Transfer the RNeasy column to a new collection tube, and add
500 μl of buffer RPE to the column. Centrifuge the sample at
12,000 × g for 60 s at 4 °C. Discard the flow-through. Re-use
the collection tube.
5. Add another 500 μl of buffer RPE to the column. Centrifuge
the sample at 12,000 × g for 60 s at 4 °C. Discard the flow-through and the collection tube.
6. If any buffer RPE remains on or near the rim of the column,
transfer the RNeasy column to a new 1.5 ml collection tube,
and centrifuge the sample at 12,000 × g for 60 s at 4 °C to
remove any remaining traces of the buffer RPE. Discard this
collection tube, and use a fresh tube to elute the clean cRNA
sample.
7. Elute the clean cRNA sample by transferring the RNeasy column to a new 1.5 ml collection tube. Add 30 μl of RNase-free
water directly to the RNeasy filter membrane. Wait 60 s, and
then centrifuge at 12,000 × g for 60 s at 4 °C.
8. Maintain the flow-through, which contains the cRNA, on ice.
9. Quantitate the labeled/amplified cRNA using the NanoDrop
1000 Spectrophotometer (see Note 5).
10. Determine the yield of each labeled/amplified cRNA. Use the
concentration of the cRNA (ng/μl) to determine the cRNA
yield (in micrograms) as follows: (concentration of
cRNA) × 30 μl (elution volume)/1,000 = μg of cRNA.
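The yield calculation is a unit conversion: a concentration in ng/μl multiplied by a volume in μl gives nanograms, and dividing by 1,000 converts nanograms to micrograms. A minimal sketch follows; the function name and the example concentration are hypothetical.

```python
def crna_yield_ug(concentration_ng_per_ul, elution_volume_ul=30.0):
    """Convert a NanoDrop concentration (ng/ul) into total cRNA yield (ug).

    ng/ul * ul = ng, and 1,000 ng = 1 ug.
    """
    return concentration_ng_per_ul * elution_volume_ul / 1000.0


if __name__ == "__main__":
    # Hypothetical reading of 250 ng/ul in the 30 ul eluate.
    print(crna_yield_ug(250.0), "ug of cRNA")
```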
3.3 Hybridization
and Washing
of the Slides
3.3.1 Hybridization
1. Combine 825 ng of cyanine 3-labeled linearly amplified cRNA,
825 ng of cyanine 5-labeled linearly amplified cRNA, 11 μl of
diluted 10× Blocking Agent, and 2.2 μl of 25× Fragmentation
Buffer in a total reaction volume of 55 μl, and mix gently by
pipetting.
2. Incubate the reaction mixtures at 60 °C for exactly 30 min to
fragment the RNA.
3. Place the reactions on ice, and incubate for 60 s.
4. Add 55 μl of 2× GEx Hybridization Buffer HI-RPM to stop
the fragmentation reaction, and mix well by careful pipetting.
Take care to avoid introducing bubbles. Do not mix using a
vortex mixer. Centrifuge at 12,000 × g for 60 s at room temperature to collect the contents at the bottom of the tube.
5. Use immediately. Do not store. Place the sample on ice, and
load onto the array as soon as possible.
6. Load a clean gasket slide onto the Agilent SureHyb chamber
base with the label facing up and aligned with the rectangular
section of the chamber base. Ensure that the gasket slide is
flush with the chamber base and is not ajar.
7. Slowly dispense 100 μl of hybridization sample onto the gasket
well in a “drag and dispense” manner.
8. Slowly place an array “active side” down onto the SureHyb
gasket slide, so that the “Agilent”-labeled barcode is facing
down and the numeric barcode is facing up. Make sure the
sandwich-pair is properly aligned.
9. Place the SureHyb chamber cover onto the sandwiched slides,
and slide the clamp assembly onto both pieces.
10. Hand-tighten the clamp onto the chamber.
11. Vertically rotate the assembled chamber to wet the gasket, and
assess the mobility of the bubbles. If necessary, tap the assembly on a hard surface to move stationary bubbles.
12. Place the assembled slide chamber on a rotator in a hybridization oven set to 65 °C. Set your hybridization rotator to rotate
at 10 rpm when using the 2× GEx Hybridization Buffer
HI-RPM.
13. Hybridize at 65 °C for 17 h.
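The volume of each labeled cRNA needed for the fragmentation mix in step 1 depends on its measured concentration. The sketch below computes those volumes and the remaining volume needed to reach 55 μl. It assumes the remainder is made up with nuclease-free water, which the steps above do not state explicitly; the function and key names are hypothetical.

```python
def hybridization_mix(cy3_ng_per_ul, cy5_ng_per_ul, crna_ng=825.0,
                      blocking_ul=11.0, fragmentation_ul=2.2, total_ul=55.0):
    """Return the volumes (ul) for one fragmentation reaction.

    Assumption: the volume not taken up by cRNA, Blocking Agent, and
    Fragmentation Buffer is filled with nuclease-free water.
    """
    cy3_ul = crna_ng / cy3_ng_per_ul
    cy5_ul = crna_ng / cy5_ng_per_ul
    water_ul = total_ul - blocking_ul - fragmentation_ul - cy3_ul - cy5_ul
    if water_ul < 0:
        raise ValueError("cRNA is too dilute to fit 825 ng of each dye in 55 ul")
    return {
        "cy3_crna_ul": round(cy3_ul, 2),
        "cy5_crna_ul": round(cy5_ul, 2),
        "blocking_agent_ul": blocking_ul,
        "fragmentation_buffer_ul": fragmentation_ul,
        "water_ul": round(water_ul, 2),
    }


if __name__ == "__main__":
    # Hypothetical concentrations of 165 ng/ul for both labeled samples.
    for component, volume in hybridization_mix(165.0, 165.0).items():
        print(component, volume)
```

A negative water volume means the cRNA would have to be concentrated before use, so the function raises an error in that case rather than returning an impossible recipe.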
3.3.2 Washing the
Microarray Slides
1. With the sandwich completely submerged in Gene Expression
Wash Buffer 1, pry the sandwich open from the barcode end
only. Slip one of the blunt ends of the forceps between the
slides, and gently turn the forceps upwards or downwards to
separate the slides. Let the gasket slide drop to the bottom of
the staining dish. Remove the microarray slide, and place it
into the slide rack in the slide-staining dish 2 containing the
Gene Expression Wash Buffer 1 at room temperature. Minimize
the exposure of the slide to air. Touch only the barcode portion of the microarray slide or its edges (see Notes 10–12).
2. When all of the slides in the group are placed into the slide rack
in the slide-staining dish 2, stir for 1 min at room temperature.
3. During this wash step, remove the Gene Expression Wash
Buffer 2 from the 37 °C water bath and pour into slide-staining
dish 3.
4. Transfer the slide rack to slide-staining dish 3 containing warm
Gene Expression Wash Buffer 2. Stir for 1 min.
5. Slowly remove the slide rack, minimizing droplets on the
slides. It should take 5–10 s to remove the slide rack.
6. Place the slides in a slide holder so that the Agilent barcode
faces up. Scan the slides immediately to minimize the impact of
environmental oxidants on the signal intensities.
3.4 Signal Detection
(See Note 13)
1. Place the assembled slide holders into the scanner carousel.
2. In the Scan Control main window, choose the slot number of
the first slide for the Start Slot and the slot number for the last
slide for the End Slot.
3. Select Profile AgilentHD_GX_2Color for 4x44K microarrays.
4. In the Scan Control main window, click Scan Slot m-n, where
m is the slot of the first slide, and n is the slot for the last slide.
5. Open the Agilent Feature Extraction (FE) software, and open
the images (.tif).
6. Save the FE Project (.fep) by selecting File > Save As and
browsing to the desired location. Then, click Start Extracting.
7. After the extraction is successfully completed, view the QC
report for each extraction set by double-clicking the QC
Report link in the Summary Report tab. Determine whether
the grid has been properly placed by using the Spot Finding
tool at the four corners of the array.
3.5 Data Analysis
(See Note 14)
1. Open the GeneSpring GX program, and select Project > New
Project. In the Create New Project window, enter a project name,
and click the OK button.
2. In the Experiment Selection Dialog window, select Create new
experiment, and click the OK button.
3. In the Experiment description window, create an Experiment
name and select Agilent expression Two color as the Experiment
type and Guided Workflow-Find-differentially expressed Genes
as the Workflow type. Then, click the OK button.
4. In the Load Data window, click Choose Files, and select your
microarray .txt files. Then, click the Next >> button.
5. Confirm your Dye-swap arrays analysis, and click the Finish
button.
6. In the Summary Report window, click the Next >> button.
7. In the Experiment Grouping window, click the Add
Parameter… button, and create a parameter name. Select Non-Numeric
as the Parameter type, and then create Parameter
Values. Then, click the OK button.
8. Confirm the Experiment Grouping window, and click the Next
>> button.
9. Confirm the QC on samples window, and click the Next >>
button.
10. Confirm the Filter Probesets window, and click the Next >>
button.
11. Confirm the Significance Analysis window, and click the Next
>> button.
12. Confirm the Fold Change window, and click the Next >>
button.
13. Confirm the GO Analysis window, and click the Next >>
button.
14. Confirm the Find Significant Pathways Results window, and
click the Finish button.
15. In Project Navigator > your experiment > Analysis folder >
right-click on T-test, P < 0.05, and select Export List. Then,
save a microarray text file.
16. This process normalizes the microarray raw data using the
Lowess normalization method. The expression log ratios and
Benjamini and Hochberg false discovery rate P values
(Corrected P-value) are also calculated by GeneSpring GX.
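The Benjamini and Hochberg correction mentioned in step 16 can be illustrated with the standard step-up procedure. The sketch below is a generic implementation for illustration only. It is not GeneSpring GX code, and the input P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Each P value is multiplied by m / rank (m = number of tests, rank = its
    position when sorted ascending), and monotonicity is enforced by taking
    a running minimum from the largest P value downwards.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical per-gene P values
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"raw P = {p}, corrected P = {q:.4f}")
```

A gene passes the false discovery rate threshold when its corrected P value, not its raw P value, is below the chosen cutoff.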
4
Notes
1. The plants can be easily ground using grinding equipment.
2. After adding RNAiso Plus (or TRIzol Reagent) the solution
often freezes. Homogenize the frozen solution as quickly as
possible using a vortex or microtube mixer. The isolated total
RNA is intact and free of contaminating DNA and
proteins. This RNA can be used for microarray, qRT-PCR, and
RNA gel blot analyses.
3. Transfer the supernatant to a new tube. Be careful not to collect any of the cellular debris.
4. After centrifugation, the solution separates into three layers.
The upper layer will be a clear liquid containing the RNA, the
middle layer will be a semisolid layer containing the DNA, and
the bottom layer will be a red-colored organic solvent containing the proteins, polysaccharides, fatty acids, cellular debris,
and a small amount of DNA. Be careful not to collect any of
the middle layer. Steps 5 and 6 should be performed again if
the middle layer has been mixed with the top layer. When isolating RNA from rice and soybean, steps 5 and 6 should be
performed again.
5. The NanoDrop 1000 Spectrophotometer will accurately measure the concentration of RNA samples up to 3,000 ng/μl
without dilution. A 1.5–2 μl aliquot of RNA sample is recommended to ensure that a liquid sample column is formed and
that the light path is completely covered by the sample.
6. To prepare the final diluted Spike mixture, (1) mix the thawed
Spike A or B mixture well using a vortex mixer, incubate for
5 min at 37 °C, and mix well a second time. Centrifuge the
reaction mixtures briefly to collect the content at the bottom
of the tubes. (2) Transfer 2 μl of the Spike A or B mixture into
a new tube, and add 38 μl of the Dilution Buffer provided in
the Spike-In kit (1:20). Mix well using a vortex mixer.
Centrifuge the reactions briefly to collect the contents at the
bottom of the tube. This tube contains the first dilution. (3)
Transfer 2 μl of the first dilution into a new tube, and
add 78 μl of the Dilution Buffer (1:40). Mix well using a vortex mixer. Centrifuge the reactions briefly to collect the contents at the bottom of the tube. This tube contains the second
dilution. (4) Transfer 2 μl of the second dilution into a
new tube, and add 30 μl of the Dilution Buffer (1:16). Mix
well using a vortex mixer. Centrifuge the reactions briefly to
collect the contents at the bottom of the tube. This tube contains the final diluted Spike mixture.
7. To prepare the diluted T7 promoter primer mixture, mix 0.8 μl
of the T7 Promoter Primer and 1 μl of nuclease-free water.
8. To prepare the cDNA master mixture, mix 2 μl of 5× First Strand Buffer, 1 μl of 0.1 M DTT, 0.5 μl of 10 mM dNTP mix,
and 1.2 μl of AffinityScript RNase Block Mix.
9. To prepare the transcription master mixture, mix 0.75 μl of
nuclease-free water, 3.2 μl of 5× Transcription Buffer, 0.6 μl of
0.1 M DTT, 1 μl of NTP mix, 0.21 μl of T7 RNA Polymerase
Blend, and 0.24 μl of Cyanine 3-CTP or cyanine 5-CTP.
10. The microarray wash procedure for Agilent’s two-color platform must be performed in an environment in which the ozone
level is 5 ppb or less.
11. To prepare the 10× Blocking Agent, add 500 μl of nuclease-free water to the vial containing the lyophilized 10× Blocking
Agent supplied with the Agilent Gene Expression Hybridization
Kit. Centrifuge the solution briefly to collect the content at the
bottom of the tube.
12. To set up the apparatus for the washes, completely fill slide-staining dish 1 with Gene Expression Wash Buffer 1 at room
temperature. Place a slide rack into the slide-staining dish 2.
Add a magnetic stir bar. Fill the slide-staining dish 2 with sufficient Gene Expression Wash Buffer 1 at room temperature to
cover the slide rack. Place this dish on a magnetic stir plate.
Place empty dish 3 on the stir plate, and add a magnetic stir
bar. Do not add the pre-warmed (37 °C) Gene Expression
Wash Buffer 2 until the first wash step has begun. Remove one
hybridization chamber from the incubator, and record the
time. Record whether bubbles have formed during the hybridization and whether all of the bubbles are rotating freely.
13. The microarrays are scanned using an Agilent dual-laser DNA
microarray scanner with SureScan technology. The data are
extracted from the images by the Agilent Feature Extraction
software.
14. Microarray raw data are analyzed by the GeneSpring GX software. We recommend reading the GeneSpring PDF Manual
when a more detailed analysis is desired.
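The Spike mixture preparation in Note 6 consists of three dilutions (1:20, 1:40, and 1:16). Reading them as a serial dilution, in which each step draws 2 μl from the previous tube (an assumption based on the tubes being named the first, second, and final dilutions), the cumulative dilution can be sketched as follows. The function name is hypothetical.

```python
from fractions import Fraction

# (transfer volume, Dilution Buffer volume) in microliters, from Note 6.
DILUTION_STEPS = [(2, 38), (2, 78), (2, 30)]


def dilution_factors(steps):
    """Return (step factor, cumulative factor) for each serial dilution step."""
    cumulative = Fraction(1)
    result = []
    for transfer_ul, buffer_ul in steps:
        factor = Fraction(transfer_ul + buffer_ul, transfer_ul)
        cumulative *= factor
        result.append((factor, cumulative))
    return result


if __name__ == "__main__":
    for i, (factor, cumulative) in enumerate(dilution_factors(DILUTION_STEPS), start=1):
        print(f"dilution {i}: 1:{factor} (cumulative 1:{cumulative})")
```

Exact fractions are used instead of floating-point numbers so that the dilution factors stay whole numbers.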
References
1. Wallace DM (1987) Large- and small-scale phenol extractions. Methods Enzymol 152:33–41
2. Coombs LM et al (1990) Simultaneous isolation of DNA, RNA, and antigenic protein
exhibiting kinase activity from small tumor
samples using guanidine isothiocyanate. Anal
Biochem 188:338–343
3. Nicolaides NC, Stoeckert CJ Jr (1990) A simple, efficient method for the separate isolation
of RNA and DNA from the same cells.
Biotechniques 8:154–156
4. Chomczynski P, Sacchi N (1987) Single-step
method of RNA isolation by acid guanidinium
thiocyanate-phenol-chloroform
extraction.
Anal Biochem 162:156–159
5. Raha S et al (1990) Simultaneous isolation of
total cellular RNA and DNA from tissue culture cells using phenol and lithium chloride.
Genet Anal Tech Appl 7:173–177
6. MAQC Consortium (2006) The MicroArray
Quality Control (MAQC) Project shows inter- and intraplatform reproducibility of gene
expression measurements. Nat Biotechnol
24:1151–1161
7. Patterson TA et al (2006) Performance comparison of one-color and two-color platforms
within the MicroArray Quality Control
(MAQC) Project. Nat Biotechnol 24:1140–1150
8. Hardiman G (2004) Microarray platforms—comparisons and contrasts. Pharmacogenomics
5:487–502
9. Draghici S et al (2006) Reliability and reproducibility issues in DNA microarray measurements. Trends Genet 22:101–109
10. Ioannidis JP et al (2009) Repeatability of published microarray gene expression analyses. Nat
Genet 41:149–155
11. Jafari P, Azuaje F (2006) An assessment of
recently published gene expression data analyses: reporting experimental design and statistical factors. BMC Med Inform Decis Mak 6:27
12. Reiner A, Yekutieli D, Benjamini Y (2003)
Identifying differentially expressed genes using
false discovery rate controlling procedures.
Bioinformatics 19:368–375
13. Storey JD, Tibshirani R (2003) Statistical significance for genomewide studies. Proc Natl
Acad Sci U S A 100:9440–9445
14. Konishi T (2011) Microarray test results should
not be compensated for multiplicity of gene
contents. BMC Syst Biol 5:S6
Chapter 21
Forward Chemical Genetic Screening
Hyunmo Choi, Jun-Young Kim, Young Tae Chang, and Hong Gil Nam
Abstract
Chemical genetics utilizes small molecules to perturb biological processes. Unlike conventional genetics
methods, which involve the alteration of genetic information mostly with lasting effects, chemical genetics
allows temporary and reversible alterations of biological processes. Furthermore, it enables the alteration
of biological processes in a dose-dependent manner, providing an advantage over conventional genetics.
In the present chapter, the general procedures of forward chemical genetic screening are described.
Forward chemical genetic screening can be performed in three steps. The first step involves the identification of small molecules that induce phenotypic or physiological changes in a biological system from a
chemical library. In the second step, cellular targets that interact with the isolated chemical, which are
mostly proteins, are identified. Although several methods can be applied in the second step, the most common one is an affinity pull-down assay of the target protein that binds to the isolated compound. However,
affinity pull-down of a target protein is a formidable barrier in forward chemical genetics. We introduced
a tagged chemical library approach that significantly facilitates the identification of target proteins. The
third step consists of the validation of the target protein, which should include the assessment of target
specificity. This step is critical because small molecules often show pleiotropic effects due to low specificity.
The specificity test may include a competition assay using cold competitors and a genetic study using
mutants or transgenic lines modified for the cellular target.
Key words Forward chemical genetics, Chemical screening, Target identification, Tagged chemical
library, Specificity
1
Introduction
Arabidopsis is a well-established plant genetic model for the investigation of various aspects of plant biology due to its rich genetic
resources and genetic amenability, which have led to an unprecedented success in molecular genetic characterization of various
plant processes. The critical advantages of Arabidopsis as a plant
genetic model system include the established pools of insertion
mutants and facile generation of transgenic lines. However, these
genetic mutants or transgenic lines are limited in their value for the
elucidation of important aspects of plant biology. For example, the
mechanism of action of lethal genes may not be easily revealed
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_21, © Springer Science+Business Media New York 2014
because genetic mutations lead simply to lethality, although an
antisense approach may be utilized to overcome some of the lethality problems [1]. Furthermore, conventional genetic methods are
associated with long-lasting effects that can hamper the observation of the direct or immediate effects of a gene of interest. Several
strategies have been developed to overcome the shortcomings of
conventional genetic mutations. Chemical genetics is an emerging
approach that relies on the ability of small molecule chemicals to
mimic genetic mutations by acting on cellular targets. Chemical
treatment to modulate the activity or function of cellular targets
provides a few advantages over conventional genetic approaches.
The duration of the effect on the target can be adjusted, and the
effect can be reversed, thus enabling a more direct assessment of a
cellular target. Chemicals may be applied locally, thus mimicking a
tissue-/organ-specific modulation of gene function. Variations in
the target gene function or activity can be examined by using various doses of chemicals, allowing the study of the effect of lethal
genes. Furthermore, chemical genetics can be applied to various
species that are not genetically tractable. Chemicals identified from
a plant species such as Arabidopsis can be utilized to investigate the
function of a homologous gene in a related species. Chemical genetics
is now successfully employed for the elucidation of various complex mechanisms, which may not have been feasible with conventional genetics, such as the study of auxin signaling [2–4],
endo-membrane system components [5], and vacuolar sorting [6].
Yet, the cellular effects of small molecule chemicals may not be as
specific as the mutation of a given gene, and this point needs to be
borne in mind when applying chemical genetics (Fig. 1).
Chemical genetics can be classified into forward chemical
genetics (i.e., phenotype-based approach) and reverse chemical
genetics (i.e., target-based approach). Forward chemical genetics
proceeds from the altered phenotype or physiology to the corresponding target genes, similar to classical forward genetics [7].
However, while conventional genetics is based on the screening of
a pool of mutant plants, in chemical genetics a pool of small
Fig. 1 Forward and reverse chemical genetic approach
molecules is screened for their effects on phenotype or physiology.
Chemicals that alter the phenotype or physiology of interest are
then isolated and used to identify a target. In reverse chemical
genetics, as in conventional reverse genetics, the targets, which are
usually proteins, are first defined, and a chemical library is screened
for compounds that interact with the target protein. These chemical compounds are then used to determine the phenotypic or physiological consequences of altering the function of the target protein
in a cellular context. The present chapter describes a forward
chemical genetics protocol that includes the screening of small
molecules for a given phenotype or physiology, the identification
of target proteins, and the validation of the target [8].
A key step in forward chemical genetics is the identification
of the cellular targets, which can be approached in several ways.
In one approach, cellular targets may be inferred from the responses
of plants to the chemical and known physiological responses and
later confirmed by various means such as binding of the chemical
to the inferred target in vitro or in planta [4, 9]. Screening of
genetic mutants that confer altered sensitivity to the chemical can
provide information on the target of the chemical compound.
Mutated genes are among the candidate cellular targets, although
the altered sensitivity of a mutant to a specific compound may be
due to an indirect effect of the chemical [10]. The cellular target
can also be identified by pulling down the target protein that binds
to the chemical. Usually, it involves addition of a linker molecule to
the screened chemical without affecting its activity using structure–activity relationship (SAR) studies. The compound with the
added linker is then attached to a solid phase matrix such as agarose beads to make an affinity matrix. This affinity matrix is used to
pull down binding targets from cellular extracts. The matrix-bound
proteins are usually separated by SDS-PAGE to examine the target
protein, which can be identified by mass spectrometry [11, 12].
After the candidate target is identified, a functional validation is
necessary to confirm that the bound proteins are the actual targets.
This can be done by examining the phenotypic or physiological
effects of the chemical in knockout, knockdown, and overexpression plants (Fig. 2).
Fig. 2 Modification of hit chemical for target identification
The affinity-based approach for target identification has practical
difficulties. The addition of a linker to a chemical compound in the
appropriate position while minimizing the effect on its activity
requires a thorough SAR study. This step is time-consuming and
laborious, and most biology laboratories are not familiar with
this procedure. Sometimes this modification is not feasible without
the loss of activity. To overcome these difficulties, the tagged
chemical library approach was introduced [13]. In the tagged
chemical library, chemicals already contain the linker molecule
necessary for the preparation of an affinity matrix. Thus, the subsequent modification of the hit compound is not required. The
tagged chemical library used in this study contained a triethylene
glycol (TG)-based linker with a terminal amine functionality that is
utilized for immobilization of the chemical on a solid matrix [14].
Here, we describe a protocol based on the tagged chemical library.
2
Materials
1. Dry seeds of Arabidopsis thaliana.
2. 1.5-mL microfuge tubes.
3. 24-well plates.
4. 10-cm Petri dishes.
5. 20-μm pore-size polyethylene frit cartridge.
6. 20-mL glass vial.
7. Microcentrifuge.
8. Orbital shaker.
9. Rotary mixer.
10. Tagged chemical library: the chemicals of the tagged triazine
library used contained a triethylene glycol-based linker with a
terminal amine (TG-NH2) [13, 14].
11. Sterilized double-distilled water.
12. Dimethyl sulfoxide (DMSO).
13. Murashige and Skoog (MS) medium (0.5×) containing microelements without sucrose (pH 5.7) and with 0.8 % phyto agar.
Note that the media must be adjusted according to the screening strategy for a specific phenotype, as the medium affects
phenotype, especially that of seedlings.
14. Affigel-10.
15. N,N-Diisopropylethylamine (DIEA).
16. Ethanolamine.
17. Sodium azide.
18. Liquid nitrogen (N2).
19. Seed surface-sterilization solution: 10 % Sodium hypochlorite
solution containing 0.1 % Triton X-100.
20. Extraction buffer: 20 mM Tris–HCl (pH 7.5), 150 mM NaCl,
2 mM EDTA, 1 mM NaF, 1 mM PMSF, 1 mM DTT, 10 mM
β-glycerophosphate, protease inhibitor cocktail (Roche,
Mannheim, Germany), 1 % Triton X-100, and 0.1 % sodium
dodecyl sulfate. Washing buffer does not include Triton X-100
or SDS.
21. SDS-PAGE gels: 0.375 M Tris–HCl (pH 8.8), 10 % acrylamide/bis-acrylamide solution, 0.1 % SDS, 0.05 % ammonium
persulfate, 0.05 % (v/v) TEMED.
22. Coomassie blue staining solutions: Fixing solution (50 %
methanol and 10 % glacial acetic acid), staining solution (0.1 %
Coomassie Brilliant Blue R-250, 50 % methanol and 10 % glacial acetic acid), and destaining solution (40 % methanol and
10 % glacial acetic acid).
3
Methods
3.1 Primary
Chemical Screening
The scheme is diagrammed in Fig. 3. The candidate chemicals
from this screening can be called “Hit” compounds.
1. Pour 1.5 mL of 0.5× MS culture media with 0.8 % phyto agar
into each well of a 24-well plate. Allow the agar to solidify at
room temperature (see Notes 1–3).
2. Apply each chemical from the chemical library to the surface
of the culture medium in each well (2 μM of each chemical was
used for primary screening in our case). As a control, one or two
wells in each plate should contain solvent only (see Note 4).
3. Prepare the necessary amount of seeds considering that three
seeds will be sown in each well.
Fig. 3 Schematic diagram of the primary chemical screening
4. Surface-sterilize seeds in microfuge tubes by adding 1 mL of
10 % (v/v) sodium hypochlorite solution containing 0.1 %
Triton X-100 as surfactant and by shaking or vortexing for
5 min. The seeds are collected by centrifugation in a microcentrifuge for a few seconds and the supernatant is removed. The
seeds are washed five times with sterile water and incubated in
tubes for 2 days at 4 °C in the dark to synchronize germination. After sowing the seeds on media, the plates are placed
under white light for 24 h to promote germination (see Note 5).
Make sure that the seeds sown in each well do not stick
together.
5. Maintain the experimental conditions until the phenotype can
be observed (see Note 6).
6. Sort the chemicals into hit or inactive chemicals according to
their phenotypic effects. Chemicals that elicit effects similar to
those of the solvent control in each plate should be excluded.
In addition to finding a hit chemical, it is also important to
identify an inactive compound with a structure similar to that
of the hit compound that could be used as a negative control.
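The number of plates, the volume of medium, and the number of seeds needed for the primary screening follow from the plate layout described above. The sketch below is illustrative only: the figures of 1.5 mL of medium per well, three seeds per well, and up to two solvent control wells per plate come from the steps above, while the function name and the example library size are hypothetical.

```python
import math


def plan_primary_screen(n_compounds, wells_per_plate=24, control_wells=2,
                        seeds_per_well=3, medium_ml_per_well=1.5):
    """Estimate plates, wells, medium, and seeds for a primary screen.

    One compound is tested per well; each plate reserves `control_wells`
    wells for the solvent-only control.
    """
    compound_wells = wells_per_plate - control_wells
    n_plates = math.ceil(n_compounds / compound_wells)
    wells_used = n_compounds + n_plates * control_wells
    return {
        "plates": n_plates,
        "wells_used": wells_used,
        "medium_ml": wells_used * medium_ml_per_well,
        "seeds": wells_used * seeds_per_well,
    }


if __name__ == "__main__":
    # Hypothetical library of 100 compounds.
    print(plan_primary_screen(100))
```

In practice, more seeds than the computed minimum should be sterilized to allow for losses during washing and sowing.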
3.2 Secondary
Screening
Once a hit chemical with a phenotypic effect is found, it must be
confirmed by performing a secondary screening including dose-response assessment and determination of the IC50 value, as shown
in Fig. 4.
1. Pour 60 mL of 0.5× MS culture media with 0.8 % phyto agar
into each 10-cm Petri dish (see Note 4).
2. Prepare enough plates by mixing culture media with the
following chemicals: solvent only, a hit chemical (from low to
high concentration), or an inactive control chemical (from low
to high concentration) (see Note 7).
Fig. 4 Schematic diagram of the secondary screening of chemicals
3. Prepare the necessary amount of seeds, while taking into
consideration that more than 30 seeds will be sown in each
Petri dish (see Note 8).
4. Surface-sterilize the prepared seeds, stratify them, and sow on
media plate (see Subheading 3.1, Step 4).
5. Maintain the experimental conditions for phenotypic screening
(see Note 9).
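The protocol does not prescribe how the IC50 value is derived from the dose-response data. One simple possibility, shown here only as a sketch under that caveat, is to interpolate on a logarithmic concentration scale between the two doses that bracket a 50 % response. The function name and the example data are hypothetical, and a fitted sigmoidal model may be preferable for real data.

```python
import math


def ic50_by_interpolation(concentrations, responses):
    """Estimate the IC50 by log-linear interpolation.

    `responses` are expressed as percent of the solvent control (100 = no
    effect). Returns None if the data never cross 50 %.
    """
    points = sorted(zip(concentrations, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        crosses = (r_lo - 50.0) * (r_hi - 50.0) <= 0
        if crosses and r_lo != r_hi:
            fraction = (r_lo - 50.0) / (r_lo - r_hi)
            log_ic50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


if __name__ == "__main__":
    doses_um = [0.1, 1.0, 10.0, 100.0]   # hypothetical doses in micromolar
    growth = [95.0, 80.0, 20.0, 5.0]     # hypothetical percent of control
    print("Estimated IC50 (uM):", ic50_by_interpolation(doses_um, growth))
```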
3.3 Target
Identification
3.3.1 Bead Conjugation
For the identification of cellular targets of the hit compound by the
pull-down assay, a solid phase matrix needs to be covalently
attached to the linker of the tagged chemical utilizing the terminal
amine.
1. Shake the bottle of Affigel-10 gently to obtain a homogeneous
suspension.
2. Transfer 0.5 mL (7.5 μmol) of Affigel-10 to a 3-mL cartridge
with a 20-μm pore-size polyethylene frit.
3. Drain the supernatant solvent and wash the gel with DMSO.
4. Prepare 375 μL of 10 mM TG-NH2-linked chemical (dissolved
in DMSO) into a 20-mL vial. Add 125 μL of DMSO to adjust
the total volume of solution to 0.5 mL.
5. Add 50 μL DIEA to the vial with the TG-NH2-linked
chemical.
6. Transfer the contents of the vial to the 3-mL cartridge with
Affigel-10.
7. Shake well for 3 h at room temperature on an orbital shaker
with a speed setting of 500 rpm.
8. Drain the solution and wash the product with DMSO.
9. Add 50 mM ethanolamine solution in 1 mL of DMSO and
15 μL DIEA to the reaction cartridge to block side reactions.
Shake for 3 h at room temperature on an orbital shaker with a
speed setting of 500 rpm.
10. Drain the solution and wash the product with DMSO, water,
and then a 2 % sodium azide solution in water to protect the
product from bacterial contamination.
11. The Affigel-10 product can now be stored in an E-tube in 2 %
sodium azide solution in water (1 mL) at 4 °C.
12. For affinity pull-down assay, the sodium azide solution should
be removed. Spin down the bead-conjugated chemical for
1–2 s at 800 × g in a microcentrifuge at 4 °C. Drain the supernatant containing the sodium azide. Wash the pellet three
times with washing buffer.
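The amounts used in the conjugation can be compared in molar terms. The sketch below uses only the quantities given in steps 2, 4, and 9 (0.5 mL of Affigel-10 corresponding to 7.5 μmol, 375 μL of a 10 mM tagged chemical, and 1 mL of 50 mM ethanolamine). The function names are hypothetical, and reading the 7.5 μmol figure as the number of reactive groups on the gel is an assumption.

```python
GEL_REACTIVE_GROUPS_UMOL = 7.5   # stated for 0.5 mL of Affigel-10 in step 2


def micromoles(volume_ul, concentration_mm):
    """Amount in umol for a volume (uL) of a solution at a concentration (mM)."""
    return volume_ul * concentration_mm / 1000.0


def conjugation_summary():
    """Summarize the molar amounts used in the bead conjugation."""
    ligand_umol = micromoles(375.0, 10.0)          # tagged chemical, step 4
    ethanolamine_umol = micromoles(1000.0, 50.0)   # blocking reagent, step 9
    return {
        "ligand_umol": ligand_umol,
        "ligand_equivalents": ligand_umol / GEL_REACTIVE_GROUPS_UMOL,
        "ligand_mm_in_reaction": ligand_umol / 0.5,  # umol per 0.5 mL = mM
        "ethanolamine_umol": ethanolamine_umol,
    }


if __name__ == "__main__":
    for key, value in conjugation_summary().items():
        print(key, value)
```

Under this reading, the chemical is added at about half the molar amount of reactive groups on the gel, and ethanolamine is added in large excess over the groups that remain.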
3.3.2 Affinity
Pull-Down Assay
The Affigel-bound chemical is incubated with the cell extract to
isolate cellular target proteins. SDS-PAGE is used to examine the
pulled-down proteins. To exclude nonspecific binding proteins, a
competition assay is used. The isolated proteins are then identified
by mass spectrometry.
1. Prepare 200 seeds of Arabidopsis in a microfuge tube, surface-sterilize, stratify, and sow on media plates (see Subheading 3.1,
step 4).
2. Incubate the plates under the given experimental condition
until the phenotype can be observed (see Note 10).
3. Freeze plant samples in liquid N2. Grind to a very fine powder
in a precooled (−70 °C) mortar and pestle.
4. Transfer this powder to a microfuge tube. Add extraction
buffer to the powder and mix thoroughly (the volume of the
extraction buffer depends on sample mass: 200 μL per 0.1 g
sample). Maintain on ice for 10 min with occasional inversion
of the tube.
5. Centrifuge the mixture for 10 min at 4,000 × g in a microcentrifuge at 4 °C.
6. Transfer the supernatant into a new tube on ice. Measure the
protein concentration in each tube.
7. Adjust the protein concentration in each tube so that all of the
tubes have the same protein concentration.
8. Add 5 volumes of the washing buffer to the cell lysate on ice.
9. Add 30 μL of Affigel-10 to reduce nonspecific binding.
Incubate for 1 h at 4 °C in a rotary mixer with gentle mixing.
10. Spin down the Affigel-10 for 1–2 s at 800 × g in a microcentrifuge at 4 °C.
11. Aliquot the total lysates to five new microfuge tubes on ice,
and add the washing buffer to make a total volume of 1 mL
(1 μg/μL). The details of each tube are given below:
Tube 1. Target screening tube to pull-down target proteins
from the cell extract using a bead-conjugated hit compound (prepared as described in Subheading 3.3.1).
Tube 2. Competition assay tube to pull down proteins from
the cell extract using a bead-conjugated hit compound
after preincubation of the cell extract with the unconjugated hit compound (prepared as described in
Subheading 3.3.1).
Tube 3. Cell extract.
Tube 4. Bead control to pull down proteins that bind nonspecifically to the unconjugated beads from the cell extract.
Tube 5. Inactive control to pull down proteins from the cell
extract using a bead-conjugated inactive compound (prepared as described in Subheading 3.3.1).
12. Place the hit chemical compound that is unconjugated to the
beads in only tube 2. Add the same volume of solvent to the
other four tubes (tubes 1, 3, 4, and 5).
13. Incubate all the tubes for 1 h at 4 °C in a rotary mixer with
gentle mixing.
14. Add the beads conjugated to the hit compound to tubes 1
and 2.
15. Add unconjugated agarose beads to tube 4 and beads conjugated with an inactive chemical compound to tube 5 as controls (Fig. 5).
16. Incubate the five tubes for 2–4 h at 4 °C in a rotary mixer with
gentle mixing.
17. Centrifuge the tubes for 1–2 s at 800 × g in a microcentrifuge
at 4 °C.
18. Drain the supernatant. Store the tubes on ice. Wash the pellet
three times with the washing buffer.
19. Add SDS gel-loading buffer to each tube. Boil the tubes for
5 min at 95 °C.
20. Perform SDS-PAGE. A gradient gel with constant current was
used in our case (Subheading 2, item 21).
21. Visualize the protein bands by Coomassie blue staining
(Subheading 2, item 22) or EBT silver staining [15].
22. Excise the band of interest from the gel and place it in a
microfuge tube.
23. Determine the identity of target proteins by mass spectrometry.
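Two volume calculations recur in the pull-down assay: the extraction buffer volume in step 4 (200 μL per 0.1 g of sample) and the lysate and washing buffer volumes needed in step 11 to give 1 mL at 1 μg/μL. The sketch below is illustrative. It assumes that the protein concentration passed in is that of the lysate at the point of aliquoting, and the function names and example values are hypothetical.

```python
def extraction_buffer_ul(sample_mass_g, ul_per_gram=2000.0):
    """Extraction buffer volume: 200 uL per 0.1 g of sample (2000 uL per g)."""
    return sample_mass_g * ul_per_gram


def tube_volumes(lysate_ug_per_ul, target_ug_per_ul=1.0, total_ul=1000.0):
    """Return (lysate uL, washing buffer uL) for one pull-down tube."""
    lysate_ul = target_ug_per_ul * total_ul / lysate_ug_per_ul
    if lysate_ul > total_ul:
        raise ValueError("lysate is more dilute than the target concentration")
    return lysate_ul, total_ul - lysate_ul


if __name__ == "__main__":
    print("Extraction buffer for 0.25 g:", extraction_buffer_ul(0.25), "uL")
    lysate, buffer_ = tube_volumes(4.0)  # hypothetical lysate at 4 ug/uL
    print(f"Per tube: {lysate} uL lysate + {buffer_} uL washing buffer")
```

Because five tubes are prepared in step 11, the total lysate requirement is five times the per-tube volume.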
Fig. 5 Schematic diagram of the affinity pull-down assay
3.4 Biological
Validation
Seeds of candidate target Arabidopsis mutant lines are available
from the Arabidopsis Biological Resource Center at Ohio State
(e-mail: arabidopsis+@osu.edu) and the Nottingham Stock Centre
(e-mail: Arabidopsis@nottingham.uk). If a specific target mutant is
not available from public resources, the generation of an RNAi or
overexpression line is required. Even if the mutant lines are available, generation of RNAi and/or overexpression lines is recommended to examine the dependence of the chemical phenotype on
the level of protein expression.
1. Pour 60 mL of 0.5× MS culture media (with 0.8 % phyto agar)
into 10-cm Petri dishes.
2. Prepare culture media with no solvent, solvent only, a hit
chemical (from low to high concentration), or an inactive control chemical (from low to high concentration) (see Note 7).
3. Prepare 300 seeds of candidate mutant lines (and RNAi and
overexpression lines, if available) (see Note 8). Surface-sterilize
the prepared seeds and stratify them (see Subheading 3.1, step 4).
Sow the seeds on the medium plate with no solvent.
4. After germination, transfer the seedlings onto each of the
medium plates prepared in step 2.
5. Maintain under the experimental condition for the required
period.
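When preparing the concentration series of step 2, it can help to check that the solvent stays within the range described in Note 2 (no apparent DMSO effect up to 0.1 % v/v). The sketch below is an illustration only: the 10 mM stock in DMSO and the list of final concentrations are hypothetical, while the 60 mL plate volume comes from step 1.

```python
PLATE_VOLUME_ML = 60.0      # medium per 10-cm dish (step 1)
MAX_DMSO_PERCENT = 0.1      # highest DMSO level without apparent effect (Note 2)

def stock_volume_ul(final_um, stock_mm, plate_volume_ml=PLATE_VOLUME_ML):
    """Volume of stock (uL) that gives final_um (uM) in the plate volume."""
    # C1 * V1 = C2 * V2 with the stock converted to uM and the plate to uL
    return final_um * plate_volume_ml * 1000.0 / (stock_mm * 1000.0)

def dmso_percent(stock_ul, plate_volume_ml=PLATE_VOLUME_ML):
    """Solvent content in % (v/v), ignoring the small increase in volume."""
    return 100.0 * stock_ul / (plate_volume_ml * 1000.0)

def plan_series(final_um_list, stock_mm):
    """Return (final uM, stock uL, DMSO %, within limit) for each plate."""
    rows = []
    for final_um in final_um_list:
        vol = stock_volume_ul(final_um, stock_mm)
        pct = dmso_percent(vol)
        rows.append((final_um, vol, pct, pct <= MAX_DMSO_PERCENT))
    return rows

# Hypothetical series from a hypothetical 10 mM stock in DMSO
for final_um, vol, pct, ok in plan_series([1, 5, 25], stock_mm=10.0):
    flag = "ok" if ok else "exceeds DMSO limit"
    print(f"{final_um:>5} uM: {vol:6.1f} uL stock, {pct:.3f} % DMSO ({flag})")
```

If a plate exceeds the limit, a more concentrated stock is one option; in any case the solvent-only plate would normally receive the same solvent volume as the highest-concentration plate.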
4
Notes
1. Culture media components can affect the phenotype during
growth. For general practice, we recommend 0.5× MS culture
media (with microelements and without sucrose).
2. Solvent effect: plant phenotypes can be affected by the concentration of the solvent used to dissolve the chemical compounds.
In the case of DMSO, no apparent effect on the growth of
Arabidopsis seedlings was observed at concentrations up to 0.1 % (v/v) in the culture media.
3. Edge effect: the culture media in the wells at the edges of
24-well plates can dry more easily than the culture media in the
wells at the center of 24-well plates. This can affect the growth
of plants. To avoid this edge effect, the edges can be wrapped
with film after closing the lids. This significantly reduces the
edge effect.
4. If plants are grown for more than 2 weeks in culture media in
24-well plates or Petri dishes, the culture media may dry out.
If such a long incubation is required to observe the phenotype,
the volume of the culture media must be increased. The plates
also need to be changed to 12- or 6-well plates to provide sufficient space for growth.
5. If the experiment is not specifically related to photomorphogenesis, temperature response, or circadian clock, we recommend
a long day condition (16 h light/8 h darkness) or a continuous
light condition at 22 °C. To avoid pleiotropic effects of the
compounds on the germination process or on very young
seedlings, compounds may be administered a few days after
germination.
6. It is critically important to decide which phenotype should be
observed for the proper selection of chemical compounds.
Various phenotypes can be observed depending on the purpose of the experiment, such as organ swelling [16], hypocotyl
growth inhibition [17], agravitropic response [5], pin-formed
inflorescence [4], and leaf bleaching [18]. If a noticeable phenotypic alteration is observed, the chemical may be categorized as a hit compound. Those without a phenotypic effect
but with a similar structure should be used as inactive controls
later in target identification.
7. For primary high-throughput screening, the incorporation of
the chemicals into the culture media may be difficult due to the
large number of plates required for compound screening. For
the small volumes of culture media used in 24-well plates, it is
sufficient to put the chemical compound on the surface of the
culture media as it easily diffuses into the media. In this case,
the seeds need to be sown after 3 or more hours to allow the
chemicals to diffuse evenly. However, if an increased volume of
culture media is required, the chemicals need to be mixed with
the media thoroughly before it solidifies. In this case, the culture media should be cooled down enough to prevent damage
to temperature-sensitive chemicals.
8. The effect of a chemical compound can be validated by statistical analysis. For each compound, more than 30 plants should
be used for determining statistical significance. If the p-value of
the phenotypic difference is 0.05 or less for a specific compound, it can be established as a hit. Considering the potential
problem posed by variations in the phenotypes of plants grown
in different Petri dishes, it is recommended that all experiments be repeated at least three times.
9. Once a candidate hit is discovered, optimized derivatives can
be generated later, depending on the requirement.
10. Proteins extracted from the whole plant up to the seedling
stage can be used for target identification. However, when
the true leaves emerge, the amount of plastid proteins such
as RuBisCO strongly increases. In this case, an affinity column with antibodies against plastid proteins can be used to
reduce their concentration in total protein lysates. If the root
is the target organ, the direct use of root-extracted protein is
feasible.
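Note 8 sets a threshold (p ≤ 0.05, more than 30 plants per compound) but does not name a statistical test. As one possible illustration, the sketch below uses a two-sided permutation test on the difference of group means, chosen here only because it needs no external library; it is not the authors' stated method. The seedling measurements are simulated, hypothetical values.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(treated, control, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random reassignment of plants to groups
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            hits += 1
    # add-one correction keeps the estimate away from an impossible p = 0
    return (hits + 1) / (n_permutations + 1)

# Simulated hypocotyl lengths (mm), 32 seedlings per group (hypothetical data)
rng = random.Random(1)
control = [rng.gauss(10.0, 1.0) for _ in range(32)]
treated = [rng.gauss(7.0, 1.0) for _ in range(32)]
p = permutation_p_value(treated, control)
print(f"p = {p:.4f}; hit candidate: {p <= 0.05}")
```

Because Note 8 also recommends at least three repeats, the same function would be applied to each independent experiment rather than to pooled plates.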
References
1. Jun J, Kim CS, Cho DS, Kwak JM, Ha CM,
Park YS, Cho BH, Patton DA, Nam HG
(2002) Random antisense cDNA mutagenesis
as an efficient functional genomic approach in
higher plants. Planta 214:668–674
2. Zhao Y, Dai X, Blackwell HE, Schreiber SL,
Chory J (2003) SIR1, an upstream component
in auxin signaling identified by chemical genetics. Science 301:1107–1110
3. Armstrong JI, Yuan S, Dale JM, Tanner VN,
Theologis A (2004) Identification of inhibitors
of auxin transcriptional activation by means of
chemical genetics in Arabidopsis. Proc Natl
Acad Sci U S A 101:14978–14983
4. Kim JY, Henrichs S, Bailly A, Vincenzetti V,
Sovero V, Mancuso S, Pollmann S, Kim D,
Geisler M, Nam HG (2010) Identification of
an ABCB/P-glycoprotein-specific inhibitor of
auxin transport by chemical genomics. J Biol
Chem 285:23309–23317
5. Surpin M, Rojas-Pierce M, Carter C, Hicks
GR, Vasquez J, Raikhel NV (2005) The power
of chemical genomics to study the link between
endomembrane system components and the
gravitropic response. Proc Natl Acad Sci U S A
102:4902–4907
6. Zouhar J, Hicks GR, Raikhel NV (2004)
Sorting inhibitors (sortins): chemical compounds to study vacuolar sorting in Arabidopsis.
Proc Natl Acad Sci U S A 101:9497–9501
7. Blackwell HE, Zhao Y (2003) Chemical
genetic approaches to plant biology. Plant
Physiol 133:448–455
8. Das RK, Samanta S, Ghosh K, Zhai D, Xu W,
Su D, Leong C, Chang YT (2011) Target identification: a challenging step in forward chemical genetics. IBC 3(3):1–16
9. Crews CM, Splittgerber U (1999) Chemical
genetics: exploring and controlling cellular
processes with chemical probe. Trends Biochem
Sci 24:317–320
10. Zheng XFS, Chan TF, Zhou HH (2004)
Genetic and genomic approaches to identify
and study the targets of bioactive small molecules. Chem Biol 11:609–618
11. Khersonsky SM, Chang YT (2004) Strategies
for facilitated forward chemical genetics.
Chembiochem 5:903–908
12. Kim YK, Chang YT (2007) Tagged library
approach facilitates forward chemical genetics.
Mol Biosyst 3:392–397
13. Khersonsky SM, Jung DW, Kang TW, Walsh
DP, Moon HS, Jo H, Jacobson EM, Shetty V,
Neubert TA, Chang YT (2003) Facilitated forward chemical genetics using tagged triazine
library and zebrafish embryo screening. J Am
Chem Soc 125:11804–11805
14. Ahn YH, Chang YT (2007) Tagged small molecule library approach for facilitated chemical
genetics. Acc Chem Res 40:1025–1033
15. Jin L, Hwang S, Yoo G, Choi J (2006) A mass
spectrometry compatible silver staining method
for protein incorporating a new silver sensitizer
in sodium dodecyl sulfate-polyacrylamide electrophoresis gels. Proteomics 6:2334–2337
16. DeBolt S, Gutierrez R, Ehrhardt DW, Melo CV,
Ross L, Cutler SR, Somerville C, Bonetta D
(2007) Morlin, an inhibitor of cortical microtubule dynamics and cellulose synthase movement.
Proc Natl Acad Sci U S A 104:5854–5859
17. Asami T, Min YK, Nagata N, Yamagishi K,
Takatsuto S, Fujioka S, Murofushi N, Yamaguchi I,
Yoshida S (2000) Characterization of brassinazole, a triazole-type brassinosteroid biosynthesis
inhibitor. Plant Physiol 123:93–100
18. Walsh TA, Bauer T, Neal R, Merlo AO,
Schmitzer PR, Hicks GR, Honma M,
Matsumura W, Wolff K, Davies JP (2007)
Chemical genetic identification of glutamine
phosphoribosylpyrophosphate amidotransferase as the target for a novel bleaching herbicide
in Arabidopsis. Plant Physiol 144:1292–1304
Chapter 22
Highly Reproducible ChIP-on-Chip Analysis to Identify
Genome-Wide Protein Binding and Chromatin Status
in Arabidopsis thaliana
Jong-Myong Kim, Taiko Kim To, Maho Tanaka, Takaho A. Endo,
Akihiro Matsui, Junko Ishida, Fiona C. Robertson, Tetsuro Toyoda,
and Motoaki Seki
Abstract
Gene activity is regulated via chromatin dynamics in eukaryotes. In plants, alterations of histone modifications
are correlated with gene regulation for development, vernalization, and abiotic stress responses. Using
ChIP, ChIP-on-chip, and ChIP-seq analyses, the direct binding regions of transcription factors and alterations of histone modifications can be identified on a genome-wide level. We have established reliable and
reproducible ChIP and ChIP-on-chip methods that have been optimized for the Arabidopsis model system. These methods are not only useful for identifying the direct binding of transcription factors and
chromatin status but also for scanning the regulatory network in Arabidopsis.
Key words Arabidopsis, Histone, Chromatin, ChIP-on-chip
1
Introduction
Posttranslational modifications, such as histone modifications, are
among the critical events that regulate transcription and genome
structure in eukaryotes [1–7]. In plants, the regulation of genes involved in
flowering, vernalization, and abiotic stress responses is correlated
with histone modifications [8–13].
“ChIP-on-chip” and “ChIP-seq” are very powerful techniques
that can be used to detect genome-wide changes in DNA–protein
binding activity and chromatin status, combining chromatin
immunoprecipitation (“ChIP”) with tiling array technology
(“chip”) and high-throughput sequencing technology, respectively
[14–17]. Although genome-wide analysis using ChIP-on-chip of
both chromatin marks and transcription factor binding has been
previously reported for Arabidopsis [18–22], the ChIP-on-chip
assay has not yet become a widespread technique in Arabidopsis.
This is primarily due to the difficulty of optimizing the ChIP assay conditions and of generating reproducible results. Moreover, the ChIP-on-chip procedure involves
numerous steps, which complicates troubleshooting at each step. We have established a ChIP [11] and
ChIP-on-chip protocol that has been optimized for Arabidopsis
and proven to be reliable and reproducible (Fig. 1). In this protocol, fresh plants are used without freeze-thawing to
prevent the disruption of the protein interactions of interest. This
method is capable of handling both small- and large-scale ChIP
assays (Fig. 2). In the ChIP-on-chip method, we combined a T/A
ligation technique for the attachment of a dsDNA adaptor and an
in vitro transcription system to amplify a sufficient amount of
cRNAs. As a component of the in vitro transcription system, the
detection substrate is incorporated into the amplified cRNA fragments for hybridization. This provides high integrity and reproducibility to our results. Also, ChIPed DNA that has been prepared
by our ChIP method is immediately available for subsequent
ChIP-seq analysis using the optimized amplification and sequencing
procedures provided by each manufacturer of
high-throughput sequencers. In this chapter, we describe the
protocols that we have developed for Arabidopsis ChIP and ChIP-on-chip analyses to identify site-specific and genome-wide DNA–
protein binding and chromatin status.
Fig. 1 Workflow of ChIP-on-chip analysis (each step is followed by the corresponding Methods subheading):
Fixation of fresh plants (3.1)
Breaking the fixed plants (3.2)
Chromatin shearing (3.3)
Immunoprecipitation and ChIPed DNA purification (3.4–3.5)
Amplification of purified ChIPed DNA (3.6)
Preparation of hybridization probe for tiling array (3.7–3.8)
Hybridization with tiling array (3.9–3.10)
Scanning and data analysis (3.11–3.12)
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_22, © Springer Science+Business Media New York 2014
2
Materials
2.1 Fixation and
Quenching of Plants
1. Two-week-old whole Arabidopsis seedlings (roots and entire
aerial parts) grown in petri dishes containing GM agar (0.85 %)
medium supplemented with 1 % sucrose under 16 h light/8 h
dark cycle (40–80 μmol photons/m² s, light period: 5:00 a.m.
to 9:00 p.m.) [11].
Fig. 2 Schematic diagram of chromatin immunoprecipitation
2. 1 M HEPES, adjusted to pH 7.5 using 10 N KOH. Autoclave
and store at room temperature.
3. Formaldehyde. Store at room temperature.
4. Vacuum system, including pumps and plastic bell, connected
to a freeze-dryer.
5. 2.5 M Glycine solution. Store at room temperature.
2.2 Extraction of Cell
Lysate from Plants
1. Metal tubes, SUST-0050 (Bio Medical Science).
2. Protease inhibitor cocktail tablet (Complete, EDTA-free,
Roche). Dissolve one tablet in 50 mL of 50 mM HEPES
buffer just prior to use.
3. Tungsten balls, SS150-0050 (Bio Medical Science).
4. Aluminum tube holder unit, AB 50-0005 (Bio Medical Science).
5. Plant shredding equipment: Shake Master Auto, BMS A20-TP
(Bio Medical Science).
6. Filter unit: cell strainer 100 μm, No. 352360 (BD Falcon).
7. Protein low-binding plastic tube of 50 mL size: SUMILON
Proteosave, MS-5950 (SUMILON).
8. Protein low-binding plastic tube of 15 mL size: SUMILON
Proteosave, MS-59150 (SUMILON).
9. 2× 150 mM lysis buffer: 25 mL of 1 M HEPES buffer, pH 7.5,
50 mL of 3 M NaCl, 2 mL of 0.5 M EDTA, 50 mL of 20 %
Triton X-100, 10 mL of 10 % Na deoxycholate, and 5 mL of
20 % SDS. Make up to 500 mL with distilled water and store
at room temperature.
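The 2× lysis buffer recipe can be checked with C1·V1 = C2·V2. The sketch below, offered as an illustration rather than as part of the protocol, computes the final concentration of each component in 500 mL and then the composition after mixing with an equal volume of cell lysate in 50 mM HEPES, as done in Subheading 3.2, step 16. The variable and function names are hypothetical, and the protease inhibitor in the lysate is ignored.

```python
FINAL_VOLUME_ML = 500.0

# component: (stock concentration, unit, volume of stock in mL), from item 9
TWO_X_LYSIS = {
    "HEPES pH 7.5":    (1000.0, "mM", 25.0),
    "NaCl":            (3000.0, "mM", 50.0),
    "EDTA":            (500.0,  "mM", 2.0),
    "Triton X-100":    (20.0,   "%",  50.0),
    "Na deoxycholate": (10.0,   "%",  10.0),
    "SDS":             (20.0,   "%",  5.0),
}

def final_concentrations(recipe, final_volume_ml=FINAL_VOLUME_ML):
    """Apply C1 * V1 = C2 * V2 to every component of a recipe."""
    return {name: (stock * vol / final_volume_ml, unit)
            for name, (stock, unit, vol) in recipe.items()}

def mix_one_to_one(buffer_conc, lysate_conc):
    """Concentrations after mixing equal volumes of buffer and lysate."""
    return {name: ((conc + lysate_conc.get(name, 0.0)) / 2.0, unit)
            for name, (conc, unit) in buffer_conc.items()}

two_x = final_concentrations(TWO_X_LYSIS)
# The lysate is in 50 mM HEPES and contains none of the other components
working = mix_one_to_one(two_x, {"HEPES pH 7.5": 50.0})
for name, (conc, unit) in working.items():
    print(f"{name:16s} {two_x[name][0]:7.2f} -> {conc:6.2f} {unit}")
```

HEPES is not doubled in the 2× buffer because the lysate already contributes 50 mM; all other components are halved on mixing, giving 150 mM NaCl in the working lysate.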
2.3 Chromatin
Shearing
1. 150 mM lysis buffer: 25 mL of 1 M HEPES buffer, pH 7.5,
25 mL of 3 M NaCl, 1 mL of 0.5 M EDTA, 25 mL of 20 %
Triton X-100, 5 mL of 10 % Na deoxycholate, and 2.5 mL
20 % SDS. Make up to 500 mL with distilled water and store
at room temperature.
2. Self-standing plastic tube of 25 mL size: Centrifuge Tubes
Mini with triple seal cap, No. 2362-025 (IWAKI, JAPAN).
3. Sonicator: Astraon 3000, Model S3000-600, and probe tip:
1/2 flat tip (Misonix).
4. Phenol/chloroform/isoamyl alcohol 25:24:1 saturated with
10 mM Tris–HCl, pH 8.0, 1 mM EDTA (Sigma-Aldrich).
Store at 4 °C.
5. Ethanol.
6. 3 M sodium acetate buffer solution, S7899-100ML (Sigma-Aldrich). Store at room temperature.
7. Glycogen for Mol. Biol., No. 0-90-393-001 (Roche). Store
at −20 °C.
8. Agilent DNA 1000 kit (Agilent). Store at room temperature.
9. Agilent 2100 Bioanalyzer (Agilent Technologies).
2.4 Chromatin
Immunoprecipitation
1. Dynabeads Protein G (Dynal).
2. Nutator (BD Clay Adams brand).
3. Anti-histone H4 tetra-acetylation polyclonal antibody, 06-866
(Millipore).
4. Magnet rack for 50 mL plastic tube: Dynal MPC-1 (Dynal).
5. 1.7 mL SafeSeal Microcentrifuge Tubes (Sorenson BioScience).
6. Magnet rack for 1.5 mL plastic tube: Dynal MPC-S (Dynal).
7. 500 mM lysis buffer: 25 mL of 1 M HEPES buffer, pH 7.5,
83.3 mL of 3 M NaCl, 1 mL of 0.5 M EDTA, 25 mL of 20 %
Triton X-100, 5 mL of 10 % Na deoxycholate, and 2.5 mL of
20 % SDS. Make up to 500 mL with distilled water and store
at room temperature.
8. Deoxycholate buffer: 5 mL of 1 M Tris buffer, adjusted to pH
8.0 by 10 N HCl, 31.25 mL of 4 M LiCl, 12.5 mL of 20 %
NP-40 (Sigma-Aldrich), and 25 mL of 10 % Na deoxycholate.
Make up to 500 mL with distilled water, autoclave, and store
at room temperature.
9. 10× TE, pH 8.0. Store at room temperature.
10. Elution buffer: 2.5 mL of 1 M Tris buffer, adjusted to pH 8.0
by 10 N HCl, 1 mL of 0.5 M EDTA, and 2.5 mL of 20 % SDS.
Make up to 50 mL with distilled water and store at room
temperature.
11. Hybridization incubator HB-80 (TAITEC).
12. RNase A (7,000 units/mL), No. 19101 (Qiagen). Store at 4 °C.
13. Proteinase K (>600 mAU/mL), No. 19131 (Qiagen). Store
at 4 °C.
14. QiaAmp DNA micro purification kit (Qiagen). Store at room
temperature.
15. DNase-/RNase-free water. Store at room temperature.
2.5 Evaluation of
ChIPed DNA Quality
and Enrichment
1. ExTaq DNA polymerase (5 units/μL) No. RR001A (TaKaRa)
and 10× ExTaq PCR buffer. Store at −20 °C.
2. Primers to detect the enrichment of an internal control,
Arabidopsis ACT7 (At5g09810) gene: forward primer, ACT7-F
5′-CGTTTCGCTTTCCTTAGTGTTAGCT, and reverse
primer, ACT7-R 5′-AGCGAACGGATCTAGAGACTCAC
CTTG (see Note 1).
3. 6 % acrylamide gel: 3 mL of acrylamide/bis-acrylamide (19:1)
gel solution (BioRad) and 3 mL of 5× TBE. Make up to 15 mL
with distilled water. Add 40 μL of 30 % ammonium persulfate
and 20 μL of TEMED. Mix and pour into the gel preparation system
(glass plate size: 9 cm × 15 cm × 1 mm) for the electrophoresis system, BE-22R (BIO CRAFT). Prepare just before use.
4. Gel detection system: VISTA FluorImager SI, Filter: 610RG
and ImageQuant software (GE Healthcare).
5. SYBR Gold nucleic acid gel stain 10,000× concentrate in
DMSO (Life Technologies).
6. Thermal cycler.
2.6 Amplification
of ChIPed DNA
Fragments for Tiling
Array Hybridization
1. dNTP mixture (2.5 mM each) No. 4030 (TaKaRa).
2. T4 DNA polymerase (3,000 units/mL) No. M0203L (NEB).
Store at −20 °C.
3. 10× NEB2 buffer (NEB). Store at −20 °C.
4. DNase-/RNase-free water. Store at room temperature.
5. 0.5 M EDTA. Store at room temperature.
6. 1× TE solution. Store at room temperature.
7. T4 polynucleotide kinase (10 units/μL) No. 18004-010 (Life
Technologies).
8. 100 mM ATP (TaKaRa).
9. 5× forward buffer (NBE).
10. Phenol/chloroform/isoamyl alcohol 25:24:1 saturated with
10 mM Tris–HCl, pH 8.0, 1 mM EDTA. Store at 4 °C.
11. Ethanol.
12. 3 M sodium acetate buffer solution, S7899-100ML (Sigma-Aldrich). Store at room temperature.
13. Glycogen for Mol. Biol., No. 0-90-393-001 (Roche). Store at
−20 °C.
14. QiaAmp DNA micro kit (Qiagen). Store at room
temperature.
15. NanoDrop ND-1000 Spectrophotometer (Thermo Fisher
Scientific).
16. T/A oligos for production of dsDNA adaptor: (see Note 2).
T/A ds_F, 5′-GCGGCCGCGAAATTAATACGACTCACTAT
AGGGAGT.
T/A ds_R, 5′-CTCCCTATAGTGAGTCGTATTAATTT.
17. T4 DNA ligase (2,000,000 cohesive end units/mL) No.
M0202T (NEB) and 10× ligation buffer. Store at −20 °C.
18. T7-c primer: 5′-CTTGGCGCGAAATTAATACGACTCACTATAGGGAGT.
19. ExTaq DNA polymerase (5 units/μL) No. RR001A (TaKaRa)
and 10× ExTaq PCR buffer. Store at −20 °C.
20. Thermal cycler.
2.7 Synthesis of
Biotin-Labeled cRNA
with IVT Reaction
1. NanoDrop ND-1000 Spectrophotometer (Thermo Fisher
Scientific).
2. One-Cycle Target Labeling and Control Reagents (the following reagents and materials are supplied from the manufacturer
Affymetrix: 10× IVT labeling buffer, IVT labeling NTP mix,
IVT labeling enzyme mix, IVT cRNA cleanup spin column,
IVT cRNA binding buffer, IVT cRNA wash buffer). IVT
cRNA cleanup spin column is stored at 4 °C and IVT cRNA
binding buffer is stored at −20 °C. IVT cRNA is stored at
−80 °C. Wash buffer is stored at room temperature and all
other reagents are stored at −20 °C.
3. Ethanol.
2.8 Fragmentation
of the cRNA
1. 5× fragmentation buffer (Affymetrix). Store at room temperature.
2. Thermal cycler.
3. DNase-/RNase-free water. Store at room temperature.
4. Agilent 2100 Bioanalyzer (Agilent Technologies).
2.9
Hybridization
1. GeneChip Hybridization Oven 640 (Affymetrix).
2. GeneChip Arabidopsis tiling array (1.0F Array, Affymetrix) (see
Note 3). This can be stored for up to 6 months at 4 °C in the dark.
3. 250 μL micropipette tips HR-250S (Rainin) (see Note 4). Use
when applying the hybridization buffer to the tiling array.
4. 5 M NaCl (DNase-/RNase-free, Ambion).
5. 0.5 M EDTA. Store at room temperature.
6. 2× hybridization buffer: add 8.3 mL of 12× MES stock buffer,
17.7 mL of 5 M NaCl, 4 mL of 0.5 M EDTA, and 0.1 mL of 10 %
Tween-20 to 19 mL of RNase-free water and make up to 50 mL.
Store at 4 °C in the dark.
7. GeneChip Eukaryotic Hybridization Control Kit (the following reagents and materials are supplied from the manufacturer
Affymetrix: 3 nM control oligo B2, 20× eukaryotic hybridization controls). Store at −20 °C.
8. 10 mg/mL herring sperm DNA (Promega). Store at −20 °C.
9. 50 mg/mL bovine serum albumin (BSA) (Invitrogen). Store
at −20 °C.
10. Dimethyl sulfoxide (DMSO). Store at room temperature.
11. DNase-/RNase-free water. Store at room temperature.
12. Heat block.
2.10 Washing
and Staining
1. GeneChip Fluidics Station 450 (Affymetrix).
2. 20× SSPE (3 M NaCl, 0.2 M NaH2PO4, 0.02 M EDTA,
Cambrex).
3. Wash buffer A: Add 300 mL of 20× SSPE and 1 mL of 10 %
Tween-20 (Pierce Chemical) to 650 mL of autoclaved distilled
water and make up to 1,000 mL with autoclaved distilled
water. Filter wash buffer A through 0.2 μm filter. This can be
stored for 3 months at 4 °C in the dark.
4. 12× 2-[N-morpholino]ethanesulfonic acid (MES) stock buffer: Add 3.2 g of MES free acid monohydrate and 9.7 g of
MES sodium salt (Sigma-Aldrich) to 40 mL of DNase-/
RNase-free water (Gibco) and make up to 50 mL with DNase-/
RNase-free water. Filter 12× MES stock buffer through 0.2 μm
filter. This can be stored for 3 months at 4 °C in the dark.
5. Wash buffer B: Add 41.7 mL of 12× MES stock buffer, 2.6 mL
of 5 M NaCl, and 0.5 mL of 10 % Tween-20 to 400 mL of
autoclaved distilled water and make up to 500 mL with autoclaved distilled water. Filter wash buffer B through a 0.2 μm
filter. This can be stored for 3 months at 4 °C in the dark.
6. 10 mg/mL goat IgG stock: Add 50 mg of goat IgG (Sigma-Aldrich) to 5 mL of 150 mM NaCl solution (prepared from
5 M NaCl solution). If a larger volume of the 10 mg/mL IgG
stock is prepared, aliquot and store at −20 °C until use. After
thawing the solution, store at 4 °C. Avoid cycles of repeated
freezing and thawing.
7. 2× Stain buffer: Add 41.7 mL of 12× MES stock buffer,
92.5 mL of 5 M NaCl, and 2.5 mL of 10 % Tween-20 to
113.3 mL of RNase-free water. Filter the 2× stain buffer
through a 0.2 μm filter. This can be stored at 4 °C in the dark.
8. Anti-streptavidin antibody (goat), biotinylated (Vector
Laboratories). Store at −20 °C.
9. 1 mg/mL streptavidin phycoerythrin (SAPE) solution
(Molecular Probes). Store at 4 °C.
10. 50 mg/mL bovine serum albumin (BSA) (Invitrogen). Store
at −20 °C.
11. DNase-/RNase-free water. Store at room temperature.
2.11
Array Scanning
1. GeneChip Scanner 3000 7G (Affymetrix).
2. Tough-Spots (USA Scientific).
2.12 Computational
Analysis of ChIP-on-Chip Data
1. MAS 5.0 algorithm (Affymetrix: http://www.affymetrix.com/support/technical/whitepapers/sadd_whitepaper.pdf).
3
Methods
3.1 Fixation and
Quenching of Plants
1. Grow Arabidopsis plants in petri dishes (20 plants per petri
dish) containing GM agar (0.85 %) medium supplemented
with 1 % sucrose under 16 h light/8 h dark cycle (40–80 μmol
photons/m² s, light period: 5:00 a.m. to 9:00 p.m.) [11].
2. Remove air from 200 mL of 50 mM HEPES buffer using an
aspirator. Warm the HEPES buffer to 22 °C and keep in an
incubator until use.
3. Pre-run the vacuum system and cool down the water trap
chamber inside the freeze-dryer for at least 30 min before use.
4. To make the fixation buffer, add 6 mL of formaldehyde (final
concentration 1 %) to the 200 mL of 50 mM HEPES buffer in
a 500 mL beaker just before use.
5. Carefully remove plants from plates to ensure that no agar is
transferred.
6. Harvest whole plants (fresh weight: 1 g) and immediately
submerge in the 206 mL of fixation buffer containing formaldehyde (see Note 5).
7. Cover the beaker with a two-ply layer of parafilm and make 20
holes in the parafilm using forceps.
8. Set the beaker containing samples on the heating plate inside
the plastic bell for vacuum infiltration. Stack enough paper
towels on top of the beaker to prevent the formation of ice
cores from splashes of fixation buffer.
9. Start vacuum using the maximum vacuum speed to remove air
from samples. Maintain vacuum pressure between 60 and
133 Pa for 5 min (see Note 6). After 5 min, open the air valve
to quickly release the vacuum.
10. Briefly swirl the samples and confirm infiltration of the fixation
buffer into plants (see Note 7).
11. Repeat vacuum infiltration using the same procedure as
described above.
12. Keep the fixed samples at 22 °C in an incubator for 45 min.
13. Remove the parafilm cover and wipe away any extra fluid collected on the inside surface of the beaker.
14. Add 10 mL of 2.5 M glycine solution and gently mix by
swirling.
15. Again, cover the beaker with a two-ply layer of parafilm and make
20 holes using forceps.
16. To quench the formaldehyde, repeat the vacuum infiltration
procedure twice in the same manner as described for fixation
(steps 8–11).
17. To complete quenching, keep the fixed plants at 22 °C in an incubator for at least
30 min.
18. Remove the solution by decantation.
19. Add 200 mL of 50 mM HEPES buffer and wash the fixed
plants.
20. Repeat sample washing two times with rinses of 200 mL of
50 mM HEPES.
21. Dry the fixed plants (see Note 8) using paper towels.
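The fixation and quenching concentrations in steps 4 and 14 follow from simple dilution. The sketch below is a worked check under one loudly stated assumption: the protocol does not give the concentration of the formaldehyde stock, so a 37 % stock is assumed here. Volume changes from the plant tissue and from wiping the beaker are ignored.

```python
def diluted(stock_conc, stock_volume_ml, starting_volume_ml):
    """Concentration after adding stock_volume_ml of stock to starting_volume_ml."""
    return stock_conc * stock_volume_ml / (starting_volume_ml + stock_volume_ml)

# ASSUMPTION: the formaldehyde stock is 37 % (not stated in the protocol)
formaldehyde_pct = diluted(37.0, 6.0, 200.0)   # step 4: 6 mL into 200 mL HEPES
# Step 14: 10 mL of 2.5 M glycine added to the 206 mL of fixation buffer
glycine_molar = diluted(2.5, 10.0, 206.0)

print(f"formaldehyde: {formaldehyde_pct:.2f} %")
print(f"glycine:      {glycine_molar * 1000:.0f} mM")
```

Under this assumption the formaldehyde comes out close to the 1 % final concentration quoted in step 4, and glycine is in molar excess over the fixative volume added.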
3.2 Extraction of Cell
Lysate for Chromatin
Immunoprecipitation
1. Transfer samples to prechilled metal tubes and maintain on ice
for a few minutes (see Note 9).
2. Add two tungsten balls and 4 mL of prechilled 50 mM HEPES
buffer containing Complete tablet. Cover with metal lid and
wrap with parafilm.
3. To ensure that liquid does not leak from the lid, compress the
parafilm by rolling the tube on the bench top (see Note 10).
4. Place the metal tube in the aluminum tube holder and place
the holder in the plant shredding machine.
5. Grind samples with strong shaking for 13 min using the Shake
Master Auto.
6. Remove holder from the shredding machine and immediately
place on ice.
7. Take 10 μL of whole cell lysate and check the grinding efficiency
using microscopy.
8. Add 1 mL of 50 mM HEPES buffer containing Complete tablet
to the sample tube.
9. Resuspend ground samples by pipetting.
10. Pour 5 mL of ground samples onto the filter unit set on a
50 mL protein low-binding plastic tube.
11. Cover with parafilm and centrifuge ground samples for 5 min
at 400 × g, 4 °C.
12. Replace the filter unit and pour the remaining ground samples
onto the filter unit.
13. Add 1–2 mL of 50 mM HEPES buffer containing Complete
tablet to the metal tube and completely transfer all of the
ground samples to the filter unit.
14. Centrifuge for 10 min at 400 × g, 4 °C.
15. Remove the filter unit and transfer the supernatant (cell lysate)
to a 15 mL protein low-binding plastic tube. Make up the cell
lysate to 7.5 mL with 50 mM HEPES buffer containing
Complete tablet.
16. Add 7.5 mL of 2× 150 mM lysis buffer (see Note 11).
3.3 Chromatin
Shearing
1. Transfer 15 mL of cell lysate in 150 mM lysis buffer to a 25 mL
self-standing plastic tube and keep in ice water (see Note 12).
2. Sonicate the cell lysate in 150 mM lysis buffer
at an output level of 8.5 for 30 s and immediately return the
tube to ice water for at least 1 min (see Note 13).
3. Repeat this cycle 14 times.
4. Transfer the sonicated cell lysate in 150 mM lysis buffer to a
50 mL protein low-binding plastic tube.
5. Centrifuge for 10 min at 20,000 × g, 4 °C.
6. The resultant aqueous whole cell extract (WCE) is used to produce Input DNA and ChIPed DNA.
7. For the Input DNA, take 200 μL of WCE and extract DNA by
phenol/chloroform extraction and ethanol precipitation.
8. Check the fragment size range of the sheared DNA (see Note 14)
using the Agilent 2100 Bioanalyzer (Agilent Technologies).
3.4 Chromatin
Immunoprecipitation
1. To remove nonspecific IP, add 30 μL of magnetic beads
(see Note 15) to 15 mL of WCE in a 50 mL protein low-binding plastic tube.
2. Stir for 30 min using a nutator at 4 °C.
3. Collect magnetic beads using a magnet rack.
4. Transfer the WCE supernatant to a new 50 mL protein low-binding tube.
5. Dispense 500 μL of prewashed WCE into a 1.7 mL protein
low-binding tube.
6. Add 4 μL of antibody (see Note 16).
7. Stir overnight using a nutator at 4 °C.
8. Add 30 μL of magnetic beads.
9. Stir for 4 h using a nutator at 4 °C.
10. Collect magnetic beads using a magnet rack.
11. Discard aqueous supernatant using an aspirator.
12. Add 1 mL of 150 mM lysis buffer to wash beads.
13. Invert the tube to resuspend magnetic beads.
14. Collect magnetic beads using a magnet rack.
15. Remove aqueous supernatant using an aspirator.
16. Repeat this washing step three times.
17. Add 1 mL of 150 mM lysis buffer.
18. Resuspend the magnetic beads by inversion.
19. Stir for 10 min using a nutator at room temperature.
20. Collect magnetic beads using a magnet rack.
21. Remove aqueous supernatant using an aspirator.
22. Add 1 mL of 500 mM lysis buffer.
23. Resuspend the magnetic beads by inversion.
24. Stir for 10 min using a nutator at room temperature.
25. Collect magnetic beads using a magnet rack and remove aqueous supernatant using an aspirator.
26. Add 1 mL of deoxycholate buffer.
27. Resuspend the magnetic beads by inversion.
28. Stir for 10 min using a nutator at room temperature.
29. Collect magnetic beads using a magnet rack and remove aqueous supernatant using an aspirator.
30. Add 1 mL of 1× TE.
31. Resuspend the magnetic beads by inversion.
32. Stir for 10 min using a nutator at room temperature.
33. Collect magnetic beads using a magnet rack and remove aqueous supernatant using an aspirator.
34. Add 400 μL of elution buffer, resuspend magnetic beads by
inversion, and transfer to a new 1.7 mL DNA low-binding
tube.
35. To reverse cross-linking, incubate samples overnight at 65 °C
in a hybridization oven.
36. Add 2 μL of RNase A and incubate for 30 min at 50 °C.
37. Add 5 μL of proteinase K and incubate for 30 min at 50 °C.
38. After cooling the samples to room temperature, extract DNA
(ChIPed DNA) by phenol/chloroform extraction and ethanol
precipitation. Allow the ChIPed DNA to air dry briefly.
39. Dissolve the ChIPed DNA in 100 μL of 1× TE solution.
40. Purify the ChIPed DNA using QiaAmp DNA micro purification kit (see Note 17) and elute with 30 μL DNase-/RNase-free water (see Note 18).
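For planning, the nominal volumes in Subheadings 3.3 and 3.4 set an upper bound on how many immunoprecipitations one extract can support and how much antibody and bead suspension is needed. The sketch below is a rough illustration only: it uses the volumes stated in the protocol, ignores pipetting and transfer losses, and the choice of six reactions is hypothetical.

```python
WCE_TOTAL_UL = 15_000   # whole cell extract after shearing (Subheading 3.3)
INPUT_UL = 200          # set aside for Input DNA (Subheading 3.3, step 7)
IP_UL = 500             # WCE per immunoprecipitation (step 5)
ANTIBODY_UL = 4         # antibody per IP (step 6)
BEADS_UL = 30           # beads per IP (step 8); same amount pre-clears the WCE (step 1)

def max_ip_reactions(total_ul=WCE_TOTAL_UL, input_ul=INPUT_UL, ip_ul=IP_UL):
    """Upper bound on IP reactions, ignoring pipetting and transfer losses."""
    return (total_ul - input_ul) // ip_ul

def reagents_for(n_ip):
    """Antibody and bead volumes (uL) for n_ip reactions plus one pre-clearing."""
    if n_ip > max_ip_reactions():
        raise ValueError("not enough WCE for that many reactions")
    return {"antibody_ul": n_ip * ANTIBODY_UL,
            "beads_ul": BEADS_UL + n_ip * BEADS_UL}

print("maximum IP reactions:", max_ip_reactions())
print("reagents for 6 reactions:", reagents_for(6))
```

In practice the usable number is lower than this bound, because some extract is lost at the centrifugation and pre-clearing steps.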
3.5 Evaluation of
ChIPed DNA Quality
and Enrichment
1. Mix 1 μL of 1 ng/mL ChIPed DNA, 1 μL of ExTaq DNA
polymerase, 4 μL of 10 mM dNTP, 0.25 μL of 100 μM primers to amplify the target region, 0.25 μL of 100 μM primers
to amplify the ACT7 region, and 2.5 μL of 10× ExTaq PCR buffer in a total reaction volume of 25 μL (see Note 19).
2. Amplify Input DNA and ChIPed DNA by PCR. Cycle conditions are 94 °C for 5 min, [94 °C for 15 s, 58 °C for 30 s,
72 °C for 90 s] × 25 cycles, and 72 °C for 1 min; then store at
4 °C (see Note 20).
3. Apply 3 μL of PCR product to each well on a 6 % acrylamide
gel (see Note 21).
4. Separate PCR products by electrophoresis for 40 min at 200 V.
5. Stain DNA fragments in gel using 1 μL of SYBR Gold in
300 mL of distilled water by gently shaking for 5 min.
6. Gently agitate stained gel in distilled water for 10 min at room
temperature to remove the background fluorescence.
7. Measure the intensity of fluorescence of bands using a
FluorImager. Calculate the signal intensity and fold enrichment
using ImageQuant imaging software (see Notes 22 and 23).
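The reaction setup in step 1 and the cycling program in step 2 can be written down as data to check the water volume and the programmed run time. The sketch below is a hedged illustration: it assumes that the two 0.25 μL primer entries in step 1 are one primer solution for the target region and one for the ACT7 control, and it ignores ramping time between temperatures. All names are hypothetical.

```python
# Volumes in uL per 25-uL reaction, as listed in step 1.
# ASSUMPTION: the two 0.25-uL primer entries are read as one primer
# solution for the target region and one for the ACT7 control.
REACTION_UL = 25.0
COMPONENTS_UL = {
    "ChIPed or Input DNA": 1.0,
    "ExTaq DNA polymerase": 1.0,
    "dNTP": 4.0,
    "primers, target region": 0.25,
    "primers, ACT7": 0.25,
    "10x ExTaq PCR buffer": 2.5,
}

def water_volume_ul(components, total_ul=REACTION_UL):
    """Water needed to bring the listed components to the final volume."""
    return total_ul - sum(components.values())

# (temperature in deg C, hold time in s), from step 2
INITIAL = (94, 300)
CYCLE = [(94, 15), (58, 30), (72, 90)]
N_CYCLES = 25
FINAL = (72, 60)

def programmed_hold_s(initial, cycle, n_cycles, final):
    """Sum of programmed holds; ramping between temperatures is ignored."""
    return initial[1] + n_cycles * sum(s for _, s in cycle) + final[1]

water_ul = water_volume_ul(COMPONENTS_UL)
total_s = programmed_hold_s(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"water per reaction: {water_ul} uL")
print(f"programmed hold time: {total_s} s (about {total_s / 60:.0f} min)")
```

The actual run takes longer than the programmed holds because of ramping, which depends on the thermal cycler.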
3.6 Amplification
of ChIPed DNA
Fragments for
Hybridization
of Tiling Array
1. Gently mix 200 ng of ChIPed DNA with 0.5 μL of T4 DNA
polymerase, 4.4 μL of dNTP, 11 μL of 10× NEB2 buffer, and
DNase-/RNase-free water in a total volume of 110 μL.
2. Incubate the mixture for 15 min at 12 °C using a thermal
cycler.
3. Add 1.1 μL of 0.5 M EDTA to stop the reaction.
4. Add 90 μL of 1× TE.
5. Purify DNA fragments by phenol/chloroform extraction and
ethanol precipitation.
ChIP-on-Chip Analysis in Arabidopsis
6. Briefly air dry DNA fragments.
7. Dissolve DNA in 16.5 μL of DNase-/RNase-free water.
8. Add 1 μL of T4 polynucleotide kinase, 2.5 μL of 100 mM
ATP, and 5 μL of 5× forward buffer. Gently mix by pipetting.
9. Incubate for 10 min at 37 °C using a thermal cycler.
10. Immediately place the sample tube on ice.
11. Add 175 μL of 1× TE.
12. Purify DNA fragments by phenol/chloroform extraction and
ethanol precipitation.
13. Briefly air dry DNA fragments.
14. Dissolve in 40.75 μL of DNase-/RNase-free water.
15. Add 0.25 μL of ExTaq DNA polymerase, 0.4 μL of 100 mM
dATP, and 5 μL of 10× ExTaq PCR buffer.
16. Incubate for 30 s at 50 °C then for 20 min at 72 °C using a
thermal cycler.
17. Immediately place the sample tube on ice.
18. Purify DNA fragments using QiaAmp DNA micro kit (Qiagen)
according to the manufacturer’s instructions. Elute with 30 μL
of DNase-/RNase-free water.
19. Check the DNA concentration by measuring the absorbance.
20. Gently mix 200 ng of DNA with 1 μL of 100 μM T/A dsDNA
adaptor (see Note 2), 1 μL of T4 DNA ligase, 1.5 μL of 10×
ligation buffer, and DNase-/RNase-free water in a total volume of 15 μL.
21. Incubate overnight at 16 °C on a thermal cycler.
22. Purify and elute the adaptor-ligated DNA fragments and check
the DNA concentration using the same procedure described in
step 19 of this section.
23. Mix 30 ng of the adaptor-ligated DNA fragments with 1 μL of
100 μM T7-c primer, 1 μL of ExTaq DNA polymerase (Takara),
4 μL of dNTP mixture (Takara), 4 μL of 10× ExTaq PCR buffer (Takara), and DNase-/RNase-free water in a total volume
of 50 μL.
24. Amplify the adaptor-ligated DNA fragments by PCR using the
following cycle conditions: 94 °C for 5 min, [94 °C for
30 s, 55 °C for 30 s, 72 °C for 90 s] × 15 cycles, and 72 °C for
4 min; store at 4 °C (see Note 24).
25. Purify DNA using Qiagen PCR purification kit according to
the manufacturer’s instructions. Elute with 30 μL of DNase-/
RNase-free water.
26. Check the DNA concentration by measuring the absorbance.
27. Also check the size of the amplified DNA fragments using the
Agilent 2100 Bioanalyzer (Agilent Technologies) (see Note 25).
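Several steps in this section call for a fixed mass of DNA (200 ng in steps 1 and 20, 30 ng in step 23) after the concentration has been checked by absorbance. The sketch below shows one way to turn an A260 reading into pipetting volumes for the ligation in step 20. It assumes the standard conversion for double-stranded DNA (an A260 of 1.0 at 1 cm path length corresponds to about 50 ng/μL), which the protocol itself does not state; the function names and the example reading are hypothetical.

```python
# Standard spectrophotometric conversion for double-stranded DNA
# (1 cm path length). This factor is an assumption; it is not given in the protocol.
DSDNA_NG_PER_UL_PER_A260 = 50.0


def dsdna_concentration(a260, dilution_factor=1.0):
    """Estimate the dsDNA concentration (ng/uL) from an A260 reading."""
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def volume_for_mass(mass_ng, concentration_ng_per_ul):
    """Volume (uL) that contains the requested mass of DNA."""
    if concentration_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return mass_ng / concentration_ng_per_ul


def ligation_mix(concentration_ng_per_ul, dna_ng=200.0, total_ul=15.0):
    """DNA and water volumes for the adaptor ligation described in step 20."""
    fixed_ul = 1.0 + 1.0 + 1.5  # adaptor + T4 DNA ligase + 10x ligation buffer
    dna_ul = volume_for_mass(dna_ng, concentration_ng_per_ul)
    water_ul = total_ul - fixed_ul - dna_ul
    if water_ul < 0:
        raise ValueError(
            f"DNA is too dilute to fit {dna_ng} ng into a {total_ul} uL reaction"
        )
    return {"dna_ul": dna_ul, "water_ul": water_ul}


# Hypothetical undiluted reading, for illustration only.
conc = dsdna_concentration(a260=0.4)
print(conc, ligation_mix(conc))
```

The check on the water volume is a convenience: it flags eluates that are too dilute before any reagent is pipetted.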
3.7 Synthesis of Biotin-Labeled cRNA Using the IVT Reaction
1. Transfer 200 ng of the amplified DNA sample to a RNase-free
microfuge tube and add 4 μL of 10× IVT labeling buffer, 12 μL
of IVT labeling NTP mix, and 4 μL of IVT labeling enzyme
mix. Adjust to a final volume of 40 μL with RNase-free water.
2. Mix gently and spin down to collect the solution.
3. Incubate at 37 °C for 18 h in an air incubator (see Note 26).
4. For the cleanup of biotin-labeled cRNA (see Note 27), add
60 μL of RNase-free water to the IVT reaction mixture sample
(after step 3) and vortex for 3 s.
5. Add 350 μL of IVT cRNA binding buffer (see Note 28) to the
sample and mix by vortexing for 3 s.
6. Add 250 μL of ethanol and mix well by pipetting (see Note 29).
7. Apply 700 μL of the sample onto “IVT cRNA cleanup spin
column” set in a 2 mL collection tube. Centrifuge for 15 s at
6,000 × g. Discard the flow-through and the collection tube.
8. Transfer the spin column onto a new 2 mL collection tube. Apply
500 μL of “IVT cRNA wash buffer” onto the spin column.
Centrifuge for 15 s at 6,000 × g. Discard the flow-through.
9. Apply 500 μL of 80 % ethanol onto the spin column. Centrifuge
for 15 s at 6,000 × g. Discard the flow-through.
10. Open the cap of the spin column and centrifuge for 5 min at
20,000 × g. Discard the flow-through and the collection tube.
11. Transfer the spin column onto a 1.5 mL collection tube and
apply 11 μL of RNase-free water onto the membrane of the
spin column. Subsequently centrifuge for 1 min at 20,000 × g
to elute the cRNA.
12. Apply 10 μL of RNase-free water onto the membrane of the
spin column. Then centrifuge for 1 min at 20,000 × g and collect the eluate.
13. Check the concentration of the biotin-labeled cRNAs by measuring the absorbance (see Note 30).
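The acceptance criteria in Note 30 (more than 30 μg of cRNA and an A260/A280 ratio between 1.9 and 2.1) can be checked with a few lines of code. This is a minimal sketch, assuming the standard conversion for single-stranded RNA (an A260 of 1.0 at 1 cm path length corresponds to about 40 μg/mL) and assuming that the full 21 μL applied in steps 11 and 12 is recovered; neither assumption comes from the protocol, and the readings shown are hypothetical.

```python
# Standard conversion for single-stranded RNA (1 cm path length).
# This factor is an assumption; it is not given in the protocol.
RNA_UG_PER_ML_PER_A260 = 40.0


def crna_qc(a260, a280, dilution_factor, eluate_volume_ul):
    """Estimate cRNA yield and purity and compare them with Note 30."""
    if a280 <= 0:
        raise ValueError("a280 must be positive")
    conc_ug_per_ul = a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor / 1000.0
    yield_ug = conc_ug_per_ul * eluate_volume_ul
    purity = a260 / a280
    return {
        "concentration_ug_per_ul": conc_ug_per_ul,
        "yield_ug": yield_ug,
        "a260_a280": purity,
        # Note 30: more than 30 ug and a ratio between 1.9 and 2.1.
        "passes": yield_ug > 30.0 and 1.9 <= purity <= 2.1,
    }


# Hypothetical readings from a 1:100 dilution of a 21 uL eluate.
result = crna_qc(a260=0.5, a280=0.25, dilution_factor=100.0, eluate_volume_ul=21.0)
print(result)
```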
3.8 Fragmentation of the cRNA
1. Prepare the fragmentation mixture containing 45 μg of cRNA
(1–21 μL) and 8 μL of 5× fragmentation buffer in a 0.2 mL
tube. Adjust to a final volume of 40 μL with DNase-/RNase-free water.
2. Incubate at 94 °C for 35 min using a thermal cycler. Place on
ice immediately after the incubation.
3. Check the fragmentation with an Agilent 2100 Bioanalyzer
(see Note 31).
3.9 Hybridization
1. Incubate 20× eukaryotic hybridization controls for 5 min at
65 °C to completely dissolve the elements.
2. Prepare the hybridization cocktail. For each target sample,
add the following reagents to 15 μg of each fragmented
cRNA sample: 5 μL of 3 nM control Oligo B2, 15 μL of 20×
eukaryotic hybridization controls, 3 μL of 10 mg/mL
Herring Sperm DNA, 3 μL of 50 mg/mL BSA, 150 μL of 2×
hybridization buffer, and 30 μL of DMSO. Adjust to a final
volume of 300 μL with DNase-/RNase-free water.
3. Maintain the tiling array at room temperature (see Note 32).
4. Prehybridize the array by filling through a septum with 200 μL
of 1× hybridization buffer using a micropipettor (see Note 33)
and incubate the array for 10 min at 45 °C with rotation.
5. Heat the hybridization cocktail at 99 °C for 5 min on a heat
block.
6. Transfer the hybridization cocktail to a heat block at 45 °C and
keep for 5 min.
7. Centrifuge the hybridization cocktail at 20,000 × g for 5 min to
remove any insoluble materials.
8. Remove the pre-hybridization buffer solution from the array
and add 200 μL of the hybridization cocktail (see Note 34)
onto the array.
9. Incubate the array for 18 h at 45 °C with 60 rpm rotation in
the hybridization oven.
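The composition of the hybridization cocktail in step 2 can be written out as a small calculation, which makes it easy to confirm that the volumes add up to 300 μL. The sketch below takes the fixed reagent volumes from step 2; the function name is hypothetical, and the example concentration assumes that the 45 μg of cRNA fragmented in Subheading 3.8 is present in exactly 40 μL.

```python
# Fixed reagent volumes (uL) per sample, as listed in step 2.
FIXED_REAGENTS_UL = {
    "control oligo B2 (3 nM)": 5.0,
    "20x eukaryotic hybridization controls": 15.0,
    "herring sperm DNA (10 mg/mL)": 3.0,
    "BSA (50 mg/mL)": 3.0,
    "2x hybridization buffer": 150.0,
    "DMSO": 30.0,
}


def hybridization_cocktail(crna_ug_per_ul, crna_ug=15.0, total_ul=300.0):
    """Return all component volumes (uL) for one hybridization cocktail."""
    if crna_ug_per_ul <= 0:
        raise ValueError("cRNA concentration must be positive")
    crna_ul = crna_ug / crna_ug_per_ul
    water_ul = total_ul - sum(FIXED_REAGENTS_UL.values()) - crna_ul
    if water_ul < 0:
        raise ValueError("cRNA is too dilute for a 300 uL cocktail")
    mix = dict(FIXED_REAGENTS_UL)
    mix["fragmented cRNA"] = crna_ul
    mix["DNase-/RNase-free water"] = water_ul
    return mix


# Assumption: 45 ug of cRNA in the 40 uL fragmentation reaction.
mix = hybridization_cocktail(crna_ug_per_ul=45.0 / 40.0)
for component, volume in mix.items():
    print(f"{component}: {volume:.1f} uL")
```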
3.10 Washing and Staining
1. For each target sample, prepare three tubes for streptavidin
phycoerythrin (SAPE) solution for the first stain, antibody
solution, and SAPE solution for the third stain. For each sample, prepare 1,200 μL of SAPE solution mix containing 600 μL
of 2× stain buffer, 48 μL of 50 mg/mL BSA, 12 μL of 1 mg/mL
SAPE, and 540 μL of DNase-/RNase-free water. Divide it
into two aliquots of 600 μL which are used for the first stain
solution and the third stain solution (see Note 35).
2. For each sample, prepare 600 μL of the antibody solution mix
containing 300 μL of 2× stain buffer, 24 μL of 50 mg/mL
BSA (see Note 36), 6 μL of 10 mg/mL goat IgG stock, 3.6 μL
of 0.5 mg/mL biotinylated antibody, and 266.4 μL of DNase-/
RNase-free water.
3. After 18 h of hybridization, remove the hybridization cocktail
from the array (see Note 37) and completely fill the array with
the appropriate volume (about 250 μL) of non-stringent wash
buffer A.
4. Set the wash buffer A and wash buffer B into the fluidics station. Run the protocol “Prime_450.”
5. Set the SAPE solution and antibody solution into the fluidics
station.
6. Select the protocol “EuKGE-ws2v4” in the fluidics station.
Insert the array into the designated module of the fluidics station and start the run (see Note 38). Perform washing and
staining procedure as follows:
(a) Post-hyb wash #1: 10 cycles of 2 mixes/cycle with wash
buffer A at 30 °C.
(b) Post-hyb wash #2: 4 cycles of 15 mixes/cycle with wash
buffer B at 50 °C.
(c) Stain: Stain the array for 10 min in SAPE solution at
35 °C.
(d) Post stain wash: 10 cycles of 4 mixes/cycle with wash buffer A at 30 °C.
(e) Second stain: Stain the array for 10 min in antibody solution at 35 °C.
(f) Third stain: Stain the array for 10 min in SAPE solution at
35 °C.
(g) Final wash: 15 cycles of 4 mixes/cycle with wash buffer A
at 35 °C. The loading temperature is 25 °C.
7. Turn on the scanner approximately 30 min prior to the end of
the protocol (see Note 39). One hour and 20 min after starting the run, the “Eject” sign will appear. Remove the array at
this time (see Note 40).
3.11 Array Scanning
1. On the back of the array, wipe off excess solution around the
septum. Cover the septum with the seal “Tough-Spots” and
keep the surface of the seal flat (see Note 41).
2. Perform scanning using filters (570 nm) at 0.7 μm resolution
using a GeneChip Scanner 3000 7G. When entering the experimental information using GCOS (GeneChip Operating
Software) ver. 1.3, select “At35b_MF_v04” for 1.0F Array.
3.12 Computational Analyses of ChIP-on-Chip Data
1. Obtain the Arabidopsis genome sequence and
annotation from the Arabidopsis genome release (ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR∗∗_genome_release/; see
Note 42) at The Arabidopsis Information Resource (TAIR).
2. Map the probes of each Affymetrix Arabidopsis whole-genome
tiling array (1.0F Array) on the Arabidopsis genomic sequence.
3. For the analysis of protein enrichments in the Arabidopsis
whole genome, normalize the intensity of a total of 6.4 million
25 nt oligonucleotide probes for one strand of genomic
sequence (corresponding to 3.2 million perfect match (PM)
and 3.2 million mismatch (MM) probes) of individual replicates for all samples at the same time via quantile normalization [23] (see Notes 43–45).
4. Calculate the signal intensity and genomic positions using the
MAS5.0 algorithm (Affymetrix).
5. Normalize the data between ChIPed DNA and Input DNA
using the rank consistency filter, which selects representative probes
whose order of intensity is stable between two experiments (see
Note 46).
6. Analyze the enrichment value of histone H4 tetra-acetylation
at the genome-wide level.
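Step 3 relies on quantile normalization of the (PM-MM) intensities (see Note 43). The following pure-Python sketch illustrates the idea on a toy data set; it is not the authors' pipeline, which the chapter does not show. The function names are hypothetical, ties are not averaged, and a real analysis of 3.2 million probe pairs per array would normally use an optimized implementation.

```python
def pm_minus_mm(pm, mm):
    """Probe-wise (PM - MM) intensities for one array."""
    if len(pm) != len(mm):
        raise ValueError("PM and MM lists must have the same length")
    return [p - m for p, m in zip(pm, mm)]


def quantile_normalize(arrays):
    """Give every array the same intensity distribution.

    arrays: list of equal-length lists, one list per array or replicate.
    The k-th smallest value of each array is replaced by the mean of the
    k-th smallest values across all arrays. Ties are not averaged here.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must contain the same number of probes")
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [
        sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)
    ]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # probe indices by rank
        out = [0.0] * n
        for rank, probe_index in enumerate(order):
            out[probe_index] = reference[rank]
        normalized.append(out)
    return normalized


# Toy (PM - MM) values for a ChIPed DNA array and an Input DNA array.
chip = pm_minus_mm([9.0, 4.0, 6.0], [4.0, 2.0, 3.0])
inp = pm_minus_mm([7.0, 3.0, 8.0], [3.0, 2.0, 2.0])
print(quantile_normalize([chip, inp]))
```

After normalization, both arrays contain the same set of values, so they share a common median as described in Note 45.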
4 Notes
1. It is necessary to set the preferred genes as internal controls
for the ChIP assay to detect the target protein enrichments.
To determine the enrichment of histone H4 tetra-acetylation,
we utilize the ACT7 region as an internal control of the ChIP
assay.
2. The T/A dsDNA adaptor (Fig. 3) is designed to increase the efficiency of ligation using the T/A ligation method and to amplify
the fragments by the in vitro transcription (IVT) system using T7
RNA polymerase [24]. Forward and reverse strand oligos are very
slowly annealed in vitro to make dsDNA. The annealed dsDNA is
then purified by PAGE gel extraction of the band, derived from
the accurately annealed dsDNA. We have ordered the annealing
and purification of dsDNA through Sigma-Aldrich Japan. The
quality of purified dsDNA directly affects the subsequent efficiency of ligation and the IVT reaction.
3. Information on the tiling array platform can be found at the
Gene Expression Omnibus (GEO) at NCBI (http://www.ncbi.nlm.nih.gov/geo/).
The 1.0F array comprises 25 nt oligonucleotides chosen
from the reverse strand of the genomic sequence, and the sequence
information is available under the array platform
GPL1980. Each Affymetrix Arabidopsis whole-genome tiling
array (1.0F Array) contains 6.4 million 25 nt oligonucleotide
probes [18]. The tiling arrays comprise 3.2 million perfect match (PM) probes that perfectly match the genomic sequence
and 3.2 million mismatch (MM) probes whose central base
(position 13 of 25) is substituted by its complement.
4. The head of the tip is ultrathin.
Fig. 3 T/A dsDNA adaptor (see Note 2)
5. To detect well-enriched signals in the ChIP assay and to prevent variation between experiments, plant samples (at least
1 g fresh weight) should be used for sampling. Moreover,
freeze-thaw steps should be avoided as much as possible because
very weak direct protein binding and indirect protein interactions are dissociated as a result of freeze-thaw treatment.
6. To prevent plants from escaping or sticking to the sides of the
beaker due to bubbling of the fixation buffer, carefully control
the speed for the formation of the vacuum by using the three-way valve connected to the vacuum pump.
7. Well-fixed plants will sink to the bottom of the beaker and they
will become darker in color.
8. Well-fixed plants should appear “crispy.”
9. It is necessary to prechill stainless steel tubes, tungsten balls,
the aluminum tube holder, and 50 mM HEPES buffer on ice
before use.
10. To prevent the samples from blowing out, the O-ring on the
metal lid should be changed prior to every experiment and air
bubbles should be eliminated from the parafilm sealing on the
top of the tubes.
11. Minimum sample volume for effective sonication is 15 mL. If
the sample volume is less than 15 mL, foaming will occur during the sonication procedure.
12. Tubes must be kept on ice water.
13. During sonication, samples should be kept in ice water to prevent the warming and foaming of samples.
14. The range of fragment size should be between 150 and 500 bp,
peaking at around 250–300 bp. However, minuscule amounts of
longer size fragments, ordinarily up to 1,000 bp, are produced.
15. It is necessary to wash magnetic beads with 150 mM lysis buffer three times just before use.
16. The amount of antibody added to detect protein–chromatin
interactions is dependent on the antibody titer. The titer for
each antibody should be checked and the amount added
should be optimized by the ChIP PCR assay.
17. Follow the kit instructions. The use of other DNA purification
columns (e.g., Qiagen DNA MiniElute column) in this step is
not recommended because the recovery efficiency of low concentrations of Arabidopsis genomic DNA fragments is poor
using other kits.
18. DNase-/RNase-free water is used to elute purified DNA from
the column instead of the TE or AE buffer that is provided with the kit
for elution, because the carry-over of excess salts inhibits
the small-scale PCR and adaptor ligation reactions.
19. To evaluate the efficiency of ChIP, primers to amplify a region
of DNA that is known to be enriched in the sample should be
designed as follows: Tm 58–62 °C, nucleotide length ~25 nt,
PCR product length 100–250 bp. The sequence should be
specific to the target region of interest. ACT7 region is used as
an internal control for multiplex PCR to analyze enrichment of
histone H4 acetylation.
20. It is recommended to limit the number of PCR cycles to fewer than
27 to guarantee the reliability of quantification.
21. It is necessary to adjust the volume of PCR products applied to
the gel to detect unsaturated bands.
22. Detect the densities of each band using the “Histogram Peak”
measurement detection tool in the ImageQuant software.
Calculate the ratio of enrichment using the following formula:
ratio of enrichment = {(value of band density of a target region
in ChIPed DNA)/(value of band density of ACT7 region in
ChIPed DNA)}/{(value of band density of a target region
in Input DNA)/(value of band density of ACT7 region in
Input DNA)}.
23. ChIPed DNA prepared after this step can also be used as template DNA for ChIP-seq analysis. To amplify the template
DNA for ChIP-seq analysis, it is recommended to use the optimized amplification and sequencing procedures provided by
the manufacturer of each high-throughput sequencer.
24. To guarantee linear PCR amplification, it is recommended to
limit the number of PCR cycles to less than 15.
25. The main peak size of precisely amplified DNA fragments is
shifted from 250–300 to 320–370 bp.
26. If the biotin-labeled cRNAs are not immediately used for
cleanup, store them at −20 or −70 °C.
27. Perform the cleanup of the biotin-labeled cRNAs at room
temperature.
28. If precipitates are formed, the IVT cRNA binding buffer
should be warmed to 30 °C and then maintained at room temperature prior to use.
29. Do not centrifuge the samples after mixing.
30. More than 30 μg of the biotinylated cRNAs should be generated. The ratio (A260/A280) should be between 1.9 and 2.1.
31. RNA fragment size should range from 35 to 200 bp. Store the
fragmented cRNA samples at −70 °C before use for
hybridization.
32. Immediately after the tiling array is returned from storage at
low temperature to room temperature, the rubber of the septa
is hard and can easily crack.
33. Use the pipetman tip “HR-250S” for pushing the septa and
filling hybridization buffer through the septum. Note that
cracking of the septum causes the deposition of hybridization
buffer.
34. Do not use the insoluble materials at the bottom of the tube.
Do not add bubbles onto the array.
35. Thoroughly mix the SAPE solution by tapping before use.
36. For BSA, IgG, and antibody stocks, centrifuge the solution
and use the supernatant for preparation of the antibody solution mix.
37. If the volume of the recovered hybridization cocktail is less
than 170 μL, the center part of the array might not be filled
with the cocktail.
38. Be sure to check that the buffer runs up and down. If the
bubbles stay at the same position, stop the run and manually
refill the array with wash buffer A. When the wrong buffer is
used, the run stops.
39. The scanner should be warmed up at least 15 min prior to
scanning.
40. Be sure to check whether bubbles stay on the array or not. If
bubbles stay on the array, the array should be reset into the
cartridge holder and then resubjected to washing and staining.
However, excess washing causes a loss of signal intensity for
each probe. After washing, the array should be immediately
subjected to scanning. The remaining array should be kept in
the dark at room temperature before scanning.
41. This step is done to prevent leakage of the solution during the
scanning procedure.
42. Use of the latest version of the Arabidopsis genome annotation
is recommended. The latest version is TAIR10 (ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR10_genome_release/).
43. Since our preliminary analyses using the intensities of (PM-MM) generated better results for the identification of stress-responsive genes than those using only PM intensities, we used
the intensities of (PM-MM) for the analyses [25].
44. In our tiling array analysis, the following probes were excluded
from the data analysis: (1) PM probes that perfectly
matched more than two positions and (2) MM probes
that perfectly matched positions different from their original ones.
45. After the quantile normalization, the intensities of all replicates
representing different samples reach a common median. All
normalized intensities for each expressed spot are then averaged among all the replicates of the same sample to obtain a
single statistic value.
46. We only analyzed the probes in the 90th percentile because
probes having dark (no signal) and saturated signals should not
be counted even if they have a consistent order. To detect and
visualize the ChIP-on-chip results, we applied smoothing
using a Parzen window function having a 250 bp width. The
width setting depends on binding features of the target
proteins.
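The smoothing mentioned in Note 46 can be illustrated with a short kernel-smoothing sketch. It uses the standard Parzen window shape and assumes that the stated 250 bp refers to the full width of the window (a half-width of 125 bp on either side of each probe); the chapter does not specify this, nor how the window is applied, so the function names and the weighted-average scheme are assumptions. The quadratic loop is for illustration only and would be too slow for millions of probes.

```python
def parzen_weight(distance_bp, half_width_bp):
    """Parzen window weight for a probe at a given distance from the center."""
    u = abs(distance_bp) / half_width_bp
    if u <= 0.5:
        return 1.0 - 6.0 * u**2 + 6.0 * u**3
    if u <= 1.0:
        return 2.0 * (1.0 - u) ** 3
    return 0.0


def parzen_smooth(positions, values, width_bp=250.0):
    """Weighted average of probe values within the window around each probe.

    positions: genomic coordinates of the probes (bp).
    values: enrichment values at those probes.
    width_bp: full window width (assumed interpretation of Note 46).
    """
    if len(positions) != len(values):
        raise ValueError("positions and values must have the same length")
    half_width = width_bp / 2.0
    smoothed = []
    for center in positions:
        weighted_sum = 0.0
        weight_total = 0.0
        for position, value in zip(positions, values):
            w = parzen_weight(position - center, half_width)
            weighted_sum += w * value
            weight_total += w
        # weight_total is at least 1.0 because each probe weights itself by 1.
        smoothed.append(weighted_sum / weight_total)
    return smoothed


# Hypothetical probe positions (bp) and enrichment values.
print(parzen_smooth([0, 35, 70, 105, 500], [1.0, 3.0, 1.0, 3.0, 5.0]))
```

A wider window gives smoother profiles at the cost of spatial detail, which is why Note 46 ties the width setting to the binding features of the target protein.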
Acknowledgements
This research was supported by The Grant-in-Aid for Scientific
Research (Priority Areas no. 20127033 and 23012036; Innovative
Areas 23119522) from the Ministry of Education, Culture, Sports,
Science and Technology (MEXT) of Japan (to MS) and grants
from the RIKEN Plant Science Center (to MS).
References
1. Wolffe AP (1998) Packaging principle: how
DNA methylation and histone acetylation
control the transcriptional activity of chromatin.
J Exp Zool 282:239–244
2. Jenuwein T, Allis CD (2001) Translating the
histone code. Science 293:683–692
3. Kurdistani SK, Grunstein M (2003) Histone
acetylation and deacetylation in yeast. Nat Rev
Mol Cell Biol 4:276–284
4. Nightingale KP, O’Neill LP, Turner BM (2006)
Histone modifications: signaling receptors and
potential elements of a heritable epigenetic
code. Curr Opin Genet Dev 16:125–136
5. Kouzarides T (2007) Chromatin modification
and their function. Cell 128:693–705
6. Bhaumik SR, Smith E, Shilatifard A (2007)
Covalent modifications of histones during
development and disease pathogenesis. Nat
Struct Mol Biol 14:1008–1016
7. Bártová E et al (2008) Histone modifications
and nuclear architecture: a review. J Histochem
Cytochem 56:711–721
8. Pfluger J, Wagner D (2007) Histone modifications and dynamic regulation of genome accessibility in plants. Curr Opin Plant Biol 10:645–652
9. To TK et al (2011) Arabidopsis HDA6 is
required for freezing tolerance. Biochem
Biophys Res Commun 406:414–419
10. Sokol A et al (2007) Up-regulation of stress-inducible genes in tobacco and Arabidopsis cells
in response to abiotic stresses and ABA treatment
correlates with dynamic changes in histone H3
and H4 modifications. Planta 227:245–254
11. Kim JM et al (2008) Alterations of lysine modifications on the histone H3 N-tail under
drought stress conditions in Arabidopsis thaliana. Plant Cell Physiol 49:1580–1588
12. Kim JM et al (2010) Chromatin regulation
function in plant abiotic stress responses. Plant
Cell Environ 33:604–611
13. Kwon CS et al (2009) Histone occupancy-dependent removal of H3K27 trimethylation
at cold-responsive genes in Arabidopsis. Plant J
60:112–121
14. Katou Y et al (2003) S-phase checkpoint proteins
Tof1 and Mrc1 form a stable replication-pausing
complex. Nature 424:1078–1083
15. Cawley S et al (2004) Unbiased mapping of
transcription factor binding sites along human
chromosomes 21 and 22 points to widespread
regulation of noncoding RNAs. Cell 116:499–509
16. Katou Y et al (2006) Genomic approach for the
understanding of dynamic aspect of chromosome
behavior. Methods Enzymol 409:389–410
17. Lee TL, Johnstone SE, Young RA (2006)
Chromatin immunoprecipitation and microarray-based analysis of protein location. Nat
Protoc 1:729–748
18. Zhang X et al (2006) Genome-wide high-resolution mapping and functional analysis of
DNA methylation in Arabidopsis. Cell 126:1189–1201
19. Zilberman D et al (2006) Genome-wide analysis of Arabidopsis thaliana DNA methylation
uncovers an interdependence between methylation and transcription. Nat Genet 39:61–69
20. Zhang X et al (2007) The Arabidopsis LHP1
protein colocalizes with histone H3 Lys27
trimethylation. Nat Struct Mol Biol
14:869–871
21. Lee J et al (2007) Analysis of transcription factor HY5 genomic binding sites revealed its
hierarchical role in light regulation of development. Plant Cell 19:731–749
22. Morohashi K, Grotewold E (2009) A systems
approach reveals regulatory circuitry for
Arabidopsis trichome initiation by the GL3 and
GL1 selectors. PLoS Genet 5:e1000396
23. Bolstad BM et al (2003) A comparison of normalization methods for high density oligonucleotide array data based on variance and bias.
Bioinformatics 19:185–193
24. Liu CL, Schreiber SL, Bernstein BE (2003)
Development and validation of a T7 based linear amplification for genomic DNA. BMC
Genomics 4:19
25. Matsui A et al (2008) Arabidopsis transcriptome analysis under drought, cold, high-salinity and ABA treatment conditions using tiling
array. Plant Cell Physiol 49:1135–1149
Part V
Cell Biological Techniques
Chapter 23
Fluorescence Microscopy
Sébastien Peter, Klaus Harter, and Frank Schleifenbaum
Abstract
Optical microscopy has developed into an indispensable tool for Arabidopsis cell biology. This is due to the
high sensitivity, good spatial resolution, minimal invasiveness, and availability of autofluorescent proteins,
which can be specifically fused to a distinct protein of interest. In this chapter, we introduce the theoretical
concepts of fluorescence emission necessary to accomplish quantitative and functional cell biology using
optical microscopy. The main focus lies on spectroscopic techniques, which, in addition to intensity-based
studies, provide functional insight into cellular processes.
Key words Fluorescence microscopy, Spectromicroscopy, FRET, Autofluorescent proteins,
Fluorescence sensors
1 Introduction
Modern plant science research on systems such as Arabidopsis
aspires to a precise understanding of molecular processes controlling the function of the plant at a subcellular level. To this end,
several diverse techniques are available, ranging from genetics
through biochemical approaches to electron microscopy with
atomic resolution. In spite of the huge potential of these methods,
they drastically influence the native functionality of a living system
and, thus, real in vivo studies are difficult to obtain with these techniques. This is where optical microscopy comes into play. Being
noninvasive techniques, which are also applicable to cells in their
native tissue, optical approaches allow the undisturbed observation
of cellular function [1]. With a diffraction-limited spatial resolution of around 200 nm, access to subcellular structures is possible
and different functional compartments inside a cell can be distinguished. Moreover, the information content achievable from an
optical measurement can be drastically enhanced when fluorescence emission is used to create a microscopy image. This way, only
those areas of a sample that host specific fluorescent dyes
become visible. This chapter focuses on fluorescence microscopy
Jose J. Sanchez-Serrano and Julio Salinas (eds.), Arabidopsis Protocols, Methods in Molecular Biology, vol. 1062,
DOI 10.1007/978-1-62703-580-4_23, © Springer Science+Business Media New York 2014
and presents different readout modes and specific notes on
experimental parameters.
Fluorescence emission differs from other optical techniques by
the red-shifted emission of a fluorescent dye relative to the excitation wavelength [2]. As a result, fluorescence is in principle
detected against a zero background. Using designated optical
filters, only light actively emitted by a fluorophore contributes to
the microscopy image. This makes fluorescence microscopy one of the most sensitive techniques known so far, which
even allows for the observation of single isolated molecules [3]. To
exploit the full potential of fluorescence microscopy, it is essential
to understand the origin of fluorescence emission. In the following, a short introduction to this field is provided. However, we
suggest more specialized literature for further reading [2, 4–7].
Fluorescence emission is due to an electronic quantum transition in an electronically excited molecule. The principal processes
can be summarized in a Jablonski diagram as depicted in Fig. 1.
Fig. 1 Jablonski diagram schematically illustrating the quantum transitions during fluorescence emission
According to Boltzmann statistics, a molecule at room temperature is,
to a good approximation, in its electronic (S0) as well as its
vibronic ground state. If this molecule interacts with
electromagnetic radiation whose energy matches the energy
gap between the electronic ground state and some vibronic state
of the first electronically excited state (S1), the molecule undergoes
a transition between these states within a few femtoseconds
(absorption). After this excitation, the molecule loses some of
the excitation energy thermally by vibration and collision with
adjacent molecules. This effect, which is commonly referred to as
thermal equilibration (TE), occurs on a sub-picosecond timescale.
After TE, the molecule is trapped in the vibronic ground state of
the electronically excited state for a certain time until further
relaxation into the electronic ground state occurs. Here, one has to
distinguish two competing mechanisms: a non-radiative one, which is
not accompanied by light emission, and a radiative one, which is
commonly referred to as fluorescence. The probability of these
relaxation processes can be expressed by the overall relaxation
rate $\Gamma = \Gamma_{\mathrm{rad}} + \Gamma_{\mathrm{nonrad}}$.
The radiative rate $\Gamma_{\mathrm{rad}}$ expresses the
fluorescence photon flux per time interval, and its reciprocal value
$\tau = 1/\Gamma_{\mathrm{rad}}$ represents the typical time span a molecule is trapped
in the electronically excited state before fluorescence occurs. This
so-called fluorescence lifetime (FLT) is an important spectroscopic
parameter, which provides valuable insight into functional cell
biology, as will be discussed later on. The radiative transition can
reach any vibronically excited level of the S0 state, which gives rise
to the shape of fluorescence spectra: they do not consist of a
single line but are composed of a number of broad bands.
Each of these bands corresponds to a transition to a vibronic state
from which, in turn, non-radiative TE occurs. Albeit rather broad,
the fluorescence spectrum of a given molecule is characteristic and
can be used to identify a distinct emitter among others. Moreover,
fluorescence emission is not a mere property of the fluorescent
dye but is also influenced by its local chemical nano-environment.
As a consequence, the fluorescence spectroscopic information can
be used as a probe for the local surroundings of a fluorophore and
can, hence, provide information about distinct changes in physicochemical parameters, such as the pH or the redox potential. Before
these options are discussed, a basic introduction to modern
fluorescence microscopes is provided [8]. Note that, in addition to
the processes discussed above, the molecule can also undergo a
transition to a different electron spin configuration, the triplet
state, via intersystem crossing (ISC). From this state, delayed emission can occur, which is commonly referred to as phosphorescence.
However, as phosphorescence does not occur significantly for the fluorescent dyes used in cell biology, this effect will not be treated
further.
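The relationship between the relaxation rates described above can be made concrete with a small numerical sketch. It reports the overall rate as the sum of the radiative and non-radiative rates and the reciprocal of the radiative rate as defined in the text. Two further quantities are included as assumptions that the chapter does not state: the reciprocal of the overall rate, which is commonly taken as the decay time actually observed when non-radiative relaxation competes, and the fluorescence quantum yield. The function name and the example rates are hypothetical.

```python
def relaxation_summary(gamma_rad, gamma_nonrad):
    """Summarize excited-state relaxation for given rates (e.g., in 1/ns)."""
    if gamma_rad <= 0 or gamma_nonrad < 0:
        raise ValueError("gamma_rad must be > 0 and gamma_nonrad must be >= 0")
    gamma_total = gamma_rad + gamma_nonrad
    return {
        # Overall relaxation rate: radiative plus non-radiative, as in the text.
        "gamma_total": gamma_total,
        # Reciprocal of the radiative rate, as defined in the text.
        "tau_radiative": 1.0 / gamma_rad,
        # Assumption (not from the chapter): observed decay time 1 / gamma_total.
        "tau_observed": 1.0 / gamma_total,
        # Assumption (not from the chapter): fraction of decays emitting a photon.
        "quantum_yield": gamma_rad / gamma_total,
    }


# Hypothetical rates in 1/ns, for illustration only.
print(relaxation_summary(gamma_rad=0.25, gamma_nonrad=0.25))
```

Under these assumptions, a change in the non-radiative rate caused by the local environment shortens or lengthens the observed decay time, which is consistent with the use of the fluorescence lifetime as an environmental probe.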
2 Confocal Microscopy
FLT and fluorescence spectra are the most prominent spectroscopic characteristics, which can be read out with spatial resolution
in addition to a fluorescence intensity image. Therefore, it is crucial to record the spectroscopic information in a very well-defined
spatial area and to exclude cross talk from other regions. Confocal
laser scanning microscopy (CLSM) is one very prominent approach
to reach this goal [1]. The basic principle of CLSM, which was
invented by Marvin Minsky in 1957, is straightforward and consists of three confocal spots for the (1) excitation, (2) light collection, and (3) detection, as depicted schematically in Fig. 2a [9].
Due to the confocal arrangement, only one highly confined sample
area, the focal spot, is irradiated and the fluorescence light is only
collected from this defined area. The third confocal plane, which
typically consists of a pinhole, blocks any light, which does not
originate from the focal spot. Moreover, in contrast to a conventional microscope, a confocal image is confined to a well-defined
image plane. Light that can pass the pinhole in the x- and y-directions
but is not tightly focused in the z-direction (see Fig. 2c for axis
assignment) will result in a broadened spot in the pinhole plane.
Accordingly, the intensity is spread over a broader area and only a
small fraction is directed to the detector. These considerations
yield the basic concept of point-to-point imaging. This, in turn,
requires raster scanning of the focal spot relative to the sample and
rearrangement of the intensity information for every image point
(i.e., pixel) to obtain a microscopy image.
Fig. 2 (a) Scheme of the confocal principle. The indices i, ii, and iii refer to the respective confocal focusing elements. (b) Confocal beam path using one focusing element for both excitation and collection. (c) Scheme of a typical confocal setup
The scheme depicted in Fig. 2a is not very convenient, though.
This is mainly because the maximum thickness of the sample is
limited and alignment is difficult because two foci from different
lenses have to be aligned to exactly the same spot. Hence, typical
confocal configurations use the same lens for both excitation and
detection. This is achieved by introducing a dichroic beam-splitter
into the beam path as shown in Fig. 2b. This component reflects
light below a certain cutoff wavelength and transmits radiation of
longer wavelengths. This way, the excitation light is effectively
directed onto the sample, while the fluorescence light can pass to
the detector without disturbance. Current high-end confocal setups
use acousto-optical beam splitters instead of dichroic mirrors.
Offering basically the same functionality, they are highly flexible in
varying the cutoff wavelength and can electronically be adjusted to
any fluorescent dye system without the need of changing optical
parts. Figure 2c shows a scheme of a typical confocal setup,
equipped with a spectrally integrating detector and an additional
spectrometer attached to a CCD camera, which allows the confocal
acquisition of fluorescence spectra. As the blocking efficiency of
the dichroic beam splitter is too low to reject all excitation
light reflected from the sample, additional long-pass or band-pass
filters are introduced in the detection beam path.
One important parameter in basically any microscopy application is the spatial resolution Δd. For optical microscopy, the resolution is limited due to diffraction of light waves in the focal spot
according to
$$\Delta d = \frac{\lambda}{2 \cdot n \cdot \sin\theta} = \frac{\lambda}{2 \cdot \mathrm{NA}} \tag{1}$$
with θ being the half opening angle of the focusing lens and n the
refractive index of the medium between lens and sample [10]. For
convenience, the product of n and sin θ is often written as the
numerical aperture (NA) of the focusing element. Equation 1
shows that the optical resolution is physically limited and scales
directly with the wavelength and inversely with the NA. As a consequence, confocal microscopy typically
uses microscope objectives with a high numerical aperture as
focusing elements. Theoretically, the maximum value for the NA
would be 1 in air, but such opening angles of 180° cannot be
achieved with lens systems. However, the NA can effectively be
enhanced by introducing immersion liquids. These are substances,
which are highly transparent in the optical spectral regime to
provide maximum transmission for fluorescence light and offer a
refractive index significantly higher than 1. Typical immersion
liquids are water (n = 1.33) or specific immersion oils (n = 1.51).
This way, microscope objectives with an NA of 1.35 and a magnification factor of 100 are available, and even objectives with
NA = 1.46 are used for some special applications [9, 11]. See Note
7 for a short guide to selecting a suitable objective for a given
experiment. Using Eq. 1, one obtains a maximal resolution of
Δd ≈ 185 nm for blue light (500 nm) and an objective with
NA = 1.35. It is important to note that the magnification of an
objective does not directly influence the obtainable spatial resolution. For a lower magnification, the confocal pinhole size, which
has to match the diameter of the focus Δd, is simply larger. Small
magnification factors are nevertheless avoided, because a lower magnification is
typically accompanied by a larger working distance between
objective and sample, which translates into a smaller NA.
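As a rough numerical illustration of Eq. 1, the following Python sketch evaluates the diffraction limit for a given wavelength and NA. The function names are chosen here for illustration only and do not belong to any microscope software; the air objective in the second call is a hypothetical example.

```python
import math


def numerical_aperture(refractive_index, half_angle_deg):
    """NA = n * sin(theta), with theta the half opening angle in degrees."""
    return refractive_index * math.sin(math.radians(half_angle_deg))


def diffraction_limit_nm(wavelength_nm, na):
    """Spatial resolution according to Eq. 1: delta_d = lambda / (2 * NA)."""
    if wavelength_nm <= 0 or na <= 0:
        raise ValueError("wavelength and NA must be positive")
    return wavelength_nm / (2.0 * na)


if __name__ == "__main__":
    # Example from the text: 500 nm light and an objective with NA = 1.35
    print(round(diffraction_limit_nm(500.0, 1.35)))  # 185
    # Hypothetical air objective (n = 1) with a 64 degree half opening angle
    na_air = numerical_aperture(1.0, 64.0)
    print(round(diffraction_limit_nm(500.0, na_air)))
```

Comparing the two calls shows how strongly the immersion medium, through the NA, affects the attainable resolution at a fixed wavelength.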
Recent efforts have been made to circumvent the diffraction-limited
optical resolution, and different methods such as photoactivated
localization microscopy (PALM, also referred to as stochastic optical
reconstruction microscopy, STORM) [12–15], stimulated emission
depletion (STED) microscopy [16, 17], or super-resolution optical
fluctuation imaging (SOFI) [18] have been established. These
techniques are very powerful and lift optical microscopy close to a
molecular level. Yet, there has been hardly any application
to plant cells so far. This is mainly due to properties of plant cells
such as the strong autofluorescence, which hampers the detection
of single emitters inherent in PALM/STORM, and to the high laser
powers required in STED, which deposit too much energy in the
sample. Hence, these techniques are not discussed in this chapter and
we refer to more specialized literature.
3 Fluorescence Readout Modes
Besides the intensity information, the fluorescence emission also carries information about the local environment of the fluorophore [8].
In the following section, the nature of this information, and how it
can be measured, is introduced. The determination of the FLT τ is
of special interest [4, 19]. This value characterizes the probability
that a dye molecule emits a photon within a given time window after
excitation. Any disturbance of the excited state of the molecule will
result in a change of this probability [2]. The origins of these disturbances are manifold, ranging from mechanical stress to changes
in the refractive index, the local pH value, or the electric field [20–22].
Fluorescence dyes differ in their sensitivity to changes of these
parameters; with a suitably sensitive dye, the FLT can thus be used as a
valuable local probe.
The decay of the excited electronic state is significantly slower
than processes like thermal equilibration, but is still typically in
the nanosecond time range. Hence, fairly sophisticated data acquisition techniques and electronics are required. The most prominent approach to determine τ, which is utilized in the vast majority
of confocal fluorescence microscopes, relies on the statistical analysis of the photon arrival times. This time-correlated single-photon
counting (TCSPC) uses a short laser pulse, with a pulse length in
the picosecond range well below the FLT, to locally excite the sample. Synchronously, an electronic stopwatch is started, which is
stopped by the first fluorescence photon detected by a highly sensitive detector such as an avalanche photodiode (APD) or a
photomultiplier tube (PMT) operated in photon counting mode. The
arrival time is translated into a discrete time value by an analogue-to-digital converter and histogrammed. The procedure is repeated
several thousand times, resulting in an intensity decay time
histogram [23]. As the histogram describes the probability of the
metastable excited state to decay into the ground state, it can be
treated mathematically like a radioactive decay or a chemical reaction with first-order kinetics. Hence, the time evolution of the
fluorescence intensity I(t) obeys an exponential decay with a time
constant τ, defined as the time span after which the intensity has
dropped to 1/e of its initial value I0. Accordingly, the fluorescence decay histogram can be described by
$$I(t) = I_0 \cdot \exp\left(-\frac{t}{\tau}\right) \tag{2}$$
Equation 2 can be rearranged to read τ = t / ln(I0 / I(t)), so that
the FLT can be extracted directly. Knowledge of the FLT can
give useful information about the local environment of the chromophore under investigation. See Notes 4–6 for experimental
settings and possible pitfalls in FLT measurements.
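The TCSPC principle and the extraction of τ from a monoexponential decay (Eq. 2) can be sketched with simulated data. The Python example below is a minimal illustration under stated assumptions, not the acquisition or fitting routine of any commercial FLIM software: photon delays are drawn from an exponential distribution, binned into a histogram, and τ is estimated from a straight-line fit to ln(counts). The function names, bin width, time window, and count threshold are hypothetical choices.

```python
import math
import random


def simulate_tcspc(tau_ns, n_photons, bin_ns=0.1, window_ns=15.0, seed=1):
    """Histogram simulated photon arrival times for a monoexponential decay."""
    rng = random.Random(seed)
    counts = [0] * int(round(window_ns / bin_ns))
    for _ in range(n_photons):
        t = rng.expovariate(1.0 / tau_ns)  # delay between pulse and photon
        idx = int(t / bin_ns)
        if idx < len(counts):  # photons outside the time window are dropped
            counts[idx] += 1
    return counts


def fit_lifetime_ns(counts, bin_ns=0.1, min_counts=50):
    """Least-squares line through ln(counts) versus t; tau = -1 / slope."""
    # Assumption: sparsely filled bins are skipped to limit noise in the tail
    pts = [((i + 0.5) * bin_ns, math.log(c))
           for i, c in enumerate(counts) if c >= min_counts]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    return -sxx / sxy


if __name__ == "__main__":
    histogram = simulate_tcspc(tau_ns=2.5, n_photons=200000)
    print("estimated lifetime (ns):", fit_lifetime_ns(histogram))
```

A log-linear fit is used here only because it needs no external library; for real data, weighted or maximum-likelihood fits that account for the instrument response are generally preferable.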
Fluorescence microscopy of Arabidopsis and other plant cells
generally suffers from autofluorescence background. The strong
emission from chloroplasts is found exclusively in the red spectral
region and can, hence, be filtered out by short-pass or band-pass
filters with acceptable spectral bleed-through. In contrast, unspecific emission from other compartments and especially from the
cell wall exhibits a strong spectral and temporal overlap with the
emission properties of typical fluorescence dyes. As a consequence,
this autofluorescence contribution cannot easily be filtered out
with conventional methods. Hence, other approaches are needed
to discriminate the autofluorescence from the specific
label signal. One very robust approach utilizes the statistics of the
species contributing to a local fluorescence decay recorded with a
standard TCSPC-FLIM setup. This fluorescence intensity decay
shape analysis microscopy (FIDSAM) offers a robust means to discriminate background from target emission [24–26]. To this end,
the shape of the fluorescence decay is compared to a reference
function, i.e., a monoexponential fit function. In case of only pure
label dye contributing to the measured fluorescence, the decay
signal can be well described by the reference function, and the
resulting error value, which represents the deviation of the fitted
curve from the experimental curve, is small. In contrast, autofluorescent tissue consists of a multitude of unspecific emitters, each of which exhibits
its individual fluorescence decay statistics. As a consequence, the
recorded fluorescence decay represents the sum of a large number
of decay statistics and, thus, becomes multiexponential. Using the
reference function to describe this multiexponential decay will
result in relatively large error values. The error value thus
provides a quantitative measure of the autofluorescence
contribution to a signal recorded with spatial resolution. Hence,
multiplying the original intensity value of an image pixel by
the inverse error value dims this pixel in the
FIDSAM-corrected image according to its autofluorescence
contribution.
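The weighting idea described above can be illustrated with simulated decays. The following Python sketch is not the published FIDSAM implementation: the error metric (a reduced chi-square of a log-linear monoexponential fit), the function names, and all numerical settings are assumptions made for this example only. A pure label is modeled by a single lifetime, autofluorescence by a mixture of two lifetimes.

```python
import math
import random


def simulate_decay(lifetimes_ns, n_photons, bin_ns=0.1, window_ns=20.0, seed=0):
    """Decay histogram for an equal mixture of emitters with given lifetimes."""
    rng = random.Random(seed)
    counts = [0] * int(round(window_ns / bin_ns))
    for _ in range(n_photons):
        tau = rng.choice(lifetimes_ns)  # each photon stems from one species
        idx = int(rng.expovariate(1.0 / tau) / bin_ns)
        if idx < len(counts):
            counts[idx] += 1
    return counts


def monoexp_fit_error(counts, bin_ns=0.1, min_counts=20):
    """Reduced chi-square of a monoexponential reference fit (assumed metric)."""
    pts = [((i + 0.5) * bin_ns, c) for i, c in enumerate(counts) if c >= min_counts]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(math.log(c) for _, c in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (math.log(c) - mean_y) for t, c in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    chi2 = 0.0
    for t, c in pts:
        model = math.exp(intercept + slope * t)
        chi2 += (c - model) ** 2 / model  # Poisson variance taken as the model value
    return chi2 / (n - 2)


def fidsam_corrected_intensity(counts):
    """Pixel intensity multiplied by the inverse error value."""
    return sum(counts) / monoexp_fit_error(counts)


if __name__ == "__main__":
    label = simulate_decay([2.5], 100000)             # monoexponential label dye
    background = simulate_decay([0.5, 4.0], 100000)   # multiexponential background
    print("error, label pixel:     ", monoexp_fit_error(label))
    print("error, background pixel:", monoexp_fit_error(background))
    print("corrected intensities:  ", fidsam_corrected_intensity(label),
          fidsam_corrected_intensity(background))
```

Both simulated pixels have nearly the same raw intensity, but the multiexponential pixel yields a much larger error value and is therefore strongly attenuated after weighting.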
The most prominent feature of the FIDSAM technique is its
robustness and applicability to basically any label dye without any
prior assumptions. The basic concept rests on the valid assumption that
the brightness of a well-designed dye molecule, which is the product of the extinction coefficient at a given excitation wavelength
and its fluorescence quantum yield, exceeds the brightness of an
arbitrary autofluorescent biomolecule. Accordingly, the impact of
a single molecule contributing to the autofluorescence background
is small compared to a fluorescence dye. This way, the relative contribution to the local fluorescence decay is higher for (bright) fluorescence dyes and, hence, as soon as these compounds contribute
to a measured fluorescence decay, they dominate the shape of the
decay curve.
3.1 Optical Protein–Protein Interaction Studies
Besides localization studies, which clarify the appearance of a distinct protein of interest in a certain cellular compartment, the investigation of the interaction of two or more proteins is a major field
in modern cell and molecular biology. A general concept incorporates protein–protein interaction as one of the major players in signal transduction, regulation of enzymatic activities, gene regulation,
etc. Protein–protein interactions can initiate different phosphorylation states, can block distinct binding sites in a competitive fashion
or in a noncompetitive way, or can directly influence transcription
factors and, hence, regulation of gene expression [27].
The identification and quantification of protein–protein interactions in the living cell context are, therefore, of high interest.
Unfortunately, mere imaging of protein distribution cannot lead
to reasonable results due to the limited spatial resolution, which is
restricted to ~200 nm. Nevertheless, optical fluorescence microscopy offers two concepts which circumvent this restriction and
combine the advantages of optical microscopy, such as
non-invasiveness and dynamical readouts, with the possibility to identify molecular interaction on a nanometer scale.
3.1.1 Fluorescence Resonance Energy Transfer
Fluorescence resonance energy transfer (FRET), which was first
described by Theodor Förster in 1948 [28], exploits the
distance dependence of the electromagnetic coupling of two dye
molecules in the optical near field. FRET requires a pair of dyes
consisting of a “donor” and an
“acceptor.” The dyes are chosen in a way that the absorbance spectrum of the acceptor reasonably overlaps with the fluorescence
emission spectrum of the donor. Given that the dyes are closely
adjacent and properly oriented with respect to each other, energy,
which has been used to excite the donor, can be transferred
non-radiatively to the acceptor. Due to the postulate of conservation of
energy, this energy transfer causes the donor to be quenched into
the electronic ground state while the acceptor is transferred to the
excited state. The acceptor may return into the ground state in a
conventional manner such as by fluorescence. As will be discussed
later on, there are different techniques to determine the FRET
efficiency based on intensity readouts or using time domain techniques (FRET-FLIM).
A theoretical treatment of FRET considers the two dyes as
oscillating dipoles acting as transmitting and receiving antenna [2].
This treatment comprises a set of equations, which account for
the interchromophoric distance of the two emitters r, the relative
orientation of their transition dipole moments κ², the fluorescence
quantum yield of the donor chromophore QD, and the overlap
integral of donor emission and acceptor absorption J(λ) to describe
the energy transfer efficiency E according to
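$$E = \frac{R_0^6}{R_0^6 + r^6} \tag{3}$$
with
$$R_0^6 = \frac{9000\,(\ln 10)\,\kappa^2 Q_D}{128\,\pi^5 N n^4}\, J(\lambda) \quad \text{and} \quad J(\lambda) = \frac{\int_0^\infty F_D(\lambda)\,\varepsilon(\lambda)\,\lambda^4\,d\lambda}{\int_0^\infty F_D(\lambda)\,d\lambda}$$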
N is Avogadro’s constant and n represents the refractive index;
FD(λ) denotes the donor emission spectrum and ε(λ) the acceptor extinction coefficient.
R0 is a system constant, the Förster radius, and describes the interchromophoric distance at which 50 % of the excitation energy is transferred from the
donor to the acceptor. This parameter is of practical relevance as it
is commonly used to describe a FRET pair. Equation 3 reveals an
inverse sixth-power distance dependence of the FRET efficiency. As a consequence, FRET is most effective for chromophores in close contact, and the efficiency typically decays to less than 1 % within 10–15 nm,
depending on the actual dye system. Hence,
FRET can be used as a nano-ruler to determine interchromophoric distances and their changes on a length scale two orders of magnitude
two interacting proteins in close proximity, which are labeled with
an appropriate donor–acceptor pair, will cause FRET to occur,
whereas proteins, which are located in the same compartment but
do not interact, will exhibit much less or rather no FRET activity
[29–32].
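The steep distance dependence of Eq. 3 is easy to explore numerically. The Python sketch below evaluates the FRET efficiency for a given distance and inverts the relation to obtain a distance from a measured efficiency. The Förster radius of 5 nm is a hypothetical value chosen for illustration, and the function names are not taken from any library.

```python
def fret_efficiency(r_nm, r0_nm):
    """Eq. 3: E = R0^6 / (R0^6 + r^6)."""
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("r must be non-negative and R0 positive")
    return r0_nm ** 6 / (r0_nm ** 6 + r_nm ** 6)


def distance_from_efficiency(efficiency, r0_nm):
    """Eq. 3 solved for r: r = R0 * (1/E - 1)^(1/6), valid for 0 < E <= 1."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    r0 = 5.0  # hypothetical Foerster radius in nm
    for r in (2.5, 5.0, 7.5, 10.0):
        print(r, "nm ->", fret_efficiency(r, r0))
    # Distance at which the efficiency has dropped to 1 %
    print(distance_from_efficiency(0.01, r0))
```

For this assumed Förster radius, the efficiency falls to 1 % at slightly more than twice R0, which is in line with the 10–15 nm range quoted above.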
Whereas qualitative FRET-studies can distinguish interacting
from noninteracting proteins, quantitative FRET uses the full
potential of the method to determine real interchromophoric distances with nanometer accuracy. For example, different binding
domains, which lead to a different composition of a protein dimer,
can be differentiated. To date, several approaches are known to
determine the FRET efficiency and by that the interchromophoric
distance. In the following, the most common approaches and their
limitations are presented.
The most straightforward approach relies on an intensity-based
data evaluation, utilizing the quenched donor emission FDA
relative to the donor fluorescence when no acceptor is present FD:
$$E = 1 - \frac{F_{DA}}{F_D} \tag{4}$$
The main restriction of this approach is that it requires the
knowledge of the unquenched donor emission intensity [2, 33]. This
parameter is not always accessible, especially because intensity studies
obtained from different plant cells are often not comparable. Another
intensity-based approach for FRET detection relies on sensitized
acceptor emission, where the detection channel is chosen to meet the
acceptor emission wavelength. This approach often suffers from cross
talk caused by donor emission, which leaks into the acceptor channel
or directly excites the acceptor. Moreover, quantification is difficult,
as the absolute acceptor emission intensity for the highest FRET efficiency E = 1 remains unknown. The two intensity-based methods can
significantly be improved when two detectors, matching the emission
of the donor and the acceptor, respectively, are installed. Using this
configuration, the FRET efficiency can directly be determined by a
ratiometric measurement according to
$$E = \frac{F_A / \varphi_A}{F_{DA} / \varphi_D + F_A / \varphi_A} \tag{5}$$
with FA as the acceptor fluorescence intensity and FDA as the donor
intensity. φA and φD represent the fluorescence quantum yields of
the acceptor and the donor, respectively.
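The two intensity-based readouts (Eqs. 4 and 5) translate directly into code. The following Python sketch is only an illustration; the function names and the intensity and quantum yield values are hypothetical, and background subtraction as well as cross-talk correction are assumed to have been carried out beforehand.

```python
def efficiency_donor_quenching(f_da, f_d):
    """Eq. 4: E = 1 - F_DA / F_D (donor intensity with and without acceptor)."""
    if f_d <= 0:
        raise ValueError("unquenched donor intensity must be positive")
    return 1.0 - f_da / f_d


def efficiency_ratiometric(f_a, f_da, phi_a, phi_d):
    """Eq. 5: E = (F_A / phi_A) / (F_DA / phi_D + F_A / phi_A)."""
    acceptor_term = f_a / phi_a   # acceptor intensity scaled by its quantum yield
    donor_term = f_da / phi_d     # donor intensity scaled by its quantum yield
    return acceptor_term / (donor_term + acceptor_term)


if __name__ == "__main__":
    # Hypothetical intensities in arbitrary units
    print(efficiency_donor_quenching(f_da=600.0, f_d=1000.0))
    print(efficiency_ratiometric(f_a=300.0, f_da=600.0, phi_a=0.6, phi_d=0.6))
```

Note that the ratiometric form needs both channels from the same sample, whereas the donor-quenching form needs a separate donor-only reference.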
However, while this approach is suited for quantitative determination of FRET, it suffers from intrinsic limitations inherent in
intensity-based quantitative analysis of fluorescence studies. The
main problem of this readout modality is the uncertainty in absolute
donor or acceptor concentrations. These uncertainties may be
caused by incomplete protein-labeling, e.g., due to imperfect
expression of the fluorescent protein tag or partial degradation of
the fusion protein, or by photobleaching. Moreover, slight misalignments of the optical setup strongly affect the obtained values
of the energy transfer efficiency.
A more sophisticated intensity-based approach to quantitative
FRET relies on gradual acceptor photobleaching, where the acceptor of a FRET pair is bleached by direct resonant excitation and the
recovery of the quenched donor emission is monitored [34]. While
this readout technique works fine on a single-molecule level, where
photobleaching of the acceptor can be precisely visualized, the
application to bulk samples incorporating several FRET-pairs as is
typically the case in functionally labeled biological samples poses
some restrictions. Mainly, it is not trivial to ensure the complete
photobleaching of all acceptors in the detection volume. As a consequence, not all donor emission is recovered and the obtained
values for the FRET efficiency tend to be too low. Accordingly,
there is a strong need of alternative readouts for a quantitative
description of FRET.
One prominent technique utilizes time domain spectroscopy
to analyze the fluorescence energy transfer (FRET-FLIM) [30, 31, 35].
Using a TCSPC setup (see Subheading 3), the radiative rates of the
transition from the excited to the ground state of a fluorophore are
investigated. If energy transfer to an acceptor occurs, an additional
relaxation channel for the donor to lose its excitation energy opens.
Accordingly, the radiative transition has to compete with an additional non-radiative pathway, causing the donor FLT to be shifted
to shorter time values. This reduction in the FLT is connected to
the energy transfer efficiency and can be quantified according to
$$E_{ET} = 1 - \frac{\tau_{DA}}{\tau_D} \tag{6}$$
with τD as the donor FLT in the absence of an acceptor and τDA as
the donor FLT, when energy transfer can occur.
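Equation 6 can be evaluated with a few lines of Python. In the sketch below, the lifetimes are hypothetical example values, the function names are illustrative, and the conversion to a distance simply combines Eq. 6 with Eq. 3, expressed in units of the Förster radius R0.

```python
def fret_efficiency_from_lifetimes(tau_da_ns, tau_d_ns):
    """Eq. 6: E_ET = 1 - tau_DA / tau_D."""
    if tau_d_ns <= 0 or tau_da_ns < 0 or tau_da_ns > tau_d_ns:
        raise ValueError("expected 0 <= tau_DA <= tau_D and tau_D > 0")
    return 1.0 - tau_da_ns / tau_d_ns


def distance_in_units_of_r0(efficiency):
    """Eq. 3 solved for r / R0 = (1/E - 1)^(1/6), valid for 0 < E <= 1."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    return (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    # Hypothetical donor lifetimes: 2.5 ns without and 2.0 ns with acceptor
    e_et = fret_efficiency_from_lifetimes(tau_da_ns=2.0, tau_d_ns=2.5)
    print("E_ET =", e_et)
    print("r / R0 =", distance_in_units_of_r0(e_et))
```

The example treats the quenched donor decay as if it had a single lifetime, which is a simplification.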
According to Eq. 6, the time domain approach provides one discrete number, i.e., the quenched donor FLT, to precisely determine
the FRET efficiency. However, this method also requires careful
data analysis, since the fluorescence intensity decay of the quenched
donor intrinsically obeys a bi- or even higher-order multiexponential
decay function. Accordingly, data fitting must be accomplished in a
careful manner and it has to be taken into account that the individual
amplitudes and decay time constants are coupled parameters. There
are efforts to circumvent this limitation, for example by recording the
acceptor rise time [36, 37]. This very promising approach, however,
lacks sensitivity and can only give reliable results for lower energy
transfer efficiencies. For this reason, most time domain FRET studies
mainly rely on the analysis of the FLT of the quenched donor chromophore. This is valid as long as relative statements in a semiquantitative manner are made or if the second time component of the donor
decay function is kept constant for different transfer efficiencies. A
more detailed discussion can be found in ref. [33].
FLT imaging is a powerful tool for the quantitative determination of FRET processes. However, for an evaluation of the FRET
efficiency, the knowledge of the FLT of the FRET donor in the
absence of the FRET acceptor is required. Therefore, control samples are required where no FRET occurs. The FLT is an intrinsic
and specific property for a given chromophore, however only in a
defined environment. In a cell, this means that the observed FLT of
a chromophore such as an autofluorescent protein (AFP) may considerably deviate between different measurements depending on
the fusion partner of the fluorescent protein, its cellular localization, pH value, ionic strength, and more factors. Therefore, the
control sample should differ from the actual samples only in that no
specific interaction with an acceptor and therefore no FRET occur.
All other parameters, such as the protein to which the fluorescent
protein is attached and its localization, should be kept constant to
avoid data misinterpretation. At very high expression levels of
fluorescent proteins, some FRET may occur because the high concentrations of the target proteins entail a small probability that a
donor and an acceptor approach closely enough for FRET to occur.
In those cases, a control where the FRET donor is expressed along
with a FRET acceptor that lacks its fusion partner while still targeted to the appropriate cellular compartment (e.g., via a nuclear
localization signal) may be more suitable than the measurement of a
FRET donor alone. This way, FRET activity without specific protein–protein interaction can be singled out.
While FRET-FLIM is valuable for studies such as listed above,
even a qualitative interpretation of protein–protein interaction studies becomes difficult for very low energy transfer efficiencies.
Unfortunately, many biological studies incorporate large interacting
proteins which force the donor and acceptor chromophores relatively far apart, even for a positive interaction. As a consequence, the FRET efficiency is very low and, according to Eq. 6,
the reduction of the donor FLT is only marginal. Together with
local inhomogeneities of the FLT caused by the individual nanoenvironment sensed by the chromophore, it is often difficult to
judge whether a protein–protein interaction is positive or not. Here, the
FIDSAM technique can also be applied, as it uncovers FRET activity
due to the inherent multiexponential decays in FRET-active sample
regions. This way, even marginal reductions of the donor FLT due
to FRET can be discriminated from FLT reductions caused by environmental factors such as the pH value [38]. For a more sophisticated protocol using multiple FRET systems, see Note 1.
FLIM is a method that generates almost no false-positive
results; however, there is a risk of getting false negatives. The
apparent FLT of the FRET donor may be assigned to a free donor
although an interaction with a second protein with a fused acceptor actually occurs in the following cases:
Cause: Expression level of the FRET donor much lower than expression level of the acceptor
Remarks: The measured FLT is a mixture of free FRET donor and FRET donor with an acceptor bound to it. Depending on the stoichiometry, this value may be quite close to the FLT of a free FRET donor. Therefore, the expression levels of donor and acceptor chromophore should be comparable

Cause: Large interaction partners
Remarks: The FRET donor FLT may not be shortened although its fusion partner interacts with an acceptor-bound protein when the target proteins are so large that the two fluorophores are separated by a distance well beyond the Förster radius. It may be useful to attach the fluorescent labels to different sites of the proteins in order to check different sterical arrangements of the two proteins of interest

Cause: Blocking of interaction sites by fluorescent protein
Remarks: An interaction between two target proteins may be frustrated when the attached fluorescent proteins impose a sterical barrier that blocks interaction sites within the proteins. For overcoming this, see above

Cause: Proteolytic cleavage at the linker between target protein and fluorescent protein
Remarks: The fluorescent protein may be cleaved from its fusion partner in vivo. This can be evaluated by checking the size of a fusion protein by downstream methods such as SDS-PAGE with subsequent western blot. If the fluorescent protein is cleaved from its fusion partner, vectors with different linkers may be used or the attachment site of the fluorescent protein changed
3.1.2 Bimolecular Fluorescence Complementation
Besides the FRET analysis discussed above, there is another prominent
optical technique to determine protein–protein interaction in vivo,
which relies on bimolecular fluorescence complementation
(BiFC) [39–41]. The BiFC approach utilizes a special property of AFPs,
which will be introduced in detail in Subheading 4.1.
3.2 In Vivo Diffusion Studies
To understand the dynamics of cellular function, the investigation
of protein mobility inside the cell is of high importance. This can
be achieved by diffusion studies of distinct fluorescent-labeled proteins. The most accurate techniques to obtain precise diffusion
coefficients are fluorescence correlation spectroscopy (FCS) methods [42]. These techniques are quite sophisticated and typically
require extensive data acquisition times of several hours. Moreover,
their application to Arabidopsis and other plant cells has only been
demonstrated in exceptional cases. This is mainly because FCS
intrinsically relies on the detection of single emitters and the signal-to-noise ratio drastically decreases in case of background contribution. However, there is another technique, which provides access
to molecular diffusion in a living tissue context. This method uses
fluorescence recovery after photobleaching (FRAP) and can basically be accomplished with any commercial fluorescence microscope [43]. In FRAP, an intensity map is recorded in a defined region
of interest (ROI). In a next step, the ROI is irradiated with a
high-power laser source, typically operating in a pulsed mode to obtain
high power densities. This way, fluorescence dyes in this area are
irreversibly transferred into a nonfluorescent state due to photobleaching and the intensity in the ROI drops to background level.
The fluorescence intensity in the ROI then recovers with time due to
molecular diffusion. By recording the evolution of the fluorescence
intensity until a steady-state level is reached (that is, until the fluorescently labeled proteins are distributed homogeneously), the diffusion coefficient D can be deduced directly.
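How D is obtained from a recovery curve depends on the bleach geometry and the diffusion model, which this chapter does not specify. The Python sketch below therefore rests on explicit assumptions: pure two-dimensional diffusion, a uniformly bleached circular spot of radius w, and the commonly used approximation D ≈ 0.224·w²/t½ (often attributed to Soumpasis), where t½ is the half-recovery time. The function names and the synthetic recovery curve are hypothetical.

```python
import math


def half_recovery_time(times_s, intensities, plateau_points=20):
    """Time after bleaching at which half of the recovery is reached."""
    i0 = intensities[0]  # first value after the bleach pulse
    plateau = sum(intensities[-plateau_points:]) / plateau_points
    if plateau <= i0:
        raise ValueError("no fluorescence recovery detected")
    target = i0 + 0.5 * (plateau - i0)
    for k in range(1, len(times_s)):
        if intensities[k] >= target:
            # Linear interpolation between the two neighboring time points
            t_lo, t_hi = times_s[k - 1], times_s[k]
            y_lo, y_hi = intensities[k - 1], intensities[k]
            return t_lo + (target - y_lo) * (t_hi - t_lo) / (y_hi - y_lo)
    raise ValueError("recovery never reaches half of the plateau")


def diffusion_coefficient_um2_per_s(bleach_radius_um, t_half_s):
    """Assumed approximation for a uniform circular spot: D = 0.224 w^2 / t_half."""
    return 0.224 * bleach_radius_um ** 2 / t_half_s


if __name__ == "__main__":
    # Synthetic recovery curve (assumption): exponential approach to a plateau
    times = [0.1 * k for k in range(600)]
    curve = [0.2 + 0.6 * (1.0 - math.exp(-t / 5.0)) for t in times]
    t_half = half_recovery_time(times, curve)
    print("t_half (s):", t_half)
    print("D (um^2/s):", diffusion_coefficient_um2_per_s(1.0, t_half))
```

In practice, the recovery curve should first be corrected for acquisition bleaching and background, and an immobile fraction shows up as a plateau below the pre-bleach intensity.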
4 Fluorescence Labels
4.1 Fluorescence Probes
The key player in fluorescence microscopy is the fluorescence dye
under investigation. A perfect dye system has to meet some requirements regarding its photophysical performance and its suitability
for the distinct biological problem. Firstly, the dye should be highly
photostable, meaning that it will not undergo photobleaching during the time of observation. Moreover, an outstanding brightness
is desirable to achieve good image contrasts. The brightness is
defined as the product of extinction coefficient at a given excitation
wavelength and the fluorescence quantum yield [2]. The first
parameter describes the molecule’s ability to absorb the excitation
light and is connected to Lambert–Beer’s law according to
$$\varepsilon(\lambda) = \frac{E(\lambda)}{c \cdot d}$$
with E as the absorbance, c as the concentration of the
dye in solution, and d as the cuvette thickness. The fluorescence
quantum efficiency defines the probability of a dye to decay radiatively after excitation, and is expressed as the ratio of the radiative
decay rate to the sum of the radiative and non-radiative decay rates according to
$$\Phi = \frac{\Gamma_{rad}}{\Gamma_{rad} + \Gamma_{nonrad}}$$
with values 0 < Φ < 1.
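The definitions above can be combined into a short calculation of the dye brightness. The Python sketch below uses Lambert–Beer’s law in the form ε = E/(c·d) and the quantum yield as Γrad/(Γrad + Γnonrad); all numerical values and function names are hypothetical and serve only to illustrate the arithmetic.

```python
def extinction_coefficient(absorbance, concentration_molar, path_length_cm):
    """Lambert-Beer: epsilon = E / (c * d), in M^-1 cm^-1."""
    if concentration_molar <= 0 or path_length_cm <= 0:
        raise ValueError("concentration and path length must be positive")
    return absorbance / (concentration_molar * path_length_cm)


def quantum_yield(rate_rad, rate_nonrad):
    """Phi = Gamma_rad / (Gamma_rad + Gamma_nonrad)."""
    if rate_rad < 0 or rate_nonrad < 0 or rate_rad + rate_nonrad == 0:
        raise ValueError("decay rates must be non-negative and not both zero")
    return rate_rad / (rate_rad + rate_nonrad)


def brightness(epsilon, phi):
    """Brightness = extinction coefficient times fluorescence quantum yield."""
    return epsilon * phi


if __name__ == "__main__":
    # Hypothetical dye: absorbance 0.5 at 10 micromolar in a 1 cm cuvette
    eps = extinction_coefficient(0.5, 1e-5, 1.0)
    # Hypothetical decay rates in 1/ns
    phi = quantum_yield(rate_rad=0.3, rate_nonrad=0.1)
    print("epsilon:", eps, "phi:", phi, "brightness:", brightness(eps, phi))
```

Comparing the brightness of candidate dyes at the available excitation wavelength is a simple first criterion when selecting a label.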
Besides these photophysical requirements, the spectral properties of the fluorescence dye have to fit the biological question. This
means that the emission should not overlap with some intrinsic
background luminescence. Especially in plant systems, this is an
issue and complicates the use of dye systems emitting in the far red
spectral regions as they would overlap with the strong fluorescence
of the plant chloroplasts. Moreover, the excitation wavelength of
the dye should be in a spectral region where there is ideally no or
little absorption of the cellular components. This is important
because any absorbance can lead to unspecific emission and gives rise to
a strong background signal. In addition, light absorption especially
in the near-ultraviolet region can induce physiological effects such
as DNA degradation. In Arabidopsis, also the photoreceptors of the
plant have to be considered. Here, it is not always possible to find a
fluorescence marker which does not interfere with the activity of
the photoreceptors. However, if an external activation is not critical
for the distinct studies or occurs on a timescale significantly longer
than the microscopic study, these influences can be neglected.
Another requirement concerns the specificity of the fluorescence
marker. In contrast to fluorescence-based techniques such as cell
sorting, in fluorescence microscopy it is crucial to exclusively mark
a desired cellular compartment or even a distinct type of protein.
To meet these requirements, industry offers a variety of
high-performance fluorescence markers. Two prominent suppliers are
Invitrogen/Molecular Probes, which sells, among others, the
well-known Alexa dyes, and Atto-tec, which provides the so-called
Atto dyes, which are outstanding in brightness and
photostability. Despite their well-performing photophysical properties, these synthetic dyes suffer from the fact that they have to be
inserted externally into the cell. While this is relatively feasible for
mammalian cells, it is an issue for plant cells due to the cell wall,
which has to be penetrated. The use of protoplasts, which lack the
cell wall, is an accepted way to circumvent this problem. However,
precise localization studies are no longer possible. A further problem is the specificity of these synthetic dyes. One approach uses very
specialized markers such as the mitotrackers, which exclusively mark
the mitochondria. To this end, an oxidation from a nonfluorescent
to a fluorescent form is achieved with a thiol-specific binding of the
mitotracker in the mitochondria. Other approaches for selective
labeling use specific antibodies, which are covalently bound to the
marker fluorophores. This way, distinct well-defined proteins bind
to the antibody and specific protein labeling is feasible.
Despite these promising developments in external fluorescence
staining, those techniques have two major intrinsic limitations.
The first one concerns cell toxicity. This is an issue for most of the
synthetic fluorescence dyes which are composed of expanded conjugated aromatic systems inherent to their functional principle.
Moreover, the use of specific antibodies can drastically influence
the functionality of the labeled proteins. On the one hand, this is
due to the size of the antibodies, which is frequently comparable to
that of the labeled protein or even larger. On the other hand, specific
binding of an antibody might block functional binding sites of the
protein, thus manipulating biological processes such as signal
transduction.
These limitations are largely overcome by the family of AFPs,
which literally have revolutionized fluorescence cell biology within
the past 15 years [44–48]. In contrast to conventional fluorescent
label dyes, AFPs are peptides, which intrinsically contain a chromophoric unit. Accordingly, the AFP genes can be fused to the gene
of interest by molecular techniques. After transient or stable transformation, the corresponding fusion protein is expressed in the
Arabidopsis cells. Depending on the presence of the appropriate
target sequence, the AFP fusion proteins can be directed to different subcellular compartments. While the first AFP, the green fluorescent protein (GFP), was limited in its spectral and functional
properties, site-directed point mutations led to significantly
improved and modified spectroscopic properties of GFP. One
prominent example for this work is the creation of enhanced GFP
(eGFP), a variant of the wild-type GFP (mutation S65T) with
improved photostability and higher brightness due to increased
extinction coefficient and fluorescence quantum yield. Almost any
recent work which uses GFP as a fluorescence tag uses this enhanced
form, even if not explicitly stated. The group of the Nobel laureate
Roger Tsien also extended the spectral range of the AFPs, now
covering the complete visible regime, ranging from the deep blue
(blue fluorescent protein, eBFP, λem = 440 nm) to the far red
(mPlum, λem = 648 nm). Thus, multicolor in vivo fluorescence
labeling is possible [49].
All AFPs known to date are relatively small proteins composed
of about 250 amino acids with a molecular weight around 25 kDa
(GFP: 238 amino acids, MW = 26.9 kDa). The tertiary structure of
the AFPs is a barrel composed of 11 β-strands, which wind helically
around a central C∞ axis of symmetry. This
β-barrel is capped by an α-helical structure, sealing the interior
of the barrel against penetration by larger molecules or ions. In the
wild-type form, the chromophore of the AFPs is composed of
three amino acids (Ser–Tyr–Gly), which protrude into the interior
of the protein shell. After expression of the protein, these amino
acids undergo a maturation process involving a cyclization, dehydration, and oxidation. Interestingly, approaches to synthesize the
isolated chromophores lacking the protein shell resulted in nonfluorescent compounds, indicating that the protein shell significantly impacts the optical properties of the AFPs by stabilizing the
three-dimensional structure of the chromophoric unit. Fine-tuning
the emission properties of these proteins involves modifications
of both the chromophoric unit and the surrounding protein
shell. In one prominent modification of the chromophore itself,
tyrosine 66 is replaced by histidine, which shortens the
delocalized π-system, causing the hypsochromic shift of the blue
variant of GFP, (e)BFP. Conversely, a significant red shift of the
fluorescence emission, which closes the gap between GFP and
DsRed-type AFPs, can be achieved by replacing threonine with
tyrosine at position 203 (T203Y). In the folded protein, the
π-system of this amino acid will arrange in a way that approaches
the chromophoric unit without the formation of a covalent bond.
This way, a π-stack is formed, which lowers the energy gap between
S0 and S1 state, causing the emission maximum to shift from 505 to
530 nm in the yellow fluorescent protein (YFP) [50]. A different
class of autofluorescent proteins, DsRed, which was found in the
reef coral Discosoma sp., exhibits red fluorescence. Wild-type DsRed
has the intrinsic property to form tetramers, which cannot be separated by physical or chemical means to obtain functional monomeric subunits. The formation of these tetramers hindered an
extended use of DsRed as an in vivo label despite its outstanding
spectral properties with emission showing less cross talk with autofluorescence background. This only changed with the introduction
of the family of monomeric red fluorescent proteins (mRFPs:
mCherry, mPlum, mStrawberry, etc., often referred to as mFruits),
which are point mutants of wild-type DsRed in which the protein
shell is modified so that it loses its tendency to aggregate but retains its fluorescence [49]. As a result of these efforts,
a variety of AFPs is available today covering the complete spectral
region from blue to red. A remarkable class of AFPs comprises
photoswitchable proteins such as DRONPA, which exhibits intense
green fluorescence when excited with λexc = 488 nm. Increasing
laser power causes the protein to switch to a nonfluorescent dark
state. In contrast to photobleaching, this dark state is formed
reversibly and fluorescence can be recovered by irradiation at
400 nm. As this switching between on- and off-states requires
about two orders of magnitude lower irradiation intensities compared to photobleaching, DRONPA is highly suited for FRAP
studies with less risk of cell damage. Moreover, due to the reversibility of the switching process even complex kinetic studies in a
single cell are feasible [51]. Note 2 provides an overview of
frequently used fluorescent dyes together with appropriate filter sets.
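The red shift quoted above for the T203Y mutation (emission maximum moving from 505 to 530 nm) can be expressed as a change in photon energy, which is the quantity the narrowing S0–S1 gap refers to. The following is only a small arithmetic sketch using E = hc/λ; the function names are illustrative and not part of the original protocol.

```python
# Convert emission wavelengths to photon energies via E = h * c / lambda.
PLANCK_J_S = 6.62607015e-34        # Planck constant in J s
LIGHT_SPEED_M_S = 2.99792458e8     # speed of light in m/s
JOULE_PER_EV = 1.602176634e-19     # joules per electron volt


def photon_energy_ev(wavelength_nm: float) -> float:
    """Photon energy in eV for a given vacuum wavelength in nm."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / JOULE_PER_EV


def energy_shift_ev(lambda_from_nm: float, lambda_to_nm: float) -> float:
    """Change in photon energy when the emission maximum shifts.

    A negative value means a red shift (lower photon energy).
    """
    return photon_energy_ev(lambda_to_nm) - photon_energy_ev(lambda_from_nm)


if __name__ == "__main__":
    # Wavelengths taken from the text: 505 nm -> 530 nm (YFP, T203Y).
    shift = energy_shift_ev(505.0, 530.0)
    print(f"505 nm: {photon_energy_ev(505.0):.3f} eV")
    print(f"530 nm: {photon_energy_ev(530.0):.3f} eV")
    print(f"shift : {shift:.3f} eV")
```

The 25 nm shift thus corresponds to a reduction of the emitted photon energy by roughly 0.12 eV.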
The BiFC approach [39] to study protein–protein interactions
utilizes the unique property of AFPs, where fluorescence is only
observed for the chromophore enclosed in its well-defined protein
shell environment. Accordingly, if only one part of the protein is
expressed, this fragment will be non-fluorescent. This
can be exploited by splitting the AFP gene into two subsequences. These
subsequences are then fused to the genes encoding the proteins of
interest. If these fusion proteins interact, the two AFP fragments
come into close proximity and can eventually orient such that they
reconstitute the complete protein structure, forming a functional
AFP. BiFC, which was initially demonstrated for YFP and is often
referred to as split-YFP, is a very elegant technique to investigate
protein–protein interaction, not only as it is very specific but also as
fragments of different AFP mutants may complement, forming
BiFC-products with distinct spectral emission (multicolor BiFC)
[52]. It is, therefore, possible to investigate multiple competitive
interactions at the same time. Moreover, the BiFC technique is very
sensitive, as it works with zero background, and spectral cross talk,
often an issue in FRET studies, cannot occur.
Despite these fascinating possibilities offered by BiFC, there
are also some restrictions. The most important one is the non-reversibility
of the protein complementation, rendering this
method unsuited for dynamic investigations in which transient protein–protein interactions are to be monitored over time. Moreover,
BiFC can give rise to false positives if the
affinity of the two fragments is high enough to form a functional
AFP even if there is no specific interaction between the two fusion
proteins.
4.2 AFP-FRET Pairs
The most prominent AFP-FRET pair is formed from the cyan fluorescent protein (CFP) and the YFP. While the properties concerning spectral overlap and spectral cross talk of this system are rather
ideal, other photophysical parameters restrict its applicability. First,
the optimal excitation wavelength for the donor CFP is 438 nm,
which evokes a strong autofluorescence background. Moreover,
Fig. 3 Triple FRET arrangement composed of TagBFP, TagGFP, and TagRFP
CFP is not very bright and has a rather low photostability. These
restrictions call for alternative AFP-FRET pairs, which are now
available thanks to the broad spectral variety of AFPs. We, therefore, suggest using a red-shifted FRET pair if the experimental
design allows it. While a combination of GFP and mRFP
already gives results superior to the CFP–YFP combination, new,
brighter, and more photostable constructs such as the Tag-family are
available. Here, a blue-emitting variant with fairly good spectroscopic properties can also be used, so that triple FRET studies can be
carried out. These studies extend conventional FRET by a third
component, thus generating an energy migration cascade. In triple
FRET, the excitation energy of a donor chromophore is initially
transferred non-radiatively to a first acceptor. This acceptor, in turn,
can act as a second donor and transfer its excitation energy to a third
chromophore, which acts as the final acceptor. This way, complex interaction studies with up to three participating proteins can be carried
out. In Fig. 3, a triple FRET arrangement composed of AFPs from
the Tag-family is depicted (TagBFP → TagGFP → TagRFP).
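The energy migration cascade described above can be illustrated numerically. The sketch below uses the standard Förster distance dependence, E = 1/(1 + (r/R0)^6), for each step and multiplies the two step efficiencies. It assumes purely sequential transfer (no direct transfer from the first donor to the last acceptor and no back-transfer), and the distances and Förster radii in the example are hypothetical placeholders, not measured values for the Tag-family proteins.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Forster transfer efficiency for one donor-acceptor pair."""
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


def cascade_efficiency(
    r12_nm: float, r0_12_nm: float, r23_nm: float, r0_23_nm: float
) -> float:
    """Probability that an excitation of chromophore 1 ends up on chromophore 3.

    Simplifying assumptions (not from the source): transfer occurs only
    sequentially 1 -> 2 -> 3, without direct 1 -> 3 transfer or back-transfer.
    """
    step_1_to_2 = fret_efficiency(r12_nm, r0_12_nm)
    step_2_to_3 = fret_efficiency(r23_nm, r0_23_nm)
    return step_1_to_2 * step_2_to_3


if __name__ == "__main__":
    # Hypothetical geometry: both pairs separated by exactly their Forster radius.
    print(fret_efficiency(5.0, 5.0))               # single step at r = R0
    print(cascade_efficiency(5.0, 5.0, 5.0, 5.0))  # two such steps in series
```

Because the step efficiencies multiply, the signal of the last acceptor drops quickly when either interaction is weak, which is one reason why bright and photostable chromophores are desirable for triple FRET.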
When the use of a cyan-emitting AFP is indispensable, we recommend using Cerulean rather than CFP [53]. Spectrally almost
identical to CFP, Cerulean offers higher brightness and photostability, although it still does not reach the levels of eGFP or YFP.
Moreover, in contrast to CFP, Cerulean exhibits a monoexponential
fluorescence decay. This is crucial for FRET-FLIM studies, as the
intrinsic biexponential decay of CFP complicates data evaluation.
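Why a biexponential donor complicates FRET-FLIM evaluation can be seen from a short calculation. With a monoexponential donor, the FRET efficiency follows directly from two lifetimes, E = 1 − τDA/τD. With a biexponential donor, an amplitude-weighted mean lifetime has to be formed first, so two amplitudes and two lifetimes must be fitted reliably for each measurement. The sketch below illustrates this; all numerical values are hypothetical and serve only as an example.

```python
from typing import Sequence


def mean_lifetime_ns(amplitudes: Sequence[float], lifetimes_ns: Sequence[float]) -> float:
    """Amplitude-weighted mean lifetime of a multi-exponential decay."""
    if len(amplitudes) != len(lifetimes_ns) or not amplitudes:
        raise ValueError("need one amplitude per lifetime component")
    weighted_sum = sum(a * tau for a, tau in zip(amplitudes, lifetimes_ns))
    return weighted_sum / sum(amplitudes)


def fret_efficiency_from_lifetimes(tau_donor_acceptor_ns: float, tau_donor_ns: float) -> float:
    """FRET efficiency from the donor lifetime with and without acceptor."""
    return 1.0 - tau_donor_acceptor_ns / tau_donor_ns


if __name__ == "__main__":
    # Monoexponential donor (hypothetical lifetimes): two numbers suffice.
    print(fret_efficiency_from_lifetimes(2.1, 3.0))

    # Biexponential donor (hypothetical components): average first, then compare.
    tau_d = mean_lifetime_ns([0.6, 0.4], [3.6, 1.2])
    tau_da = mean_lifetime_ns([0.5, 0.5], [2.8, 0.9])
    print(tau_d, tau_da, fret_efficiency_from_lifetimes(tau_da, tau_d))
```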
4.3 AFPs as Local Biosensors
The particular properties of the AFPs make them well suited as
molecular in vivo sensors. Due to the protein shell, which shields
the chromophore from the environment, mainly protons can
directly interact with the fluorophore. As the chromophore
equilibrates between a protonated and a deprotonated form, this equilibrium
can be influenced by the local proton concentration. This makes
AFPs in general and GFP in particular very sensitive local pH
sensors [54]. Among other methods, the protonation state can be read out
by FLT measurements and, hence, a FLIM image can be translated
into a pH map. The ability of protons to penetrate the AFP barrel
structure also depends on the protein the AFP is fused to. For
example, a BRI1-eGFP construct is sensitive to changes in the local
membrane potential due to a specific brassinolide-activated increase
of the P-ATPase activity [55].
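The protonation equilibrium underlying this pH sensitivity can be sketched with the Henderson–Hasselbalch relation. The example below computes the fraction of deprotonated chromophore at a given pH and inverts the relation to recover the pH from a measured fraction. The pKa value used is an assumed placeholder, and the step that links a measured FLT to the deprotonated fraction requires an experimental calibration that is not shown here.

```python
import math


def deprotonated_fraction(ph: float, pka: float) -> float:
    """Fraction of chromophores in the deprotonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def ph_from_fraction(fraction: float, pka: float) -> float:
    """Invert the relation: pH from the deprotonated fraction."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return pka + math.log10(fraction / (1.0 - fraction))


if __name__ == "__main__":
    assumed_pka = 6.0  # placeholder value; determine it for the construct in use
    for ph in (5.0, 6.0, 7.0, 8.0):
        print(ph, round(deprotonated_fraction(ph, assumed_pka), 3))
```

The sensor responds most strongly within about one pH unit of the pKa; far from the pKa the fraction saturates and the readout becomes insensitive.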
Another approach uses directed mutations to induce sensitivity
to distinct external parameters. In one prominent example, two
cysteines have been introduced into GFP at positions 147 and 204
so that they are adjacent to each other (roGFP) [56]. Depending on the
local redox potential, a disulfide bond can reversibly form between
the two thiol groups of the cysteines. Formation of this bond alters
the protonation equilibrium of the chromophore. Thus, roGFP
acts as an optical sensor to probe the local redox potential. A further development, aimed at a sensor that is exclusively sensitive
to changes in the local H2O2 concentration, led to the HyPer
probe, in which a circularly permuted yellow fluorescent protein
(cpYFP) was inserted into the regulatory domain of the prokaryotic H2O2-sensing protein OxyR [57].
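How the oxidation state of such a disulfide-based sensor relates to a redox potential can be sketched with the Nernst equation for a two-electron couple. In the example below, the degree of oxidation of the sensor is assumed to be known already (in practice it is derived from calibrated fluorescence readings, which is not shown), and the midpoint potential is an assumed placeholder that has to be taken from the characterization of the specific roGFP variant.

```python
import math

GAS_CONSTANT = 8.314462618    # J mol^-1 K^-1
FARADAY_CONSTANT = 96485.33212  # C mol^-1


def redox_potential_mv(
    oxidized_fraction: float,
    midpoint_mv: float,
    temperature_k: float = 298.15,
    electrons: int = 2,
) -> float:
    """Nernst potential (mV) of a sensor with the given oxidized fraction.

    The disulfide/dithiol couple exchanges two electrons, hence electrons=2.
    """
    if not 0.0 < oxidized_fraction < 1.0:
        raise ValueError("oxidized_fraction must lie strictly between 0 and 1")
    slope_mv = 1000.0 * GAS_CONSTANT * temperature_k / (electrons * FARADAY_CONSTANT)
    reduced_over_oxidized = (1.0 - oxidized_fraction) / oxidized_fraction
    return midpoint_mv - slope_mv * math.log(reduced_over_oxidized)


if __name__ == "__main__":
    assumed_midpoint_mv = -280.0  # placeholder, not a value from this chapter
    for fraction in (0.1, 0.5, 0.9):
        print(fraction, round(redox_potential_mv(fraction, assumed_midpoint_mv), 1))
```

As with the pH sensor, the useful dynamic range is centered on the midpoint potential: between 10 % and 90 % oxidation the potential changes by less than 60 mV at room temperature.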
AFPs can also be used to sense local ion concentrations. The
most prominent example is the so-called cameleon, a
protein construct which links the FRET pair CFP and YFP via
calmodulin [58, 59]. Calmodulin undergoes a conformational
change in the presence of Ca2+ ions, which forces the AFPs to
approach each other. This way, the FRET efficiency varies in a
Ca2+ concentration-dependent manner and the cameleon construct
can be used for local and highly sensitive Ca2+ probing.
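A commonly used way to translate the FRET readout of such a ratiometric sensor into a Ca2+ concentration is a Hill-type calibration of the acceptor/donor emission ratio R between its Ca2+-free (Rmin) and Ca2+-saturated (Rmax) limits. The following is a minimal sketch of that conversion; the calibration constants in the example are hypothetical and must be determined for the actual construct and microscope settings.

```python
def calcium_from_ratio(
    ratio: float,
    ratio_min: float,
    ratio_max: float,
    apparent_kd_um: float,
    hill_coefficient: float = 1.0,
) -> float:
    """Ca2+ concentration (in uM) from an acceptor/donor emission ratio.

    Uses [Ca2+] = Kd' * ((R - Rmin) / (Rmax - R)) ** (1 / n), where Kd' is the
    apparent dissociation constant and n the Hill coefficient of the sensor.
    """
    if not ratio_min < ratio < ratio_max:
        raise ValueError("ratio must lie strictly between ratio_min and ratio_max")
    bound_over_free = (ratio - ratio_min) / (ratio_max - ratio)
    return apparent_kd_um * bound_over_free ** (1.0 / hill_coefficient)


if __name__ == "__main__":
    # Hypothetical calibration: Rmin = 1.0, Rmax = 2.0, Kd' = 0.5 uM, n = 1.
    for measured_ratio in (1.2, 1.5, 1.8):
        print(measured_ratio, calcium_from_ratio(measured_ratio, 1.0, 2.0, 0.5))
```

Ratios close to Rmin or Rmax give poorly defined concentrations, so a sensor whose apparent Kd matches the expected Ca2+ range should be chosen.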
5 Conclusions
This review provides an overview of current fluorescence microscopy
techniques used in Arabidopsis research. While in classical
applications merely the intensity of the signal was used as a source
of information, state-of-the-art applications use specific spectroscopic properties of fluorescent labels to increase the information content of every single measurement. The major benefit of
these techniques stems from the dependence of the spectral properties of a fluorophore on
its local nano-environment. Thus, fluorescence microscopy is a
valuable tool for biochemical imaging with subcellular resolution,
helping researchers to further understand biological processes on a
molecular scale. Future developments will certainly proceed further
in this direction. One important step towards this highly sensitive
and noninvasive technology will be further developments in
super-resolution microscopy beyond the diffraction limit. Most
likely, in the next few years, these techniques will find their way into
plant research and offer fascinating insights, possibly even with
molecular spatial resolution. If the super-resolution techniques are
combined with local spectral readout modalities such as the FLT or
the fluorescence emission or excitation spectrum, optical microscopy will further emerge as an analytical technique with the highest
potential. However, the amount of data to be recorded, interpreted,
and correlated will also increase tremendously and, hence, highly
sophisticated mathematical techniques for data evaluation and multivariate data analysis will play a major role. This way, the disciplines
of biology, chemistry, physics, mathematics, and informatics will
merge further, and deep and so far unforeseeable insight
into cellular processes will be gained.
6 Notes
1. Triple FRET excitation schema: While triple FRET is a very
powerful tool to single out complex protein interactions, it
requires a sound experimental design, which incorporates
different alternating excitation sources to first establish the presence
of the individual proteins and then to determine their interaction. Such an excitation schema, which requires either pulsed
laser sources or fast switchable continuous wave (cw) lasers, is
depicted in Table 1.
2. Excitation wavelengths: Choosing the appropriate excitation
wavelength and filter sets is crucial for fluorescence microscopy. Table 2 lists optimal and acceptable excitation wavelengths and suitable emission filters for common fluorescent
labels. For FRET applications, the lowest excitation wavelength should be chosen to avoid direct acceptor
excitation. As for filters, at least the donor
requires a band-pass filter to block acceptor emission. For the acceptor, or the
last dye in a triple FRET energy migration chain, respectively,
a long-pass filter will work fine.
3. FLIM excitation rate: For FLIM studies, distinct settings for
the pulsed excitation source are required. First, an appropriate repetition rate should be chosen. For common fluorescent dyes, repetition rates between 10 and 40 MHz are well
suited. 80 MHz might work as well; however, the subsequent pulse might arrive before the intensity has completely
decayed. Repetition rates lower than 10 MHz should be
avoided because, due to the long time span between two pulses, significant readout and thermal noise is accumulated.
4. FLIM excitation power: The excitation power in a FLIM experiment should be set to values where 1 % of the excitation
pulses cause a photon detection event on the detector. Hence,
detection rates must not be higher than 100 kHz (10 MHz
excitation rate) to 800 kHz (80 MHz excitation rate). At
higher detection rates, the probability increases that a second photon is
emitted by the sample while the detector is still in its dead time,
during which photons cannot be counted. This, in turn,
leads to an overestimation of early photon arrival times, and the
resulting FLTs are too short. This effect is often referred to as the
“pile-up” effect and should carefully be avoided.

Table 1
Excitation schema for a triple FRET study. Using three independent excitation wavelengths,
the presence and the interaction between three proteins can be deduced

Excitation   Emission detected   Interpretation
λ3           D3                  D3 present
λ3           none                No D3 present
λ2           D2                  D2 present, no D2 → D3 interaction or D3 not present
λ2           D3                  D2 present, D2 → D3 interaction
λ2           none                No D2 present
λ1           D1                  D1 present, no D1 → D2 interaction or no D2 present
λ1           D2                  D1, D2 present, D1 → D2 interaction, no D2 → D3 interaction or D3 not present
λ1           D3                  D1, D2, D3 present, D1 → D2 → D3 interaction
λ1           none                No D1 present

Table 2
Optimal (green light) and acceptable (orange light) excitation wavelengths and emission filters
for common in vivo labeling dyes. LP long-pass, BP band-pass. BP numbering: AAA/BB:
AAA = the central wavelength; BB = spectral width

Dye                                          Optimal λexc (nm)   Acceptable λexc (nm)   Emission filter
BFP, DAPI                                    360                 405                    LP420
CFP, eCFP, Cerulean                          438                 457                    LP460, BP480/40
GFP, FITC (fluorescein), Alexa488, Atto488   488                 457                    LP500, BP525/50
YFP, eYFP, Venus, Citrine                    514.5               488                    LP520, BP540/35
mRFP                                         550                 532, 514.5             LP600, BP610/20
mCherry                                      580                 532                    LP600, BP640/80
5. FLIM channel width: Modern TCSPC electronics allow for
channel widths as small as 1 ps. Usually, such short binning
times are not required, and time intervals between 32 and
~200 ps provide fitting results comparable to those
obtained at high time resolution. The larger time intervals,
in turn, benefit from a faster build-up of the histogram, as
photons with similar arrival times are binned together, which leads
to significantly shorter data acquisition times. We recommend
the highest time resolution only for measurements in which ultrafast dynamics have to be monitored. This is the case, e.g., when
recording the acceptor rise time in quantitative FRET studies.
6. Instrument response function (IRF) in FLIM studies: In TCSPC
data analysis, the laser pulse is regarded as a perfect delta function. The IRF accounts for the deviations from this delta function
inherent in any experimental configuration: it is
broadened and asymmetric compared to the delta function due
to the finite width of the laser pulse and electronically
caused time delays. To obtain quantitative data, the IRF must
be known so that it can be convolved with the fit function (Eq. 2). To
record an IRF, one may use the back reflection of the laser beam at
a coverslip without any emission filter. The blocking efficiency
of the dichroic beam splitter for back-reflected light is not sufficient to block all light. We, therefore, recommend using grey filters to reduce the laser intensity when recording the IRF. Some
pulsed laser diodes provide a mechanical power adjustment.
This option should only be used in exceptional cases, as the
pulse shape can vary with the output power. A different way to
record an IRF is to use luminescence that proceeds on a very fast timescale. For example, commercially available gold nanoparticles at a concentration in the micromolar range exhibit a strong red-shifted luminescence due to excited surface plasmons, which emit quasi-instantaneously after the excitation, so that the additional time
jitter can be neglected. Since the fit quality strongly depends
on the IRF and since the IRF is highly sensitive to any changes
in the experimental setup (especially excitation repetition rates,
but also filters or changed detection modalities), we recommend recording an IRF at least once a day.
7. Objectives: For optimal spatial resolution, high NA objectives
are required. Optimal results are obtained using oil immersion
objectives with a magnification of 60× to 100×. These objectives,
however, require observation of the sample through a
microscopy coverslip (typical thickness 0.18 mm). If this limitation is not acceptable for a particular investigation, we suggest
the use of air objectives with 100× magnification. Note that
changing the objective NA and magnification requires
re-dimensioning of the image pinhole.
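The FLIM acquisition settings recommended in Notes 3–5 follow from simple arithmetic, which can be sketched as below. The functions compute the maximum detection rate allowed by the 1 % rule, the time window between two excitation pulses, the number of TCSPC channels that fit into this window, and the fraction of fluorescence still present when the next pulse arrives. The fluorescence lifetime used in the example is a hypothetical value; the function names are illustrative only.

```python
import math


def max_detection_rate_hz(repetition_rate_hz: float, pulses_per_photon: int = 100) -> float:
    """Highest count rate that keeps detection events at 1 per 100 pulses (1 %)."""
    return repetition_rate_hz / pulses_per_photon


def pulse_period_ns(repetition_rate_hz: float) -> float:
    """Time between two consecutive excitation pulses in ns."""
    return 1e9 / repetition_rate_hz


def channels_per_period(repetition_rate_hz: float, channel_width_ps: float) -> int:
    """Number of complete TCSPC channels that fit between two pulses."""
    return int(pulse_period_ns(repetition_rate_hz) * 1000.0 // channel_width_ps)


def residual_fraction(repetition_rate_hz: float, lifetime_ns: float) -> float:
    """Fraction of a monoexponential decay left when the next pulse arrives."""
    return math.exp(-pulse_period_ns(repetition_rate_hz) / lifetime_ns)


if __name__ == "__main__":
    assumed_lifetime_ns = 3.0  # hypothetical dye lifetime, for illustration only
    for rate_hz in (10e6, 40e6, 80e6):
        print(
            f"{rate_hz / 1e6:.0f} MHz: "
            f"max rate {max_detection_rate_hz(rate_hz) / 1e3:.0f} kHz, "
            f"period {pulse_period_ns(rate_hz):.1f} ns, "
            f"{channels_per_period(rate_hz, 32.0)} channels of 32 ps, "
            f"residual {residual_fraction(rate_hz, assumed_lifetime_ns):.4f}"
        )
```

For the assumed 3 ns lifetime, a noticeable fraction of the decay remains at 80 MHz, which illustrates why the subsequent pulse may arrive before the intensity has completely decayed.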
References
1. Stephens DJ, Allan VJ (2003) Light microscopy techniques for live cell imaging. Science
300:82–86
2. Lakowicz JR (2006) Principles of fluorescence
spectroscopy. Kluwer, New York
3. Schleifenbaum F, Blum C, Subramaniam V,
Meixner AJ (2009) Single molecule spectral
dynamics at room temperature. Mol Phys
107:1923–1942
4. van Munster EB, Gadella TW (2005)
Fluorescence lifetime imaging microscopy
(FLIM). Adv Biochem Eng Biotechnol
95:143–175
5. Ntziachristos V (2006) Fluorescence molecular
imaging. Annu Rev Biomed Eng 8:1–33
6. Pepperkok R, Ellenberg J (2006) High-throughput
fluorescence microscopy for systems
biology. Nat Rev Mol Cell Biol 7:690–696
7. Suzuki T, Matsuzaki T, Hagiwara H, Aoki T,
Takata K (2007) Recent advances in fluorescent labeling techniques for fluorescence
microscopy. Acta Histochem Cytochem 40:
131–137
8. Valeur B (2002) Molecular fluorescence: principles and applications. Wiley-VCH, Weinheim
9. Shotton DM (1989) Confocal scanning optical
microscopy and its applications for biological
specimens. J Cell Sci 97:175–206
10. Abbe E (1904) Abhandlungen über die Theorie
des Mikroskops. Verlag G. Fischer, Jena
11. Axelrod D, Gerard M, Ian P (2003) Total
internal reflection fluorescence microscopy in
cell biology. In: Methods in enzymology.
Academic Press 36:1–33. Biophotonics, Part B,
Elsevier (Amsterdam). Editors: Gerard Marriot
and Jan Parker
12. Betzig E et al (2006) Imaging intracellular fluorescent proteins at nanometer resolution.
Science 313:1642–1645
13. Rust M, Bates M, Zhuang X (2006) Subdiffraction-limit imaging by stochastic optical
reconstruction microscopy (STORM). Nat
Methods 3:793–796
14. Heilemann M et al (2008) Subdiffraction-resolution
fluorescence imaging with conventional fluorescent probes. Angew Chem Int Ed
47:6172–6176
15. Lippincott-Schwartz J, Patterson GH (2009)
Photoactivatable fluorescent proteins for
diffraction-limited and super-resolution imaging. Trends Cell Biol 19:555–565
16. Hell S (2004) Strategy for far-field optical