Characterization of Csy4 motifs through the construction of RNAregulated genetic feedback and RNA stability systems. Gin (officially Raúl) García Martín Supervisor: Ebbe Sloth Andersen Aarhus University December 2020 1 Abstract Csy4 is an endoribonuclease of the Cas family that cleaves RNA and generates mature CRISPR-derived RNAs. Its applications in RNA bionanotechnology have yet to be fully explored. This project aims to characterize the binding and cleaving mechanism of the Csy4 enzyme, as well as to find possible applications in synthetic biology, through two separate experiments. First, the stability of an RNA origami-based system which contains fluorescent aptamers and Csy4 cleaving motifs will be analyzed in the presence of an active Csy4. What is more, this rationally designed RNA assembly could incorporate new elements such as aptamers or riboswitches, allowing for a sequential control of reactions. A possible application for this is a kill-switch which could, for example, prevent genetically modified microorganisms from propagating outside of the lab. Second, the binding affinity of different Csy4 motifs to a deactivated Csy4 will be tested, along with the design of a negative feedback loop. This experiment had to be cancelled, due to the corona restrictions set in December. This loop’s behavior was expected to be that of an oscillator; therefore, it could be used as an element for designing and regulating logical circuits. 2 Index Abstract ............................................................................................................................. 1 1 Preface ......................................................................................................................... 3 2 Introduction ................................................................................................................. 4 2.1 The structure of RNA ............................................................................................. 4 2.1.1 RNA secondary structure ................................................................................ 4 2.1.2 RNA tertiary structure ..................................................................................... 6 2.1.3 RNA quaternary structure ................................................................................ 6 2.2 RNA nanotechnology ............................................................................................. 6 2.2.1 RNA synthetic biology .................................................................................... 6 2.2.2 Aptamers .......................................................................................................... 7 2.2.3 Riboswitches.................................................................................................... 7 2.2.4 RNA origami ................................................................................................... 7 2.2.5 CRISPR and the Csy4 endoribonuclease ........................................................ 9 2.3 Objectives ........................................................................................................... 9 3 Methods ....................................................................................................................... 9 3.1 Technical description of methods used ................................................................... 9 3.1.1 PCR.................................................................................................................. 9 3.1.2 Nanodrop ....................................................................................................... 10 3.1.3 Flow cytometry .............................................................................................. 10 3.1.4 Computer-aided design .................................................................................. 11 3.1.5 Chemical synthesis of oligonucleotides ........................................................ 11 3.1.6 gBlocks .......................................................................................................... 11 3.1.7 Golden Gate assembly ................................................................................... 12 3.2 Experimental section ............................................................................................. 12 3.2.1 Kill-switch ..................................................................................................... 12 3.2.2 Negative feedback loop ................................................................................. 16 4 Results and discussion ............................................................................................... 19 4.1 Kill-switch ............................................................................................................ 19 4.2 Negative feedback loop ........................................................................................ 27 5 Conclusion ................................................................................................................. 30 6 References ................................................................................................................. 31 7 Supplementary material ............................................................................................. 33 3 1 Preface This project wouldn’t have been possible without the help of the Andersen lab members. The realization of this project is driven by my passion for Synthetic Biology. It is one of my dreams to be able to truly apply the principles of engineering to biological systems, walking on a landfill of endless opportunities. My wish is that all the potential that nature hasn’t projected could be put to good use, and also give us insight and reflections on the origin and evolution of life, as well as the role we should play on it. Taking this into account, Ebbe Sloth Andersen kindly introduced me to two of his team members, George and Michael. All together, they guided me and helped me choose a topic that would fit my passion, as well as the lab’s research lines. That is how we decided to follow up on one of Michael’s experiments. Michael had been working with the enzyme Csy4, and noticed there were difficulties when trying to measure its binding affinity to several motifs. Then he suggested we could construct a negative feedback loop, where we could at the same time test for different binding affinities. What is more, we decided to build RNA origamis containing Csy4 motifs, and quantify and analyze their digestion when co-expressed with the Csy4. However, the multi-gene assemblies for the negative feedback loop were unsuccessful, and, due to the new corona restrictions, I lacked of the opportunity to redo or reframe the experiment further. Still, I was able to gather valuable data, which will be shown throughout the report. First of all, I would like to thank Ebbe for accepting my Individual Project, welcoming me into his lab and building an inclusive atmosphere, which is something very important to me. I would like to enormously thank George for patiently and correctly guiding me through each step. I want to thank Michael for his disposition and help. Thanks to Cody for letting me use his software. And thanks to Néstor, Bente and the rest of the team for making the lab a cozy place to work in. 4 2 Introduction 2.1 The structure of RNA Ribonucleic acids (RNA) are negatively charged polymers assembled from four different monomers. Each monomer is composed of one of the four standard nucleic acid bases (the pyrimidines uracil and cytosine, and the purines guanine and adenine) attached to a conserved phosphorylated sugar (Westhof & Auffinger, 2000). The primary structure of RNA is the sequence of bases attached to the sugar–phosphate backbone. In salty water, the RNA molecules fold via Watson–Crick base pairing between the bases (A with U, G with C or U) leading to double-stranded helices interrupted by singlestranded regions in internal loops or hairpin loops (Westhof & Auffinger, 2000). The enumeration of the base-paired regions or helices constitutes a description of the secondary structure of RNA. Under appropriate conditions, structured RNA molecules undergo a transition to a 3D fold, where the helices and the unpaired regions are precisely organized in space. This is called the tertiary structure of RNA. This folding process usually depends on the presence of divalent ions like magnesium ions and on the temperature (Westhof & Auffinger, 2000). The final structure level is the quaternary structure, where separate RNA molecules interact with each other or with proteins. 2.1.1 RNA secondary structure RNA secondary structure is the base-pairing interactions within a single ribonucleic acid polymer, or between two polymers. It can be represented as a list of paired bases. The secondary structures of biological RNA is single-stranded and often forms complex and intricate base-pairing interactions. This occurs due to the fact that RNA has an extra hydroxyl group in the ribose sugar, which increases its ability to form hydrogen bonds (Dirks, Lin, Winfree, & Pierce, 2004). Stem-loop structures The secondary structure of nucleic acid molecules can be often decomposed into stems and loops. The stem-loop structure (also often referred to as a "hairpin"), where a basepaired helix ends in a short unpaired loop, is extremely common and is a building block for larger structural motifs such as cloverleaf structures, four-helix junctions like those found in transfer RNA (Svoboda & Di Cara, 2006). Internal loops (a short series of unpaired bases in a longer paired helix) and bulges (regions where one strand of a helix has "extra" inserted bases with no counterparts in the opposite strand) are also frequent. See Figure 1 for an overview of the main elements of the stem-loops. 5 Figure 1. RNA stem-loops. (a) A schematic overview of an RNA stem-loop depicting the important parameters for the role of such a hairpin RNA. Extracted from (Svoboda & Di Cara, 2006). There are many secondary structure elements of functional importance to biological RNAs; some famous examples are the Rho-independent terminator stem-loops and the tRNA cloverleaf. Pseudoknots A pseudoknot is a nucleic acid secondary structure containing at least two stem-loop structures in which half of one stem is intercalated between the two halves of another stem (Figure 2). Pseudoknots fold into knot-shaped three-dimensional conformations but are not true topological knots. Figure 2. An example of the RNA pseudoknot structure, from the human telomerase. Adapted from (J. L. Chen & Greider, 2005). The base pairing in pseudoknots is not well nested; that is, base pairs occur that "overlap" one another in sequence position. This makes the presence of general pseudoknots in nucleic acid sequences impossible to predict by the standard method of dynamic programming, which uses a recursive scoring system to identify paired stems and consequently cannot detect non-nested base pairs. However, limited subclasses of pseudoknots can be predicted using modified dynamic programs (Rivas & Eddy, 1999). 6 2.1.2 RNA tertiary structure Nucleic acid tertiary structure is the three-dimensional shape of a nucleic acid polymer. RNA and DNA molecules are capable of diverse functions ranging from molecular recognition to catalysis. Such functions require a precise three-dimensional tertiary structure. While such structures are diverse and seemingly complex, they are composed of recurring, easily recognizable tertiary structure motifs that serve as molecular building blocks. Double helix The double helix is an important tertiary structure in nucleic acid molecules which is intimately connected with the molecule's secondary structure. A double helix is formed by regions of many consecutive base pairs. The nucleic acid double helix is a spiral polymer, usually right-handed, containing two nucleotide strands which base pair together. A single turn of the helix constitutes about ten nucleotides, and contains a major groove and minor groove, the major groove being wider than the minor groove (Soediono, 1989). 2.1.3 RNA quaternary structure Nucleic acid quaternary structure refers to the interactions between separate nucleic acid molecules, as well as between nucleic acid molecules and proteins. This concept is an analogy to protein quaternary structure, so it hasn’t been highly specified (Jones & FerréD’Amaré, 2015). An example of RNA quaternary structure would be the ribosome, where different RNA monomers interact with each other and with proteins. 2.2 RNA nanotechnology 2.2.1 RNA synthetic biology Synthetic biology is a set of tools for biology that is based on principles such as abstraction, standardization and automated construction in order to modify and create new biological systems. Most synthetic biology efforts so far have focused on engineering gene circuits that rely on protein-DNA interactions. Recent advances in RNA biology and nucleic acid engineering, however, are inspiring the use of RNA components in the construction of synthetic biological systems (Isaacs, Dwyer, & Collins, 2006). Indeed, structural motifs in naturally occurring RNAs and RNPs can be employed as new molecular parts for synthetic biology to facilitate the development of novel devices and systems that modulate cellular functions. For instance, functional RNA molecules such as new artificial RNA aptamers and RNA enzymes (ribozymes) can be used to reprogram existing gene regulatory systems (Saito & Inoue, 2009). Another option is to directly used small RNAs as transcriptional activators by preventing the formation of terminator hairpins (Chappell, Takahashi, & Lucks, 2015). 7 2.2.2 Aptamers Aptamers are oligonucleotide or peptide molecules that bind to a specific target molecule. They are generally created by selecting them from a large random sequence pool (by “Systematic Evolution of Ligands by EXponential enrichment”, or SELEX), but natural aptamers also exist in riboswitches. Aptamers can be used for both basic research and clinical purposes as macromolecular drugs. Aptamers can be combined with ribozymes to self-cleave in the presence of their target molecule (Mallikaratchy, 2017). 2.2.3 Riboswitches Riboswitches are a type of mRNA structure that help regulate gene expression and often bind a diverse set of ligands. Riboswitches determine how gene expression responds to varying concentrations of small molecules in the cell (Figure 3) (Y. Chen & Varani, 2010). This motif has been observed in flavin mononucleotide (FMN), cyclic di-AMP (cdi-AMP), and glycine. Riboswitches are said to show pseudoquaternary structure: several structurally similar regions of a single RNA molecule fold together symmetrically. Because this structure arises from a single molecule and not from multiple separate molecules, it cannot be referred to as true quaternary structure (Jones & Ferré-D’Amaré, 2015). Depending on where a riboswitch binds and how it is arranged, it can, for example, suppress or allow a gene to be expressed (Y. Chen & Varani, 2010). Figure 3. A synthetic theophylline-responsive riboswitch variant adopts a fold that sequesters the ribosome binding site (RBS) in the mRNA transcript. In the presence of the substrate, theophylline, the riboswitch adopts a conformation where the aptamer is bound to theophylline. The RBS is then released and protein translation can take place. Extracted from (Seeliger et al., 2012). 2.2.4 RNA origami Self-folding of an information carrying polymer into a compact particle with defined structure not only is a foundation of biology but it also offers attractive potential as a synthetic strategy. Over the past three decades, nucleic acids have been used to create a variety of complex nanoscale shapes and devices. RNA offers unique application potentials over DNA 8 structures, such as functional diversity, economical production via genetic expression, and amenability for intracellular applications. RNA origami is synthesized by enzymes that fold RNA into particular shapes (in vivo). This method was developed by researchers from Aarhus University and California Institute of Technology (C. W. Geary & Andersen, 2014). Even though many computer algorithms are present to help with RNA folding, none can fully predict the folding of RNA of a singular sequence (C. W. Geary & Andersen, 2014). Computer-aided design Computer-aided design of the RNA origami structure is divided into three main steps; creating the 3D model, writing the 2D structure, and designing the sequence. First, a 3D model is constructed using tertiary motifs from existing databases. This is necessary to ensure the created structure has feasible geometry and strain. Next, the 2D structure is created by describing the strand path and base pairs from the 3D model. This 2D blueprint introduces sequence constraints, creating primary, secondary and tertiary motifs. Last, sequences compatible with the desired structure are designed. For this purpose, design algorithms can be used (Sparvath, Geary, & Andersen, 2017). The double crossover (DX) The RNA origami method uses double-crossovers (DX) to arrange the RNA helices in parallel to each other to form building blocks. While DNA origami requires the construction of DNA molecules from multiple strands, the DX molecules can be made from only one strand for RNA, by adding hairpin motifs to the edges and kissing-loop complexes on the internal helices (see Figure 4). The addition of more DNA molecules on top of one another creates a junction known as the dovetail seam. This dovetail seam has base pairs that cross between adjacent junctions; thus, the structural seam along the junction is sequence-specific (Sparvath et al., 2017). Figure 4. Design of DNA origami. The main RNA strand is shown in black, while the colored strands are kissing loop complexes (4-T loop, see structure inside the black box). The kissing loops can cross the dovetail seem and stabilize it. Extracted and adapted from (Rothemund, 2006). 9 2.2.5 CRISPR and the Csy4 endoribonuclease The CRIPSR-Cas system is a bacteria and archaea adaptive immune systems that relies on small RNAs for defense against invasive genetic elements. CRISPR (clustered regularly interspaced short palindromic repeats) genomic loci are transcribed as long precursor RNAs, which must be enzymatically cleaved to generate mature CRISPRderived RNAs (crRNAs) that serve as guides for foreign nucleic acid targeting and degradation (cleavage by a Cas family enzyme). This processing occurs within the repetitive sequence and is catalyzed by a dedicated Cas6 family member in many CRISPR systems. In Pseudomonas aeruginosa, crRNA biogenesis requires the endoribonuclease Csy4, which recognizes its RNA substrate via sequence- and structure-specific contacts and cleaves them at the 3’ end of a five-basepair stem-loop encoded by the CRISPR repeat. RNA cleavage by Csy4 is divalent metal ion-independent and requires chemical activation of a ribosyl 2’-hydroxyl for internal nucleophilic attack on the phosphodiester bond. (Sternberg, Haurwitz, & Doudna, 2012). 2.3 Objectives 1. Characterization of the binding and cleaving mechanism of the Csy4 endoribonuclease. 2. Analysis of the stability of an RNA origami-based system in the presence of Csy4. The design contains fluorescent aptamers and Csy4 cleaving motifs. The eventual objective is the design of a kill-switch. 3. Characterization of the binding affinity of dCsy4 to different Csy4 RNA motifs for the design of a negative feedback loop, which can regulate logical circuits. 3 Methods 3.1 Technical description of employed methods 3.1.1 PCR Polymerase Chain Reaction (PCR) is an enzymatic process where specific regions of DNA are duplicated repeatedly to yield millions of copies of a particular sequence in a matter of few hours. It involves repeated cycles of heating and cooling of a reaction mixture containing DNA template, DNA polymerase, primers, and nucleotides, being the DNA template the one containing the target sequence. Primers are short chains of nucleotides which locate the specific target DNA of interest and bind to it upon cooling, through complementary base pairing. They act as a starting point for DNA polymerase to create the new complementary strand. DNA polymerase is an enzyme that synthesizes new strands of DNA complementary to the target sequence (Jalali, Zaborowska, & Jalali, 2017). Each cycle of PCR consists of three steps (Jalali, Zaborowska, & Jalali, 2017): • Denaturation. The reaction mixture is heated to over 90˚C to unwind the double helix by breaking apart hydrogen bonds. 10 • • Primer annealing. The reaction mixture is cooled to 45-65˚C to allow for primer annealing. Forward and reverse primers hybridize through complementary base pairing to opposite strands of the DNA. Extension. The reaction mixture is heated to 72˚C toward the optimal temperature or DNA polymerase enzyme activity. The DNA polymerase then binds to the primer-template hybrid complex and assembles a new complementary strand using the free nucleotides in the reaction mixture. After extension, the cycle is restarted. The PCR method used for this project involved using the Q5® High-Fidelity DNA Polymerase (NewEngland BioLabs INC., Ipswich, MA). Next, the Macherey-Nagel Inc.’s PCR clean-up and DNA purification kit and protocol were followed. The PCR clean-up is preceded and complemented by the treatment of PCR product DNA with DpnI (methylation-sensitive restriction enzyme). Plasmids propagated into Escherichia coli are methylated and thus cut by DpnI. This is done to minimize wrong clones by eliminating the DNA not produced by the PCR. 3.1.2 Nanodrop The NanoDrop® microvolume sample retention system (Thermo Scientific NanoDrop Products) is a high-sensitivity fluorescent analysis of limited mass. It functions by combining fiber optic technology and natural surface tension properties to capture and retain minute amounts of sample independent of traditional containment apparatus such as cuvettes or capillaries. Different molecules absorb different wavelengths of light. DNA happens to absorb light at the wavelength of 260nm, proteins at 280nm and other contaminants at around 230. The NanoDrop® shines light of those different wavelengths through the sample and records how much light got absorbed. It then uses a built-in equation to convent absorption at 260nm into a DNA concentration. These ratios give a rough idea of how pure the sample is, and what types of contaminants might be present. 3.1.3 Flow cytometry Flow cytometry is a sophisticated instrument that can measure the optical and fluorescence characteristics of a single cell or another particle in a fluid stream when they pass through a light source. Antibodies or dyes can be used to detect several parameters such as size, granularity and fluorescence (Adan, Alizada, Kiraz, Baran, & Nalbant, 2017). The principles behind flow cytometry are light scattering and fluorescence emission, which occurs when light from the excitation source hits the moving particles. Light scattering is directly related to structural and morphological properties of the cell while fluorescence emission is proportional to the amount of fluorescent probe bound to the cell or cellular component (Adan et al., 2017). The fluorescence is shown in A.U. (arbitrary units), since absolute quantification can’t be performed without standards. 11 3.1.4 Computer-aided design ViennaRNA Package The ViennaRNA Package consists of a C code library and several stand-alone programs for the prediction and comparison of RNA secondary structures (Lorenz et al., 2011). The main secondary structure prediction tool is RNAfold, which uses a set of thermodynamic energy parameters to compute the minimum free energy (MFE) and back-traces an optimal secondary structure. The minimum free energy structure of a sequence is the secondary structure that is calculated to have the lowest value of free energy. NUPACK: Nucleic Acid Package NUPACK is a software suite for the analysis and design of nucleic acid structures, devices, and systems. The NUPACK web application enables analysis and design of the equilibrium basepairing properties of one or more test tubes of interacting nucleic acid strands (Zadeh et al., 2011). Among others, it can calculate the partition function and minimum free energy (MFE) secondary structure for pseudoknot-free complexes of arbitrary numbers of interacting RNA or DNA strands. NUPACK was used to analyze the folding of the different suboptimal structures obtained with Cody Geary’s RNA design software (Geary et al., 2021) rendered. The structures with the lowest free-energy were selected. KineFold The Kinefold web server provides a web interface to simulate nucleic acid folding paths at the level of nucleation and dissociation of RNA/DNA helix regions (Xayaphoummine, Bucher, & Isambert, 2005). The folding path consists of a discrete series of secondary structures obtained by the successive addition or removal of single helices. An advantage to KineFold is that both pseudoknots and knots are efficiently predicted, as simple geometrical and topological constraints are taken into account. If the simulated molecular time is long enough, a stochastic (randomly determined) simulation like KineFold will eventually find the lowest free energy structures. KineFold was used together with NUPACK to determine and select the lowest freeenergy structures obtained with Cody Geary’s RNA design software. 3.1.5 Chemical synthesis of oligonucleotides This chemical synthesis of oligonucleotides was performed by Integrated DNA Technologies Inc. (IDT). 3.1.6 gBlocks gBlocks Gene Fragments are double-stranded DNA fragments 125–3000 bp in length with a median error rate of less than 1:5000. They are synthesized by Integrated DNA Technologies Inc (IDT). 12 Each gBlocks Gene Fragment goes through a quality control process, which includes size verification by capillary electrophoresis and sequence identification by mass spectrometry. The advantage relies on the fact that each this rigorous testing ensures that most recombinant colonies obtained from cloning each gBlocks Gene Fragment will contain the desired insert. 3.1.7 Golden Gate assembly Golden Gate cloning or Golden Gate assembly is a molecular cloning method that allows the simultaneous and directional assembly of multiple DNA fragments. The enzymes used are the T4 DNA ligase as well as Type IIs restriction enzymes, such as BsaI or BsmBI. This kind of Type II restriction enzymes have the ability to cut DNA outside of their cleavage motifs and, therefore, can create non-palindromic overhangs. Since more than 200 potential overhang sequences are possible, multiple fragments of DNA can be assembled by using combinations of overhang sequences. In practice, this means that Golden Gate cloning is typically scarless. Apart from that, since the assembled product does not have a Type II restriction enzyme cleaving motif, it cannot be cut again, so the reaction is irreversible (Weber, Engler, Gruetzner, Werner, & Marillonnet, 2011). In this project, we followed two different Golden Gate protocols: one for part assemblies and another one for multiple assemblies (both for cassette and multi-gene cassette assemblies). They can be found under the “Supplementary material” section. 3.2 Experimental section 3.2.1 Kill-switch 3.2.1.1 Design of RNA origami tiles Five different RNA origami tiles were designed using the 2H-AE as a starting point (Figure 5). This tile was constructed by Geary et al. (C. Geary, Rothemund, & Andersen, 2014). The DNA constructs, as well as their primers, were designed and assembled in the Benchling™ web tool. Figure 5. 2D blueprint of the 2H-AE tile. Both the 5’ and 3’ ends can be seen in the last (lower) line, since it is a single-stranded (ss) RNA origami tile. Nucleotides depicted as “N” are unspecified. A special software is used to select them and generate different variants, usually based on parameters such as the minimum free energy (MFE). 13 The 2D blueprint of this tile was modified by using the Sublime Text text editor. The first common modification for all designs consisted of adding the iSpinach aptamer on the upper left cornering sequence. First, the base tile was created by adding the sequences of the Csy4 motif in the three selected corners (Figure 6a). Figure 6. 2D blueprint of the RNA origami tile designs. (A) 2H_iSPI_0CSY4 tile, containing only the iSpinach aptamer (sequence highlighted in pink). The three corners for Csy4 motif insertion are numbered from 1 to 3 (white numbers). (B) 2H_iSPI_3CSY4 tile, containing both the iSpinach aptamer and the Csy4 motifs inserted simultaneously in the three possible corners. For this project, ViennaRNA’s secondary structure prediction through energy minimization function was used along with Cody Geary’s RNA design software (Sparvath et al., 2017) to generate suboptimal structures within a given energy range of the optimal energy of the base tile. Thirty different models for the targeted RNA origami tile were obtained. From them, we selected those with the lowest MFE, and analyzed them once more using NUPACK and KineFold. This process was designed in order to select the structure that rendered the lowest MFE in consensus with the three pieces of software, as they employ different methods for calculating this parameter. This rendered the structure shown in Figure 6b, and using this one as a starting point, four other tiles were generated: • • • • • 2H_iSPI_0CSY4, or M0 for short, no Csy4 motifs. 2H_iSPI_3CSY4, Csy4 motifs on the three corners. 2H_iSPI_CSY4_1, or M1, Csy4 motif on corner position number 1 (Figure 6a). 2H_iSPI_CSY4_2, or M2, Csy4 motif on corner position number 2. 2H_iSPI_CSY4_3, or M3, Csy4 motif on corner position number 3. 14 3.2.2.2 Synthesis of RNA structures 1. Sequence synthesis The five designed DNA sequences were ordered from Integrated DNA Technologies Inc. They were synthesized by using the chemical synthesis of oligonucleotides with the phosphoramidite approach. They were amplified in a thermocycler by PCR, which used primers that targeted our designed sequences and also added the prefixes and suffixes later needed for the Golden Gate assembly. 2. Part assembly For the part assembly, the five different PCR products were assembled into an entry vector, pECO85. The method used was the Golden Gate, and five PCR tubes with the mixture specified by the protocol were inserted in the thermocycler and the Golden Gate protocol was run (see “Supplementary material”). 3. Cassette assembly A Golden Gate assembly of the cassettes was performed, using BsaI as the Type II restriction enzyme. Six PCR tubes (one for each RNA cassette design plus the protein cassette) composed of the elements shown in Table 1 were inserted in the thermocycler and the Golden Gate protocol was run (see “Supplementary material”). Table 1. Components for the cassette assembly of the Kill-switch. “Dest. Vector” stands for “Destination Vector” and it’s the plasmid that will act as the backbone of the assembly, which confers resistance to carbenicillin and also carriers a marker: pECO79, mCherry and pECO83, mVenus. There are two different constructs: the RNA cassette and the Protein cassette. For the RNA cassette, five different RNA cassettes were formed, as the P3 column is variable and consists of the part assembly of each of the tiles. Its P1 and P4 contain the promoter and terminator sequences, respectively. For the protein cassette, just one assembly was formed, as it matches all five RNA cassettes. Its columns contain DNA sequences for: P1, promoter; P2, ribosome binding site; P3, chloramphenicol resistance; P4; Csy4 and P5, terminator. Next, NEB® Turbo Competent E. coli (High Efficiency) cells were transformed with these constructs. In order to do so, 40 microliters of Turbo® cells were added to each of the PCR tubes containing the assemblies and the transformation protocol of the company was run on the thermocycler. Once finished, 150 microliters of SOC media were added to each tube and incubated in the incubator at 37 ºC and 225 rpm for one hour. The strains were then plated and selected on carbenicillin plates. They were left to grow overnight. Next, one colony from each design was piqued and liquid cultured in falcon tubes (5 ml LB media and 5 microliter of chloramphenicol). They were incubated overnight at 37 ºC and 225 rpm. One day after, the plasmid DNA was purified (Macherey-Nagel Inc.’s kit and protocol) and sent for sequencing. 15 IDT’s sequencing service confirmed the correct assembly of four of the designs. However, it was noticed that the RNA origami tile with Csy4 motifs in all three corners, 2H_iSPI_3CSY4, rendered an incorrect cassette assembly. The experiment was repeated for this construct, but the assembly was incorrect yet again. Therefore, the experiments proceeded forward without this design. 4. Multi-gene cassette assembly A Golden Gate multi-gene assembly of the cassettes was performed, using Esp3I as the Type II restriction enzyme and pECO84 as the backbone. pECO84 contains the IPTGinducible standard promoter J23100. Four PCR tubes with the specified mixture (Table 2) were inserted in the thermocycler and the Golden Gate protocol was run (see “Supplementary material”). Table 2. Components for the multi-gene cassette assembly of the Kill-switch. “Dest. Vector” stands for “Destination Vector” and it’s the plasmid that will act as the backbone of the assembly, which confers resistance to chloramphenicol and also carriers a marker: pECO84, mVenus. The rest of the columns (P1-2) are the DNA sequences, inserts, assembled into the backbone. The P1 element is variable and consists of the cassette assembly of each of the tiles. The P2 element, the protein cassette, is the same for all the cassettes. Next, NEB® Turbo Competent E. coli (High Efficiency) cells were transformed with these constructs. In order to do so, 40 microliters of Turbo® cells were added to each of the PCR tubes containing the assemblies and the transformation protocol of the company was run on the thermocycler. Once finished, 150 microliters of SOC media were added to each tube and incubated in the incubator at 37ºC and 225 rpm for one hour. The strains were then plated and selected on chloramphenicol plates. They were left to grow overnight. Next, one colony from each design was piqued and liquid cultured in falcon tubes (5ml LB media and 5 microliters of chloramphenicol). They were incubated overnight at 37ºC and 225 rpm. One day after, the plasmid DNA was purified (Macherey-Nagel Inc.’s kit and protocol) and sent for sequencing. IDT’s sequencing service confirmed the correct assembly of the four designs. Then, JM109(DE3) cells were transformed with these constructs. In order to do so, 40 microliters of JM cells were added to each of the PCR tubes containing the assemblies and the transformation protocol of the company was run on the thermocycler. Once finished, 150 microliters of SOC media were added to each tube and incubated in the incubator at 37 ºC and 225 rpm for one hour. The strains were then plated and selected on chloramphenicol plates. They were left to grow overnight. 5. Flow cytometry First, each well of the first two rows of a 72-well plate was filled with 300 microliters of LB media and 0.3 microliters of chloramphenicol (pipetted from a master mix of 10 ml LB and 10 microliters of chloramphenicol). 16 Three colonies were piqued from the plates of each design and placed each in a well of the first row. A membrane was fixed over the plate, allowing oxygen to pass through, but preventing contamination. The plate was incubated in the heat block overnight at 37 ºC and 900 rpm. Next day, 6 microliters from each well in the first row were added to their respective wells in the second row (both wells, donor and acceptor, sharing the same column). The plate was left to incubate in the heat block for 2 hours at 37 ºC and 900 rpm. Next, half of the second row’s liquid content was transferred to the third row. To both second and third row, the iSpinach’s fluorophore was added, DFHBI (10 μM final concentration), and to the third row only, 0.3 microliters of IPTG (1 M) were added in each well, in order to induce the expression of the constructs. Again, the plate was left to incubate in the heat block this time for 4 hours at 37 ºC and 900 rpm. Afterwards, the flow cytometry was performed, and the fluorescent emission data was recorded. 3.2.2 Negative feedback loop 3.2.2.1 Design of RNA constructs Five different RNA constructs for cell transformation were designed following the sequence structure showcased in Figure 7. The Csy4 binding motifs were designed after those that Sternberg et al. employed in order to analyze the binding affinities of dCsy4 to several modified motifs (Figure 8) (Sternberg et al., 2012). The RNA constructs, as well as the primers for mutation and amplification, were designed and assembled in the Benchling™ web tool. Figure 7. Schematic representation of the RNA constructs’ elements and the mechanism behind the negative feedback loop. From left to right in the DNA scheme (bottom): black arrow, IPTG inducible promoter; brown half-circle, pET3a ribosome binding site (RBS); AUG box, AUG start codon; pink stem-loop, one of the five different stem-loop designs for the Csy4 motif; colored arrow, gene for the deactivated Csy4 (dCsy4); black T, terminator. After transcription, the mRNA of the dCsy4 can follow two routes. First, it can be translated into dCsy4. Second, the already existing dCsy4 can bind to its motif (in the mRNA), thus blocking the translation of more dCsy4, and generating a negative feedback loop. 17 Figure 8. Csy4 binding motifs used in the design of the negative feedback loop. Extracted and adapted from (Sternberg et al., 2012). 3.2.2.2 Synthesis of RNA structures 1. Sequence synthesis The five designed DNA sequences were ordered from Integrated DNA Technologies Inc. They were synthesized by using the chemical synthesis of oligonucleotides with the phosphoramidite approach. The plasmid pBP120 was used as a template and the designed primers were used to mutate (add the specific Csy4 motif for each design) and amplify the DNA construct. They were amplified in a thermocycler by PCR, which used primers that targeted our designed sequences and also added the prefixes and suffixes later needed for the Golden Gate assembly. 2. Part assembly For the part assembly, the five different PCR products were assembled into an entry vector, pECO83. The method used was the Golden Gate, and five PCR tubes with the mixture specified by the protocol were inserted in the thermocycler and the Golden Gate protocol was run (see “Supplementary material”). 3. Cassette assembly A Golden Gate assembly of the cassettes was performed, using BsaI as the Type II restriction enzyme. Five PCR tubes with the specified mixture (Table 3) were inserted in the thermocycler and the Golden Gate protocol was run (see “Supplementary material”). Table 3. Components for the cassette assembly of the Negative feedback loop. “Dest. Vector” stands for “Destination Vector” and it’s the plasmid that will act as the backbone of the assembly, which confers resistance to carbenicillin and also carriers a marker. For the pECO83, this marker is mVenus. The rest of the columns (P1-5) are the DNA sequences, which contain: P1, promoter; P2, Chloramphenicol resistance; P3, dCsy4 and P4, terminator. The P2 column is variable and consists of the ribosome binding site together with the part assembly of each of the DNA constructs with a different Csy4 motif. Next, NEB® Turbo Competent E. coli (High Efficiency) cells were transformed with these constructs. In order to do so, 40 microliters of Turbo® cells were added to each of 18 the PCR tubes containing the assemblies and the transformation protocol of the company was run on the thermocycler. Once finished, 150 microliters of SOC media were added to each tube and incubated in the incubator at 37ºC and 225 rpm for one hour. The strains were then plated and selected on carbenicillin plates. They were left to grow overnight. Next, one colony from each design was piqued and liquid cultured in falcon tubes (5ml LB media and 5 microliter of chloramphenicol). They were incubated overnight at 37ºC and 225 rpm. One day after, the plasmid DNA was purified (Macherey-Nagel Inc.’s kit and protocol) and sent for sequencing. IDT’s sequencing service confirmed the incorrect assembly of 5 of the designs. The cassette assembly was repeated following the same procedure, yet the results were still unsatisfactory. Therefore, we made use of the gBlock Gene Fragment service from Integrated DNA Technologies Inc. to generate the initial mRNA parts. Then, rerunning the experiment generated the cassette assembly with the correct sequence. 4. Multi-gene cassette assembly A Golden Gate multi-gene assembly of the cassettes was performed, using Esp3I as the Type II restriction enzyme and pECO84 as the backbone. pECO84 contains the IPTGinducible standard promoter J23100. Five PCR tubes with the specified mixture (Table 4) were inserted in the thermocycler and the Golden Gate protocol was run (see “Supplementary material”). Table 4. Components for the multi-gene cassette assembly of the Kill-switch. “Dest. Vector” stands for “Destination Vector” and it’s the plasmid that will act as the backbone of the assembly, which confers resistance to chloramphenicol and also carriers a marker: pECO84, mVenus. The rest of the columns (P1-2) are the DNA sequences, inserts, assembled into the backbone. The P1 element is constant and consists of the pMN318 plasmid, which contains the dCsy4 protein cassette. The P2 element is variable and consists of the cassette assembly of each of the tiles. Next, NEB® Turbo Competent Escherichia coli (High Efficiency) cells were transformed with these constructs. In order to do so, 40 microliters of Turbo® cells were added to each of the PCR tubes containing the assemblies and the transformation protocol of the company was run on the thermocycler. Once finished, 150 microliters of SOC media were added to each tube and incubated in the incubator at 37ºC and 225 rpm for one hour. The strains were then plated and selected on chloramphenicol plates. They were left to grow overnight. Next, one colony from each design was piqued and liquid cultured in falcon tubes (5ml LB media and 5 microliters of chloramphenicol). They were incubated overnight at 37ºC and 225 rpm. One day after, the plasmid DNA was purified (Macherey-Nagel Inc.’s kit and protocol) and sent for sequencing. IDT’s sequencing service showed the incorrect assembly of the five designs. Due to the corona restrictions, this part of the experiment was ended here. 19 4 Results and discussion 4.1 Kill-switch The expression level of the Csy4 was measured with the help of mScarlet-I-H, a small red fluorescent protein co-expressed with the enzyme. Seemingly, the expression and correct folding of the RNA origami was measured with the iSpinach-H aptamer (inserted in the origami), which binds to the DFHBI green fluorescence molecule. Based on our initial hypothesis, the mScarlet-I-H fluorescence, that is, the Csy4 expression should be equal for all the constructs in the induced and uninduced experiments, respectively. This is because its expression only depends on the promoter and its strength, and this was common for all constructs. From the data shown in Figure 9 and Table 5, we conclude that the ratios of Csy4 expression between the induced and uninduced samples were significant in all the constructs, therefore the inducible promoter worked appropriately. Figure 9. mScarlet-I-H fluorescence arbitrary values (A.U.) obtained in the flow cytometry. The results for the four different tile designs are displayed as a mean of the three colonies each construct had. For each, the blue bar on the left corresponds to the uninduced sample while the orange bar on the right corresponds to the sample induced with IPTG. See original in Supplementary material, Figure S2. However, a clear difference can be spotted between the uninduced expression of M0 and M1 versus the one of M2 and M3, being the former one much higher (Figure 9). Regarding the induced expression, an increasing trend can be deduced, being M0 the sample with the lowest expression, then M1, followed by M2 and finally M3, with the highest expression value. These events are most surprising, since the induced and uninduced mScarlet-I-H fluorescence values among groups were expected to be equal. First, the unsuccessful constitutive expression specific to the M0 and M1 samples is hypothesized to have been caused by errors during the DNA construct assembly process. Therefore, when extracting conclusions, a uniform and constant distribution of uninduced 20 expression will be assumed for the four constructs. Second, this experimental error has also affected the mScarlet-I-H ratios in Table 5, which will be assumed to have grown linearly from M0 to M3. Table 5. Ratios by construct and fluorescence type of induced (numerator) and uninduced (denominator) samples. Now, in order to explain the fluorescence differences among induced groups, a new theory was hypothesized. It is thought that some RNA origamis misfolded, forming aggregates that compromised cell viability, thus reducing the amount of fluorescence of the cell population. This effect was not appreciated in the uninduced groups, due to a lower expression yield of the RNA origamis. Thus, M3 is speculated to show the expected red fluorescence value for the induced groups (very little or null cellular burden), so its construct is the one showing the best folding, closely followed by M2. On the contrary, M0 and M1 would show clear effects of aggregation. Since all the constructs share most of the components, including the inducible promoter, and the construct-specific sequences are of similar length, the differences in folding could be mainly accounted for the different spatial positioning of the Csy4 binding motifs inside the RNA origami. To analyze this effect, we will integrate this data with the measurements of the iSpinach flow cytometry (Figure 10), since this is an indicator of the Csy4 cleaving activity, which could potentially play an important role in preventing aggregation. This is because, once cleaved, the RNA origami is likely to be degraded and cause no harm, nor accumulate. iSpinach-H is the aptamer that binds and allows DFHBI to emit green fluorescence. The iSpinach-H motif is shared by all the constructs. Basically, when the RNA origami is intact, the aptamer can bind to DFHBI and allow it to emit green fluorescence. As the origami is cleaved by Csy4 and degraded, the green fluorescence is expected to decrease. From the data shown in Figure 10, we conclude that the ratios of the RNA origami expression between the induced and uninduced samples was significant, therefore the inducible promoter worked appropriately. A uniform and constant distribution of uninduced expression can also be observed along the four constructs. 21 Figure 10. iSpinach-H fluorescence arbitrary values (A.U.) obtained in the flow cytometry. The results for the four different tile designs are displayed as a mean of the three colonies each construct had. For each, the blue bar on the left corresponds to the uninduced sample while the orange bar on the right corresponds to the sample induced with IPTG. See original in Supplementary material, Figure S3. With regards to the induced RNA origami expression and correct folding (Figure 10), an increasing trend can be deduced, being M0 the sample with the lowest expression, then M1, followed by M2 and finally M3. It is remarkable that this trend is highly similar to the one mScarlet-I-H showed in Figure 9. Integrating the data from Figure 9 and Figure 10, it is believed that the reason why they both follow a similar trend could be explained by fluorescent leakage. More specifically, looking at the relative fluorescent values, those of mScarlet-I-H are significantly bigger than those of iSpinach-H. Therefore, it was hypothesized that in Figure 10, it is actually the leakage of the mScarlet-I-H red fluorescence that plays a major role in determining the green fluorescence seen in each sample. In order to support that hypothesis, an analysis of the excitation and emission ranges and values of both fluorescent molecules was performed (Figure 11). Focusing on the green fluorescence detection value (X abs), there is indeed a possibility that the fluorescence emitted by mScarlet-I-H could be interfering with it. Figure 11 also allows us to affirm there is no possible interference in the red fluorescence detection value, so we can deduce that the whole signal detected in Figure 10 corresponds to mScarlet-I-H. 22 Figure 11. Excitation and emission spectra of two fluorescent molecules, integrated with the laser characteristics of the flow cytometer. Spectra (Y axis) are shown as the percent of the maximum excitation and emission fluorescence. The wavelength is portrayed in the X axis (in nanometers). Excitation spectra are represented by the peak on the left, while emission spectra, by the peak on the right. The numbers show the exact value of the peaks in nanometers. Seemingly, below the X axis, the number on the left shows the flow cytometer’s channels’ excitation value (Ex), while the one on the right shows the interval of emission values the detectors captured (Em) as a central number followed by the half-length of the interval. The two fluorophores are: (A) mScarlet-I-H, excitation (orange) and emission (red). Modified from (Song, Strack, Svensen, & Jaffrey, 2014). (B) Spinach2–fluorophore (DFHBI), excitation (dotted line) and emission (solid line). Modified from (Bindels et al., 2016). Therefore, taking into account that the green fluorescence detection along constructs (Figure 10) follows a highly similar trend to the one by the red fluorescence detection (Figure 9), and that the fluorescence output of the former is significantly bigger, it is concluded that the fluorescence emission of iSpinach-H has been either very low or null. Since this phenomenon occurs along the four constructs, the lack of iSpinach-H and DFHBI derived fluorescence could be explained by a general tendency for the backbone of the construct to misfold and make the iSpinach-H sequence inaccessible to DFHBI. This is consistent with our previous hypothesis. The Csy4 catalytic activity can still be appreciated thanks to the fact that the values for the mScarlet-I-H fluorescence of the induced samples vary greatly among constructs. Nevertheless, there is one unexplained difference in trend between Figure 9 and Figure 10, which lies within the uninduced M0 and M1 constructs. While in Figure 9, their red fluorescence was almost null (which led us to think an assembly error could be the cause), 23 in Figure 10, their values of green fluorescence are constant and uniform to the rest of the uninduced construct population Therefore, this could be an indicator that, even if very small quantities, there was some green fluorescence coming from the iSpinach-H-DFHBI complex. This effect was more clearly shown in the M0 and M1 uninduced constructs, so it could indicate that having no Csy4 motifs, or having one in position number 1, allows for a better folding. This improved folding, however, can only be noticed among the uninduced samples. This leads us to believe that when the amount of RNA origami expressed increases, the chances of misfolding and aggregation does as well. Taking everything into account, the following hypothesis were stated. First, it is believed that the constructs achieve a similar misfolded state. Second, the binding affinity of each motif to Csy4 is the same: it is the spatial arrangement of the RNA origami that affects Csy4 binding and cleavage. Third, this means that in the misfolded state, the locations of the motifs from M3 and M2 are exposed and accessible to the Csy4, while the one for M1 remains inaccessible. Fourth, the exposed motifs are cleaved, and the RNA origami degraded, avoiding accumulation of aggregates, reducing cell burden and allowing mScarlet-I-H expression to be adequate. Now we will look at the cleaving mechanism of Csy4, and test if our hypothesis is consistent. Csy4 only cleaves the RNA in one spot (Figure 12), therefore the cleaved motifs will remain bound to the scaffold. Figure 12. Csy4 cleaves within pre-crRNA repeat sequences (black) to generate mature crRNAs that contain a spacer sequence (colored line) flanked by fragments of the repeat. The substrate sequence and cleavage site (red triangle) are indicated above. Extracted from (Sternberg et al., 2012). What is more, Csy4 lacks the ability to engage in multiple-turnover catalysis, since it remains bound to the product. Sternberg et al. confirmed this by showing that the overall yield of the cleavage reaction remained directly proportional to the Csy4 concentration when present in sub-stoichiometric amounts relative to the substrate, even with incubation times 200-fold longer than the reaction time constant. Even if the Csy4 remains bound, the cleavage allows endonucleases to start degrading the RNA origami, therefore our hypothesis is still consistent. In order to test it, as well as our initial hypothesis, the experiment design should be improved and repeated. 24 With this aim, two new objectives are set for the new experiment: allowing the correct folding of the RNA origami and improving the cleaving conditions for the Csy4. Csy4 optimal cleaving conditions A high improvement in the Csy4 cleaving will eliminate the variation that could stem from partial cleaving, and make it so the differences observed in iSpinach-H fluorescence be mainly due to the spatial location of the motifs. Sternberg et al. performed some experiments to determine which is the most optimal substrate-enzyme ratio for Csy4. It was shown to be the 2:1 ratio, while the 1:1 ratio leads to approximately half of the RNA substrate being cleaved. Sternberg et al. were surprised by this fact, and stated it could be due to partial specific activity of the purified enzyme. With regards to our project, it is hypothesized that choosing a 1:1 ratio has also decreased binding efficiency and cleavage, so a 2:1 would be more likely to show the differences between constructs. RNA origami design Another factor to improve is the RNA origami design. This should prevent any misfolding from occurring, so the iSpinach-H will be exposed and able to bind DFHBI. Two steps are proposed: 1. Remove the iSpinach-H aptamer from the original position. 2. Add several iSpinach-H aptamers close to the Csy4 cleaving motifs. In our base design, the iSpinach-H aptamer is thought to play a destabilizing role in that position, and base on our conclusion, it also ends up being inaccessible to its fluorophore. Therefore, it is suggested that a new design is made. This design should be based on a bigger RNA origami scaffold, where it is possible to locate Csy4 cleaving motifs in all four corners, and an iSpinach-H aptamer next to two of them. A design based again on the 2H-AE tile is shown in Figure 13, mainly for explanatory purposes. From the experiment it was concluded that M3 and M2 (positions number 2 and 3, respectively) showed the best cleaving efficiency. Therefore, the new iSpinach-H sequences will find their location there. Figure 13. 2D blueprint of the new RNA origami tile designs. The four corners for Csy4 motif insertion are numbered from 1 to 4 (white numbers). The design contains the Csy4 motifs (yellow) inserted simultaneously in the four possible corners, and an iSpinach-H aptamer sequence (pink) next to two of them. 25 From the base design Figure 13 provides, the six different constructs can be inferred. They will all will have the iSpinach-H sequences, but the Csy4 motifs will vary: • • • • • • N0, no Csy4 motifs. N1, Csy4 motif on corner position number 1 (see Figure 13). N2, Csy4 motif on corner position number 2. N3, Csy4 motif on corner position number 3. N4, Csy4 motif on corner position number 4. N5, Csy4 motifs on the four corners. N0 will act as a control for maximum fluorescence and correct folding, given it has no cleaving sites. N5 will act as a minimum fluorescence control, given it has all the possible cutting sites. For this former construct, the amount of Csy4 co-expressed will have to be increased, due to the already mentioned inability of Csy4 to perform multiple-turnover catalysis. In N2 and N3, the iSpinach-H aptamer close to them will be a clear indicator of the cleaving effects, while the other iSpinach-H will provide a control-like function, given it is shared among all the constructs. The fluorescence of the first is expected to decrease, while the fluorescence of the former should be maintained. In N1 and N4, the effects of RNA origami degradation after Csy4 cleavage will be tested. Our hypothesis is that the cleavage in these constructs will not directly affect iSpinach fluorescence, but this will decrease with time as the origami is degraded. Applications The building of a programmable switch that can lead to cell death has multiple applications in the area of the biosciences. In fact, employing it as a kill-switch is just one of the many possibilities. To start with, assembled multidimensional RNA structures have been used as scaffolds for the spatial organization of bacterial metabolism (Delebecque, Lindner, Silver, & Aldaye, 2011). Delebecque et al. assembled engineered RNA modules into onedimensional and two-dimensional scaffolds with different protein-docking sites and used them to control the spatial organization of a hydrogen-producing pathway. Thus, rationally designed RNA assemblies can be used to construct functional architectures in vivo. With this approach, not only spatial, but also time (sequential) control would be a possibility. This could be achieved by incorporating aptamers, ribozymes or mRNA between the Csy4 cutting motifs in the RNA origami. These motifs, depending on their affinity, will be sequentially cut and liberated by the Csy4 enzyme. As the Csy4 concentration increases, the cleavage of the motifs with the smaller binding affinity will be a more likely event. Sequence order could also be controlled by employing different RNA cutting endonucleases (in sequence) with their respective motifs, or alternatively, by modifying the Csy4 and generating variants with significantly different binding motifs. Being able to control the sequential liberation of functional RNA constructs has several potential applications: 26 • • • • • • Metabolic engineering of enzymatic routes. The RNA origami could contain the mRNAs to codify for different enzymes, or for sequences that perform stranddisplacement in their target mRNA and allow/repress translation. Therefore, the expression of the enzymes that take part in a metabolic pathway could be regulated and coordinated. It would then be possible to express synthetic enzymes that pose a burden to the cells, and reduce their negative effects by controlling their window of expression. Building logical gates and circuits. In a RNA origami, the output of one step can be the input of the next one, but to allow for a more precise sequential control, the following mechanism can be implemented: imagine there are three sequential outputs/steps that need to happen in a specific order, being the first one the cleavage by Csy4. In second step, a ribozyme that cleaves the Csy4 binding motifs is released as a secondary output. This will prevent Csy4 from cleaving the residual motifs (thus restarting the 1st step) when the 3rd step is taking place. What is more, the RNA origami could self-regulate its expression via a feedback loop, such as the negative feedback loop proposed in this project. Providing a multicomponent ordered response to an input. All the different liberated outputs could be independent inputs that altogether lead the cell to trigger a complex process. An example could be apoptosis, and the fact that sequential logic gates can be possible inside the RNA allow for the potential design of a kill-switch with a timer. Simultaneously making use of the different functions RNA sequences can perform: cleaving, binding, sequestering, strand-displacement. Provide a safer approach than using DNA in the nucleus. Viable in the cytoplasm. Taking everything into account, the RNA origami would be similar to a microchip that has a program encoded in it. It can be directly expressed inside the cells and can use cellular components as inputs. There is a rising need to finding ways to prevent genetically modified bacteria from spreading into the wider environment, where they might grow and cause harm. The kill-switch approach proposes modifying the cells so that in order to survive, they will need to be a provided by a compound that cannot be synthesized or found in nature. Another option would be to incorporate logical gates into the organisms, such as sets of modular transcription factors that contain separate domains for sensing small molecules (the inputs) and for, in accordance, regulating gene expression. Therefore, the small molecule inputs are linked to the control of a specific promoter for gene expression. If the right inputs are not present, a toxin will be expressed and the cell will die (Chan, Lee, Cameron, Bashor, & Collins, 2016). This could also find an application in intellectual property protection, as unauthorized growth of strains without the appropriate passcode molecules would induce cell death (Chan et al., 2016). A timer could be included in a kill-switch. This would be done by inserting sequential steps of a certain time length between the input and the output, being the former the apoptosis signal. 27 4.2 Negative feedback loop We were unable to obtain results of our own, therefore we will discuss our hypothesis and integrate it with the existing bibliography. Csy4 binding mechanism Regarding the stem-loop recognized by Csy4, a longer stem decreases binding to Csy4, which means that one should add single-stranded regions with bulges (Sternberg et al., 2012). However, this is not a necessarily negative feature when building a negative loop, since different binding affinities will lead to different behaviors of the loop. Our aim with the experiment was to characterize the different negative feedback loop models in vivo. Our hypothesis was based on Sternberg et al.’s results for the binding affinities of the different Csy4 motifs (Figure 14), and stated that, since the binding affinity varies across motifs, the self-regulating effect dCsy4 causes would vary accordingly. Figure 14. Binding data and cleavage site mapping for base-pair insertion constructs. RNA substrates containing base-pair insertions below the terminal C–G base pair were generated (left), and electrophoretic mobility shift assays were performed with Csy4-H29A (right). Binding defects were mildest for one or two A–U base-pair insertions (~1- and ~10-fold) and increased to ~200and ~800-fold for one or two G–C base pairs, respectively. The cleavage site for each RNA substrate is indicated with a red triangle. Extracted from (Sternberg et al., 2012). Sternberg et al. also concluded that Csy4 recognizes a crRNA repeat containing either a GUAUA or a GUGUA loop. This is expected because both are likely to adopt GNR(N)A penta-loop folds. GNRA is a tetranucleotide sequence usually found capping hairpin stems (N is any base and R is a purine) (Dascenzo, Leonarski, Vicens, & Auffinger, 2017). GNRNA is its pentanucleotide sequence version. Characterization of the negative feedback loop The negative feedback loop is expected to oscillate: initially, the low levels of dCsy4 will allow its own mRNA to undergo translation, then a point will arrive where the dCsy4 are high enough to block its own translation. This is because the dCsy4 binds to its own mRNA, making the RBS inaccessible for the ribosome. Then, dCsy4 levels will decrease once more, allowing for more translation of dCsy4 and continuing the cycle. This is 28 expected to eventually achieve a balance in medium expression level, as in a damped oscillation, shown in Figure 15. Figure 15. Schematic representation of the damped oscillation mechanism caused by the negative feedback loop that describes the variation of expression levels for dCsy4. The Y axis is a relative and hypothetical measure of dCsy4 expression, while the X axis represents time. Initially, the levels of dCsy4 are low, so its translation will be unblocked (1st maximum, see unblocked dCsy4 mRNA). Then, as the dCsy4 concentration increases, it will block its own translation, leading to a decrease in concentration (1st minimum, see dCsy4 bound to the stem-loop in its mRNA). When the dCsy4 levels are low enough, translation will restart and the oscillation will be repeated. However, it is expected that eventually blocked and unblocked dCsy4 mRNAs will achieve an equilibrium, so the oscillations will be smaller as time goes by (see coexisting blocked and unblocked dCsy4 mRNAs). This oscillator mechanism can be altered by including a new element in the system: an RNA origami with binding motifs to Csy4. The goal of this RNA origami is to compete with the mRNA of the dCsy4. Thus, when the RNA origami is expressed, the dCsy4 will be sequestered by it, so it won’t bind its own mRNA nor prevent its translation. As the levels of RNA origami increase, the maximum dCsy4 levels will increase as well. However, eventually the dCsy4 levels will be high enough to saturate the RNA origami, and start binding to its own mRNA once more. This will lead to a stabilization in a high level of dCsy4 expression (see Figure 16). 29 Figure 16. Schematic representation of the dCsy4 sequestering and stabilizing mechanism of the RNA origami. The Y axis is a relative and hypothetical measure of dCsy4 expression, while the X axis represents time. The RNA origami is represented by a trapezoid, with dCsy4 bound to it. Initially, the levels of dCsy4 start growing, as the RNA origami will sequester the expressed dCsy4, thus preventing its negative self-regulation. When the levels are high enough for dCsy4 to leak (RNA origami are saturated) and block its own translation, an equilibrium will be achieved in the dCsy4 expression. Applications A process that alternates between activation and inhibition allows for a finer control, in this case, of enzymatic levels. In this negative feedback loop, the activated state is the base one, and it is the dCsy4 that acts as the inhibiting mechanism. Negative feedback oscillators exist in nature. Two examples are the circadian gene expression in some cyanobacteria and the cyclic binding of cofactors to the estrogensensitive pS2 promoter. In the former one, a coordinated sequence of binding and unbinding events modifies the DNA packing and nucleosome structure to enable transcription to proceed (Pigolotti, Krishna, & Jensen, 2007). In a broader sense, this autonomous and clock-like regulatory circuit promotes repeatability and reduces variation. In synthetic biology, these are two important features when designing logical circuits, and especially, when programming them to act in a synchronic or cyclical fashion. Therefore, it could allow a new type of input to be used when programming life: time. Had the experiment resulted successful, the specific features and parameters of the oscillator could have been analyzed by algorithms such as the one proposed by Pigolotti et al. (Pigolotti et al., 2007). This algorithm investigates the oscillating pattern by dividing the data set into intervals whose ends are determined by the occurrence of an extremal value of a variable. In our case, the oscillating pattern is seen in the dCsy4 expression and the external variable would be the amount of dCsy4 expressed. 30 5 Conclusion Kill-switch The fluorescence of mScarlet-I-H is directly proportional to its expression and therefore, that of Csy4. These have both been affected by the incorrect folding of the RNA origami, which leads to the formation of aggregates. These aggregates, when in high numbers, promote the loss of fitness (reduced growth speed and protein production), which is why the expression of Csy4 and mScarlet-I-H is shown to have diminished in some constructs (M0 and M1). All the constructs are thought to have misfolded, since none had the iSpinach aptamer accessible to DFHBI. However, depending on the location of the Csy4 motif, some of them remained accessible to Csy4, which cleaved them, promoting the degradation of the RNA origami and preventing aggregation. Thus, these constructs (M3 and M2) rendered highest mScarlet fluorescence. For future experiments, the RNA tiles will be redesigned to allow the correct folding of the RNA origami, as well as the accessibility of the functional motifs. What is more, optimal cleavage conditions for Csy4 will be sought, by, for example, increasing the substrate-enzyme ratio to 2:1. All in all, rationally designed RNA assemblies can be used to construct functional architectures in vivo. With this approach, not only spatial, but also time (sequential) control would be a possibility. This could be achieved by incorporating aptamers, ribozymes or mRNA between the Csy4 cutting motifs. Therefore, these RNA origamis could form logical gates, allowing the triggering of complex cell processes, such as the induction of apoptosis (kill-switch). Negative feedback loop Unfortunately, the experimental procedure didn’t render any viable results. The experimental error that led to this is believed to be an incorrect Golden Gate assembly for the multi-gene cassettes. Therefore, this assembly should be performed once more, and if the desired result is not achieved, the design of the constructs should be questioned. Once the experimental errors are corrected, the protocol will be run once more. The expected result will be a negative feedback loop, which will behave as a damped oscillator. The oscillating pattern would be the variation of dCsy4 expression with time, while the external explanatory variable would be the free levels of dCsy4. This is because the dCsy4 binds to its own mRNA, near the RBS, thus preventing the ribosome from translating. An extra factor that can set a difference between the dCsy4 expressed and free levels is the expression of RNA origamis containing dCsy4 binding motifs. Without them, the oscillator will eventually stabilize in medium dCsy4 expression levels. In their presence, however, more dCsy4 will be able to be expressed before triggering the negative regulation mechanism, thus achieving an equilibrium in higher levels of dCsy4 expression. In conclusion, the designed clock-like and autonomous regulatory circuit is a valuable element for synthetic biology. By making time a new input, not only could computational operations in living systems be controlled by means of space, but also by time. This circuit 31 has the potential to change its oscillatory pattern (RNA origami), act in a synchronized manner, be delayed or become cyclical. 6 References Adan, A., Alizada, G., Kiraz, Y., Baran, Y., & Nalbant, A. (2017). Flow cytometry: basic principles and applications. Critical Reviews in Biotechnology. https://doi.org/10.3109/07388551.2015.1128876 Bindels, D. S., Haarbosch, L., Van Weeren, L., Postma, M., Wiese, K. E., Mastop, M., … Gadella, T. W. J. (2016). MScarlet: A bright monomeric red fluorescent protein for cellular imaging. Nature Methods. https://doi.org/10.1038/nmeth.4074 Chan, C. T. Y., Lee, J. W., Cameron, D. E., Bashor, C. J., & Collins, J. J. (2016). “Deadman” and “Passcode” microbial kill switches for bacterial containment. Nature Chemical Biology. https://doi.org/10.1038/nchembio.1979 Chappell, J., Takahashi, M. K., & Lucks, J. B. (2015). Creating small transcription activating RNAs. Nature Chemical Biology. https://doi.org/10.1038/nchembio.1737 Chen, J. L., & Greider, C. W. (2005). Functional analysis of the pseudoknot structure in human telomerase RNA. Proceedings of the National Academy of Sciences of the United States of America. https://doi.org/10.1073/pnas.0502259102 Chen, Y., & Varani, G. (2010). RNA Structure. In Encyclopedia of Life Sciences. https://doi.org/10.1002/9780470015902.a0001339.pub2 Dascenzo, L., Leonarski, F., Vicens, Q., & Auffinger, P. (2017). Revisiting GNRA and UNCG folds: U-turns versus Z-turns in RNA hairpin loops. RNA. https://doi.org/10.1261/rna.059097.116 Delebecque, C. J., Lindner, A. B., Silver, P. A., & Aldaye, F. A. (2011). Organization of intracellular reactions with rationally designed RNA assemblies. Science. https://doi.org/10.1126/science.1206938 Dirks, R. M., Lin, M., Winfree, E., & Pierce, N. A. (2004). Paradigms for computational nucleic acid design. Nucleic Acids Research. https://doi.org/10.1093/nar/gkh291 Geary, C., Rothemund, P. W. K., & Andersen, E. S. (2014). A single-stranded architecture for cotranscriptional folding of RNA nanostructures. Science. https://doi.org/10.1126/science.1253920 Geary, C. W., & Andersen, E. S. (2014). Design principles for single-stranded RNA origami structures. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). https://doi.org/10.1007/978-3-319-11295-4_1 Isaacs, F. J., Dwyer, D. J., & Collins, J. J. (2006). RNA synthetic biology. Nature Biotechnology. https://doi.org/10.1038/nbt1208 Jalali, M., Zaborowska, J., & Jalali, M. (2017). The Polymerase Chain Reaction: PCR, qPCR, and RT-PCR. In Basic Science Methods for Clinical Researchers. https://doi.org/10.1016/B978-0-12-803077-6.00001-1 Jones, C. P., & Ferré-D’Amaré, A. R. (2015). RNA quaternary structure and global symmetry. Trends in Biochemical Sciences. 32 https://doi.org/10.1016/j.tibs.2015.02.004 Lorenz, R., Bernhart, S. H., Höner zu Siederdissen, C., Tafer, H., Flamm, C., Stadler, P. F., & Hofacker, I. L. (2011). ViennaRNA Package 2.0. Algorithms for Molecular Biology. https://doi.org/10.1186/1748-7188-6-26 Mallikaratchy, P. (2017). Evolution of complex target SELEX to identify aptamers against mammalian cell-surface antigens. Molecules. https://doi.org/10.3390/molecules22020215 Pigolotti, S., Krishna, S., & Jensen, M. H. (2007). Oscillation patterns in negative feedback loops. Proceedings of the National Academy of Sciences of the United States of America. https://doi.org/10.1073/pnas.0610759104 Rivas, E., & Eddy, S. R. (1999). A dynamic programming algorithm for RNA structure prediction including pseudoknots. Journal of Molecular Biology. https://doi.org/10.1006/jmbi.1998.2436 Rothemund, P. W. K. (2006). Folding DNA to create nanoscale shapes and patterns. Nature. https://doi.org/10.1038/nature04586 Saito, H., & Inoue, T. (2009). Synthetic biology with RNA motifs. International Journal of Biochemistry and Cell Biology. https://doi.org/10.1016/j.biocel.2008.08.017 Seeliger, J. C., Topp, S., Sogi, K. M., Previti, M. L., Gallivan, J. P., & Bertozzi, C. R. (2012). A Riboswitch-Based inducible gene expression system for mycobacteria. PLoS ONE. https://doi.org/10.1371/journal.pone.0029266 Soediono, B. (1989). Alberts - Molecular Biology Of The Cell 4th Ed. Journal of Chemical Information and Modeling. Song, W., Strack, R. L., Svensen, N., & Jaffrey, S. R. (2014). Plug-and-play fluorophores extend the spectral properties of spinach. Journal of the American Chemical Society. https://doi.org/10.1021/ja410819x Sparvath, S. L., Geary, C. W., & Andersen, E. S. (2017). Computer-aided design of RNA origami structures. In Methods in Molecular Biology. https://doi.org/10.1007/978-14939-6454-3_5 Sternberg, S. H., Haurwitz, R. E., & Doudna, J. A. (2012). Mechanism of substrate selection by a highly specific CRISPR endoribonuclease. RNA. https://doi.org/10.1261/rna.030882.111 Svoboda, P., & Di Cara, A. (2006). Hairpin RNA: A secondary structure of primary importance. Cellular and Molecular Life Sciences. https://doi.org/10.1007/s00018005-5558-5 Weber, E., Engler, C., Gruetzner, R., Werner, S., & Marillonnet, S. (2011). A modular cloning system for standardized assembly of multigene constructs. PLoS ONE. https://doi.org/10.1371/journal.pone.0016765 Westhof, E., & Auffinger, P. (2000). RNA Tertiary Structure. In Encyclopedia of Analytical Chemistry. https://doi.org/10.1002/9780470027318.a1428 Xayaphoummine, A., Bucher, T., & Isambert, H. (2005). Kinefold web server for RNA/DNA folding path and structure prediction including pseudoknots and knots. Nucleic Acids Research. https://doi.org/10.1093/nar/gki447 33 Zadeh, J. N., Steenberg, C. D., Bois, J. S., Wolfe, B. R., Pierce, M. B., Khan, A. R., … Pierce, N. A. (2011). NUPACK: Analysis and design of nucleic acid systems. Journal of Computational Chemistry. https://doi.org/10.1002/jcc.21596 7 Supplementary material A B C 34 D E F 35 G H Figure S1. Flow-cytometry results, from left to right, cell count, singlet count, iSpinach-fluorescence and mScarlet fluorescence. (A) M0 induced, (B) M0 uninduced, (C) M1 induced, (D) M1 uninduced, (E) M2 induced, (F) M2 uninduced, (G) M3 induced and (H) M3 uninduced. Figure S2. mScarlet-I-H fluorescence arbitrary values (A.U.) obtained in the flow cytometry. The results for the four different tile designs, and for each colony are displayed. For each, the blue bar on the left corresponds to the uninduced sample while the orange bar on the right corresponds to the sample induced with IPTG. 36 Figure S3. iSpinach-H fluorescence arbitrary values (A.U.) obtained in the flow cytometry. The results for the four different tile designs, and for each colony, are displayed. For each, the blue bar on the left corresponds to the uninduced sample while the orange bar on the right corresponds to the sample induced with IPTG. 37 38 39 40