SIMDEQ™: a novel approach to combined genetic & epigenetic analysis

Jimmy Ouellet1, Fatima Hamouri1, Laurène Giraut2, Gordon Hamilton1,2, Charles André2, Jean-François Allemand1, David Bensimon1, Vincent Croquette1

1 Laboratoire de Physique Statistique, ENS, 24 rue Lhomond, 75005 Paris; 2 PicoSeq SAS, 74 rue Lecourbe, 75015 Paris

ABSTRACT

PicoSeq is developing a novel platform for the analysis of DNA and RNA in collaboration with the lab of Vincent Croquette at the École Normale Supérieure in Paris. This platform, known as SIMDEQ™ (short for Single-Molecule Magnetic DEtection & Quantification), interrogates individual DNA/RNA molecules tethered to micron-sized paramagnetic beads. The DNA/RNA molecules are attached to the floor of a flow cell and manipulated in a magnetic field. This approach is simple and accurate, has the potential to be run at very high throughput, and can be used to map and ultimately to fully sequence DNA or RNA molecules. Because no sample amplification is needed, SIMDEQ can also directly detect base modifications such as 5-methylcytosine.

We have chosen the human FMR1 gene as a model system to demonstrate the capabilities of the SIMDEQ system. The FMR1 locus (associated with Fragile X Syndrome and several other diseases) is difficult to characterize at the molecular level because of its long GC-rich repeats, complex methylation patterns and mosaicism. We have developed specific analytical protocols to address the challenges presented by FMR1 and other similar loci.

Here we present data from our SIMDEQ platform that demonstrate robust and accurate genotyping and epityping on single molecules of DNA. Specifically, we share results from our analysis of the 5' UTR of the FMR1 gene (a challenging, highly GC-rich genomic region) and from the detection of 3-methylcytosine, a base modification that can currently be detected only by low-resolution immunoprecipitation.

INTRODUCTION

In contrast to most current NGS methods, our approach is not based on detecting the incorporation of fluorescent nucleotides but on tracking the length of DNA hairpins with single-base precision (see Figures 1A-1C below). These hairpins can contain fragments of interest varying in length from a few nucleotides to 20 kb or more.

[Photo: The SIMDEQ™ bench-top prototype]

EXPERIMENTAL PRINCIPLE

[Figure 1 panels: closed hairpin, open hairpin, and bead extension (nm) versus time traces; scale bar 40 nm]

Figure 1A. The DNA hairpin is the central structure underpinning the SIMDEQ approach, as it can be repeatedly opened and closed. DNA hairpins containing a region of interest are attached to the floor of a flow cell by one arm and to a paramagnetic bead by the other. Using a magnet that can be moved along the z-axis, a variable force can be applied to the beads. With the magnet in the upper position, a nominal force is exerted on each bead (left panel). Lowering the magnet increases the force pulling on the beads. When the applied force exceeds 10-15 pN, the DNA hairpins "unzip" (middle panel). When the magnet moves back to the upper position, the hairpins reform. This open-close process can be repeated many thousands of times. Importantly, the entire process can be monitored by tracking the z-position of the beads in real time. This is shown in the right panel: the bead begins in the closed position (extension = 0 nm), is subsequently opened (extension ~80 nm) and is then allowed to close again. The length of the open hairpin allows the total length of the DNA molecule to be determined (in this case about 80 nucleotides).

Figure 1B. Sequence information can be generated by blocking hairpin closure with hybridizing oligonucleotides. While hairpins are in their unzipped state, oligonucleotides are able to hybridize to their complementary sequences (as shown in orange in the left panel).
When the force on the beads is reduced, the hairpins start to reform, but bound oligonucleotides temporarily block the rezipping of the hairpin (center panel). These hybridization events are detectable as pauses in the movement of the bead as the hairpin goes from open to closed (right panel). The duration of these pauses depends largely on the length of the oligonucleotides. Typically we use oligonucleotides that bind for an average of 0.5-2 s. Our method for tracking bead z-position is very precise, allowing us to detect the binding position of each oligonucleotide with single-base precision. Thus for each pause we can determine both the underlying sequence and its position.

Characterization of FMR1 repeats: a challenging task is greatly simplified with SIMDEQ

Figure 1C. A wide range of base modifications can be detected by blocking hairpin closure with antibodies. As with oligonucleotides, antibodies specific to base modifications are able to bind to their antigens when hairpins are in their open state, and block hairpin rezipping. A wide range of antibodies (both monoclonal and polyclonal) against base modifications is commercially available. We have tested a range of these antibodies and have shown that we can accurately determine the presence and location of many different base modifications. Recording the binding kinetics of individual antibody binding events allows us to accurately discriminate real from false-positive binding events.

A wide range of base modifications can be detected with SIMDEQ

[Figure 2A and Figure 3 schematics. Figure 2A labels: FMR1 gene, conserved region, variable region, CGG repeat region, AGG repeats, Oligos 1-4. Figure 3 labels: biotin, Dig, 400 bp and 600 bp fragments, synthetic oligos, Me; modifications shown: 5-mC, 5-hmC, 5-caC, 6-mA, 3-mC]

Figure 3. (A) A series of synthetic hairpins were generated, each containing a specific epigenetic modification.
They were all constructed by ligating two fragments isolated from plasmid DNA (400 and 600 base pairs in length, each end having a unique overhang) together with two synthetic oligonucleotides, one of which contained the desired modification. (B) Chemical structures of the base modifications that can now be robustly detected using SIMDEQ. Below we provide data from the analysis of 3mC modifications.

Sequencing and base modification detection can be performed on the same single molecules of DNA: an example of 3mC

Figure 2. Measurement of the number of CGG repeats in the 5' UTR of FMR1, and determination of the presence of interspersed AGG repeats. (A) Schematic representation of the human FMR1 gene. Specific FMR1 gene-mapping oligos (numbered 1-4) are spaced along the length of the gene in conserved regions flanking the repeats. The distance between oligos in the conserved region is used as a reference to determine the number of bases located between the two oligos directly flanking the repeats. (B) Left panel: mapping data from two hairpins with 23 CGG repeats (top) and 29 CGG repeats (bottom). The region surrounding the repeats was amplified from gDNA obtained from a mix of normal individuals and cloned into bacteria. Two clones were selected and their repeats analyzed by Sanger sequencing (data not shown). Right panel: a repeat was sized as in the left panel with 4 mapping oligos (top) and subsequently probed with an oligo specific for interspersed AGG repeats. These mapping data were confirmed by Sanger sequencing (data not shown). (C) Analysis of a number of hairpins derived from bacterial colonies carrying 23 repeats (blue) and 29 repeats (red), sized as described in panel A. The distribution of measured repeat sizes is centered on the expected size of the repeat, but shows considerable variation (+/-3 repeats). This is consistent with other reports of instability of these repeats when cloned into E. coli (e.g. Ref 2).
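The sizing logic described for Figure 2A can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used for the poster: the function name, its parameters and all numbers in the usage note are hypothetical, and the non-repeat sequence lying between the two flanking oligos (flank_overhead_bp) is assumed to be known from the conserved reference sequence.

```python
def estimate_cgg_repeats(ref_a_nm, ref_b_nm, ref_distance_bp,
                         flank_left_nm, flank_right_nm, flank_overhead_bp):
    """Estimate a CGG repeat count from oligo blockage positions.

    All names and arguments are hypothetical, for illustration only.
    ref_a_nm, ref_b_nm: blockage positions (nm) of two oligos in the
        conserved region, whose separation in bases is known.
    ref_distance_bp: that known separation, in bases.
    flank_left_nm, flank_right_nm: blockage positions (nm) of the two
        oligos directly flanking the repeats.
    flank_overhead_bp: assumed number of non-repeat bases lying between
        the two flanking oligos.
    """
    # Scale factor from the reference pair in the conserved region.
    nm_per_bp = abs(ref_b_nm - ref_a_nm) / ref_distance_bp
    # Convert the measured span across the repeats into bases.
    span_bp = abs(flank_right_nm - flank_left_nm) / nm_per_bp
    # Remove the non-repeat bases, then divide by the 3-base repeat unit.
    repeat_bp = span_bp - flank_overhead_bp
    return round(repeat_bp / 3)
```

With made-up values, estimate_cgg_repeats(10.0, 110.0, 100, 150.0, 249.0, 30) gives a scale of 1 nm per base and a 99-base span, of which 69 bases are repeats, so the function returns 23. Rounding to the nearest whole repeat is a simplification; a real analysis would also need to carry the measurement uncertainty through this calculation.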
DISCUSSION

In our analysis of the FMR1 locus with SIMDEQ, we demonstrate that complex, GC-rich repeats can be analyzed with a single, rapid hybridization experiment. Although the results showed some heterogeneity, we are confident that this is due to the instability of these repeats when cloned into E. coli and not to our analytical approach. Future experiments will examine FMR1 molecules isolated directly from human DNA.

As well as performing high-resolution genotyping, the SIMDEQ platform allows users to map a very wide range of base modifications on the same unamplified DNA molecules, without the need for any conversion chemistries. Indeed, virtually any modified DNA structure can be investigated by generating a suitable binding molecule. Future work will focus on generating new binders for interesting targets such as additional methylated bases and various forms of DNA damage.

CONCLUSIONS

PicoSeq's SIMDEQ™ platform enables:
• Interrogation of DNA fragments ranging from a few base pairs to 20 kb or more
• Rapid and accurate analysis of repetitive, GC-rich regions, such as the FMR1 locus. This approach can be easily expanded to other repetitive loci
• The detection of a wide range of base modifications on unamplified genomic DNA

[Figure 4 panels: (A) extension (nm) versus time, with blockages labelled Oligo 1, Oligo 2, Oligo 3 and 3mC Ab; (B) cumulative binding versus extension (base pairs), with peaks O1, O2 and O3; calculated position of 3mC = 612 bp]

Figure 4. Detection of 3-methylcytosine (3mC) modifications using a commercial antibody. (A) A 1 kb hairpin containing a single 3-methylcytosine modification (produced as described in Figure 3A) was analyzed with a polyclonal antibody against 3-methylcytosine (Diagenode, Ref 3) and 3 reference oligos. Opening/closing cycles were performed (as described in Figure 1C) and the bead position graphs for all cycles were then superimposed.
Blocking positions for oligos are indicated by blue arrows, and the position of the antibody binding site is indicated by the orange arrow. Note that there are a few nonspecific binding events (infrequent and/or of short duration), which are probably due to antibodies in the polyclonal mix with poor specificity or affinity. These events can be easily filtered out of the analysis. (B) A histogram of the binding positions was produced from the opening/closing cycles of Figure 4A. Because the sequences of the oligonucleotides and their complementary sequences on the hairpin were known, the extension value in nm could be converted into base pairs using the oligos as reference points. The expected positions of the oligonucleotides are represented by the rectangular bars. Once aligned, it was possible to determine the position, in base pairs, of the blockage due to the antibody, which in this case is 612 bp (the expected position was 614 bp).

REFERENCES

1. Single-molecule mechanical identification and sequencing. (2012) Ding F, Manosas M, Spiering MM, Benkovic SJ, Bensimon D, Allemand JF, Croquette V. Nat Methods. Mar 11;9(4):367-72
2. Sequencing the un-sequenceable: expanded CGG-repeat alleles of the Fragile X gene. (2013) Loomis EW, Eid JS, Peluso P, Yin J, Hickey L, Rank D, McCalmon S, Hagerman RJ, Tassone F, Hagerman PJ. Genome Res. Jan;23(1):121-8
3. http://www.diagenode.com/media/catalog/file/Datasheet_3-mC_C15410209.pdf
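As an illustration of the conversion described for Figure 4B, the sketch below fits a straight line through the reference oligo positions and uses it to express a blockage position in base pairs. It is a minimal example under stated assumptions, not the analysis used for the poster: a linear relation between extension and base pairs is assumed, the function names are hypothetical, and the values in the usage note are invented for illustration and are not measured data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def nm_to_bp(extension_nm, slope, intercept):
    """Convert a measured extension (nm) to a position in base pairs."""
    return slope * extension_nm + intercept


# Hypothetical reference oligos: measured blockage positions (nm) and
# their known positions on the hairpin (bp). Not data from the poster.
oligo_nm = [50.0, 250.0, 450.0]
oligo_bp = [100, 300, 500]

slope, intercept = fit_line(oligo_nm, oligo_bp)

# Position of a hypothetical antibody blockage measured at 300 nm.
antibody_bp = nm_to_bp(300.0, slope, intercept)
```

With these invented values the fit gives a slope of 1 bp per nm and an intercept of 50 bp, so the blockage at 300 nm maps to 350 bp. Using three reference oligos rather than two means the fit averages out some of the error on each individual reference position.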