SUPPLEMENTARY INFORMATION

Choice of Cell Types Used in This Study.

To begin to understand transcription-associated looping interactions at the TAL1 locus, we studied three different cell types: (i) erythroid cells which express TAL1, (ii) lymphoid T-ALL cells which do not express TAL1, and (iii) lymphoid T-ALL cells which express TAL1. The human K562 cell line is a well-documented cell type which reflects the in vivo properties of the erythroid lineage, including GATA1 occupancy at target genes,1 and for which we have previously published data on the activity of the TAL1 +51 erythroid enhancer.2,3 Using this cell type also allowed us to address one of the key questions related to the complexity of the TAL1 locus: how does the TAL1 +51 erythroid enhancer communicate with its cognate promoters? Jurkat cells, which are T-ALL cells that express TAL1 (but which do not have a TAL1/STIL deletion), served as a well-characterized T-ALL cell type4 to understand looping interactions which may regulate inappropriate TAL1 expression during leukemogenesis. HPB-ALL cells, which are of T-ALL origin but which do not express TAL1, served as a “control” cell type against which we could determine transcription-associated looping interactions in TAL1 expressing cell types (either K562 or Jurkat). To validate that the looping interactions we identified in human cell types were also found in mouse, we chose an appropriate TAL1 expressing murine erythroid cell line (MEL) and a TAL1 non-expressing lymphoid cell line (BW5147). The regulatory features (histone modifications) of the Tal1 locus have been studied extensively in both of these cell types in our laboratory and showed similar characteristics to those of K562 and HPB-ALL.3 To confirm that the looping interactions we found in human and mouse cells reflected bona fide in vivo interactions in the erythroid and lymphoid lineages, we also examined looping in murine primary erythroblasts and lymphocytes. These two cell types were the obvious choices for these confirmatory studies.

SUPPLEMENTARY FIGURES AND TABLES

Supplementary Figure S1. Schematic diagrams of the organization of the human TAL1 regulon. A number of cis-regulatory elements have been characterized at the TAL1 locus. In addition to the TAL1 promoters 1a and 1b, these include the stem cell enhancer (designated +17/+18/+19 and +19/+20/+21 in mouse and human respectively, based on their distances in kb from promoter 1a), and the erythroid enhancer (+40 and +51 in mouse and human respectively). The stem cell and erythroid enhancers are believed to direct TAL1 expression in the hematopoietic stem cell5,6 and in the erythroid lineage7 respectively. More recent studies have demonstrated the presence of CTCF-bound elements at the TAL1 locus which display insulator enhancer-blocking or barrier activity in vitro or in vivo.2,8 Collectively, these regulatory elements span approximately 88 kb in human, a genomic region which contains the entire TAL1 “regulon” defined by CTCF-bound elements at both ends (at +57 and -31).2 In all panels, locations and directions of transcription of TAL1 and its flanking genes PDZK1IP1, STIL and CMPK1 are shown by the horizontal blue arrows. Locations of their cognate promoters are shown with vertical red arrows. Enhancers of the TAL1 gene studied here (the +19/+20/+21 stem cell enhancer, the +51 erythroid enhancer and the -10 enhancer) are shown with vertical green arrows. CTCF-bound elements studied here (+57, +53, +40 and -31) are shown with vertical blue arrows and labelled with CTCF in the blue hexagon. Scales (in kb) are shown.
(A) The extent of the predicted regulon (horizontal black line with arrowheads) is 88 kb and is defined by CTCF-bound elements (+57 and -31) at its boundaries.2 (B) The CTCF-bound element (+40) lies between the TAL1 stem cell enhancer and the PDZK1IP1 promoter and thus prevents communication between these two elements (denoted by X) and impairs transcription (denoted by X). (C) The CTCF-bound element (+40) lies between the TAL1 erythroid enhancer and the TAL1 promoters and thus prevents communication between these elements (denoted by X) and impairs transcription (denoted by X). (D) Transcription of the STIL gene is impaired (denoted by X) by the presence of the CTCF-bound element (-31) within its transcribed region. This is due to the inability of RNA polymerase II (Pol II) (brown oval) to transcribe through the region occupied by CTCF (denoted by X). In the scenarios shown in panels B, C and D, the impediments to transcription are compatible with known roles of CTCF-bound insulator elements in preventing communication between regulatory elements by altering chromatin loop formation [loop domain model9] or by interfering with the movement of Pol II [tracking model10].

Supplementary Figure S2. mRNA expression levels of TAL1 and its flanking genes. (A) Bar diagram showing mRNA expression levels (log2) of human PDZK1IP1, TAL1, STIL and CMPK1 in K562 and HPB-ALL cell lines. (B) Bar diagram showing mRNA expression levels (log2) of murine Pdzk1ip1, Tal1 and Stil in MEL and BW5147 cell lines. All data are shown with standard error measurements. Expression levels were determined relative to the housekeeping gene ACTB, the mRNA value of which was set at log2 = 16.6 (panel A) and log2 = 13.3 (panel B) (not shown in the bar diagrams).

Supplementary Figure S3. Schematic flow diagram of the 3C procedure. 3C uses formaldehyde cross-linking (A) to covalently fix interacting chromatin segments to proteins at their sites of occupancy within nuclei of living cells.
Subsequently, the crosslinked chromatin is digested using an appropriate restriction endonuclease (B), followed by intramolecular ligation of cross-linked chromatin fragments (C). The resulting 3C library contains a large number of ligation products, a proportion of which reflect interactions between non-adjacent genomic regions which lie in close proximity within the nucleus. These interactions can be detected by PCR using oligonucleotide primer pairs (one primer originating from each of the regions which are to be tested) (D). The interaction frequency of two genomic regions is represented by the abundance of the corresponding ligation product amplified by PCR and quantified by gel electrophoresis (E). The 3C libraries prepared for this study used a 4-bp cutting restriction endonuclease (Csp6I) which generates restriction fragments of 600 bp, on average, across both the human and mouse TAL1 loci, thus allowing us to detect differences in ligation frequencies at sufficient resolution to distinguish between regulatory elements. Further details of library preparation can be found in Methods.

Supplementary Figure S4. Looping interactions involving Tal1 promoters and enhancers in murine erythroid and lymphoid cell lines. (A) Schematic organization of the murine Tal1 locus. Locations and directions of transcription of Tal1 and its flanking genes Pdzk1ip1, Stil and Cmpk1 are shown (horizontal blue arrows). Locations of promoters (vertical red arrows) and enhancers of the Tal1 gene studied here (vertical green arrows) are shown. The erythroid and the stem cell enhancers are highlighted. Elements are named according to their distance (in kb) from Tal1 promoter 1a. Looping interactions tested in this study are denoted by dotted grey lines with arrowheads. (B) Bar diagram of interaction patterns across the murine Tal1 locus in erythroid (MEL) and lymphoid (BW5147) cell lines determined by 3C.
Interactions, measured as relative ligation frequencies (black bars) at various locations across the locus, are shown with standard errors. Location of 3C “bait” region (Tal1 promoter 1b = PTal1) is shown (vertical red arrows). p values are indicated for relative ligation frequencies which are significantly higher for test regions when compared to those of control regions (controls defined as regions located between the “bait” and test regions). Scales (in kb) are shown at the bottom of panel B. (C) Comparison of interaction patterns at the Tal1 locus in MEL (Tal1 expressing) and BW5147 (Tal1 non-expressing) cells normalized against ERCC3 ligation frequencies. p values are indicated for interactions which are significantly higher in MEL cells. p < 0.01 = **; p < 0.001 = ***; p < 0.0001 = ****; p < 0.00001 = *****.

Supplementary Figure S5. Assessment of 3C library quality in cell types used in this study. Bar diagrams show the interaction frequencies between two non co-linear Csp6I fragments at the ERCC3 locus determined by 3C. (A) Human K562, HPB-ALL and Jurkat cell lines. (B) Murine primary erythrocytes and lymphocytes. (C) Murine MEL and BW5147 cell lines. (D) K562 cells 48 hr after transfection with siRNA for luciferase (LUC) or GATA1 (KDGATA1). (E) K562 cells 96 hr after transfection with siRNA for luciferase (LUC) or GATA1 (KDGATA1). Interaction levels (grey bars), measured as relative ligation frequencies (black bars), are shown with standard errors and represent the mean from 2 bioreplicate samples. These ERCC3 data were used to normalize relative ligation frequencies for comparisons between cell types. Note: restriction enzyme digestion frequencies for these libraries varied between 73 and 92% (not shown here). These were measured by assessing digestion efficiency at a single Csp6I site at the TAL1 locus.11

Supplementary Figure S6. Effect of GATA1 siRNA on K562 cell growth, morphology and apoptosis.
For all analyses described below, four conditions were analyzed at both the 48 and 96 hr time points. Two conditions were used as negative controls: wild type K562 cells (WT) and K562 cells electroporated with water only (EP). The two test conditions were K562 cells electroporated with either luciferase siRNA (LUC) or GATA1 siRNA (GATA1). (A) Growth curve of viable cell numbers relative to input (0 hr) at both 48 and 96 hr. Three biological replicates were analyzed for each condition and viable cell counts assayed using a haemocytometer. (B) Bar diagram showing percentages of cells having more than one nucleus per cell. (C) Bar diagram showing percentages of cells having irregular bulges (blebs) in the plasma membrane. (D) Bar diagram showing percentages of cells showing immunofluorescence for annexin V (marker of apoptosis). For the analyses in panels B to D, three bioreplicates of 100 cells for each condition were examined by microscopy and scored for nuclear content, blebs and annexin V staining. All data are shown with standard error measurements. Significant differences in cellular structure, function and viability were only identified between wild type K562 cells and any of the conditions subjected to electroporation (electroporation with water, electroporation with luciferase siRNA and electroporation with GATA1 siRNA) at either the 48 hr or 96 hr time point. This confirmed that electroporation per se had the most significant detrimental effect on K562 cells. Introduction of GATA1 siRNA into K562 cells had, however, no significant detrimental effects on cellular functions and viability above levels detected in the electroporation with water control.

Supplementary Figure S7. ENCODE ChIP-seq data across the human TAL1 locus.
Publicly available ENCODE12 ChIP-seq data for TAL1 (Snyder, Stanford; wgEncodeEH001824), GATA1 (Farnham, USC; wgEncodeEH000638), GATA2 (Farnham, USC; wgEncodeEH000683), CTCF (Snyder, Stanford; wgEncodeEH002797), RAD21 (Snyder, Stanford; wgEncodeEH000649), SMC3 (Snyder, Stanford; wgEncodeEH00184), and CTCFL (Myers, HudsonAlpha; wgEncodeEH001652) were visualized across the human TAL1 locus using the UCSC genome browser (http://genome.ucsc.edu/). Genes and their exon-intron structures are shown at the top of the figure. TAL1 regulatory elements are annotated at the bottom of the figure. Two CTCF peaks (57-1 and 57-2) are shown for the CTCF-binding element at +57 (see Supplementary Figure S11). Scale is shown and coordinates are for chromosome 1 (hg19). Note: we examined all publicly available ENCODE datasets for the K562 cell line to identify other features which may be unique to -31 and aid in our understanding of its function. The -31 element was the only insulator at the TAL1 locus which showed occupancy of the CTCF paralogue, CTCFL, in K562 cells. Furthermore, whilst -31 appeared to bind CTCF and RAD21 in a GATA1-dependent manner, GATA1 was not directly bound to it in K562 (see manuscript text), nor did it have conserved GATA1 motifs.

Supplementary Figure S8. Schematic models showing all possible looping configurations involving CTCF and RAD21 occupied elements at the TAL1 locus in erythroid cells. (A) Interactions between the +40 and -31 elements result in the TAL1 promoters (red box) being placed in a loop containing the +19/+20/+21 stem cell enhancer (green box). However, the +51 erythroid enhancer (green box) is not contained within this loop as it lies distal to the +40 element. (B) Interactions between the +57 (or +53) and +40 elements place the erythroid enhancer in a chromatin loop. However, the TAL1 promoters and the stem cell enhancer are not within this loop.
(C) A composite looping pattern containing the loops from A and B places the TAL1 promoters and the stem cell enhancer in a separate loop from that containing the erythroid enhancer. (D) Interactions between the +57 (or +53) and -31 elements place the TAL1 promoters and the stem cell and erythroid enhancers in the same chromatin loop, thus facilitating their communication through direct contact (green line with arrows). CTCF and RAD21 occupancy is shown in the colour key.

Supplementary Figure S9. Looping interactions of CTCF/RAD21-bound elements at the TAL1 locus. Bar diagrams of looping interactions involving the +53, +40 and -31 elements determined by 3C in K562 and HPB-ALL cell lines. (A) Interactions between +53 (bait) and +40 and -31. (B) Interactions between -31 (bait) and +53 and +40. (C) Interactions between +40 (bait) and +53 and -31. Interaction frequencies (black bars), as measured by relative ligation frequencies, are shown with standard errors and normalized relative to BAC controls. Locations of 3C “bait” regions are denoted by vertical red arrows. Locations of genes at the TAL1 locus and their directions of transcription are shown at the top of the figure. p values are indicated for interaction frequencies which are significantly higher for test regions when compared to those of control regions. Scales (in kb) are shown at the bottom of the figure. p < 0.0001 = ****.
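The looping configurations described for Supplementary Figure S8 can be restated as a simple interval check. The sketch below is illustrative only and is not part of the analysis in this study: it treats each element as a single coordinate (in kb from TAL1 promoter 1a, taken from the element names), uses +20 as a stand-in for the +19/+20/+21 stem cell enhancer, and asks which elements lie between two CTCF/RAD21 anchors. The function and variable names are hypothetical.

```python
# Element positions in kb relative to TAL1 promoter 1a (from the element names).
# Assumption: the +19/+20/+21 stem cell enhancer is represented by +20.
ELEMENTS = {
    "TAL1 promoters": 0,
    "stem cell enhancer": 20,
    "erythroid enhancer": 51,
}


def enclosed(anchor_a, anchor_b, elements=ELEMENTS):
    """Return the names of elements lying strictly between two loop anchors (kb)."""
    lo, hi = sorted((anchor_a, anchor_b))
    return sorted(name for name, pos in elements.items() if lo < pos < hi)


# Panel A: +40/-31 loop; panel B: +57/+40 loop; panel D: +57/-31 loop.
# Panel C is the combination of the loops in panels A and B.
for panel, (a, b) in {"A": (40, -31), "B": (57, 40), "D": (57, -31)}.items():
    print(panel, enclosed(a, b))

# Regulon length implied by the outermost anchors (+57 and -31), in kb.
print(57 - (-31))
```

Under these assumptions, only the +57/-31 configuration (panel D) returns the promoters and both enhancers together, and the outermost anchors give the 88 kb regulon length quoted for Supplementary Figure S1.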
Whilst +53 showed an elevated interaction frequency with +40 in K562 (Supplementary Figure S9A), we did not consider this interaction to be biologically relevant because: (i) the +46 control region located between +53 and +40 showed even higher ligation frequencies, suggesting that random ligation events accounted for the data between +53 and +40; (ii) the interaction could not be validated when +40 was used as a bait in 3C (Supplementary Figure S9C); and (iii) there was no significant difference in the levels of this interaction between K562 and HPB-ALL cells which would suggest transcription dependence (Supplementary Figure S10).

Supplementary Figure S10. Comparisons of looping interactions between CTCF/RAD21-bound elements in K562 and HPB-ALL cells normalized against ERCC3 ligation frequencies. (A) Bar diagram showing interaction frequencies between +53 (bait), +40 and -31. (B) Bar diagram showing interaction frequencies between -31 (bait), +53 and +40. (C) Bar diagram showing interaction frequencies between +40 (bait), +53 and -31. Locations of 3C “bait” regions are denoted by vertical red arrows. p values are indicated for interactions which are significantly higher in K562 (TAL1 expressing) cells. p < 0.0001 = ****; p < 0.00001 = *****.

Supplementary Figure S11. Evolutionary conservation of CTCF motifs at CTCF and RAD21 bound elements at the TAL1 locus. (A) Schematic diagram of the TAL1 locus. Locations and directions of transcription of TAL1 and its flanking genes PDZK1IP1, STIL and CMPK1 are shown by the horizontal blue arrows. Locations of their cognate promoters and the TAL1 enhancers (+51, +19/+20/+21, -10) are shown with vertical red or green arrows respectively. CTCF and RAD21 bound elements studied here (+57, +53, +40 and -31) are shown with vertical blue arrows. Scale (in kb) is shown. (B) CTCF motifs at +57, +53, +40 and -31 are shown.
Two CTCF motifs (italics) were identified at the +57 element, while a single motif was identified at each of +53, +40 and -31. (C) The ENCODE project12 data tracks showing ChIP-seq data for CTCF (wgEncodeEH002797) and RAD21 (wgEncodeEH000649) at +57, +53, +40 and -31 were obtained from public ENCODE data released from the Snyder (Stanford) laboratory. Scales and genome co-ordinates for human chromosome 1 in bp (hg19) are also shown. CTCF motifs at these elements align to peaks of CTCF binding and show strong similarity to the canonical CTCF motif13 (shown at the bottom of each panel). CTCF motifs at these elements are conserved across species at the DNA sequence level. Sequence conservation across five species (human, mouse, rat, dog and chicken) is shown for each CTCF motif. When compared to the motifs at +57, +53, and +40, the CTCF motif at -31 was the most highly conserved through evolution. (D) Composite showing alignment of all five CTCF motifs found at the +57, +53, +40 and -31 elements. The consensus 20 bp CTCF motif13 is also shown. Sequence differences between the CTCF motif at -31 and those at +57, +53 and +40 are highlighted with the red arrows.

Supplementary Figure S12. Occupancies of CTCF and RAD21 at insulator elements at the TAL1 locus. (A) Occupancies for CTCF and RAD21 at the +57, +40 and -31 elements in K562 cells. (B) Occupancies for CTCF and RAD21 at the +57, +40 and -31 elements in HPB-ALL cells. ChIP enrichments (log2) are shown with standard errors. Annotation of test and negative control regions is denoted in black and grey text respectively. Positive control is a CTCF/RAD21-bound element at the HNF4A locus. The green arrow highlights the lower levels of RAD21 at the -31 element in HPB-ALL cells. In HPB-ALL, the -31 element does not participate in looping with other CTCF/RAD21-bound elements at the TAL1 locus (see Figure 4 and Supplementary Figures S9 and S10).

Supplementary Figure S13.
Schematic flow diagram of the 4C-microarray procedure. 4C uses formaldehyde cross-linking (A) to covalently fix interacting chromatin segments to proteins at their sites of occupancy within nuclei of living cells. Subsequently, the crosslinked chromatin is digested using an appropriate restriction endonuclease (B), followed by intramolecular ligation of cross-linked chromatin fragments (C). The resulting 3C library contains a large number of ligation products, a proportion of which reflect interactions between non co-linear genomic regions which lie in close proximity within the nucleus; these include products containing the “bait” (i.e., the region of interest) ligated to a range of interacting DNA “prey” fragments. Steps A to C are also used in 3C (see Supplementary Figure S3). The 3C library is then subjected to sonication (D), which reduces the average size of ligated fragments, thus avoiding incomplete primer extension in the following step. Primer extension (E) is with a 5’-biotinylated primer complementary to the “bait” sequence. The fragments containing the “bait” sequence after primer extension are isolated from the pool of 3C DNA using streptavidin beads, followed by blunt-ending and blunt adapter ligation (F). Fragments containing the “bait” sequence are then amplified by PCR using a nested primer complementary to the “bait” in combination with a nested adapter primer. The resultant products are the “bait”-specific 4C library (G). The 4C DNA is then fluorescently labelled and hybridised onto a TAL1 genomic tiling path microarray2 in a competitive hybridization with fluorescently labelled total genomic DNA from the cell type in question (H). Array information is obtained and quantified as previously described.2

Supplementary Figure S14. 4C interaction patterns obtained across the TAL1 locus using TAL1 promoter 1b as the “bait”. (A) K562. (B) HPB-ALL.
Y axes in A and B show the frequencies of interactions expressed as a proportion of the “bait” signals for each microarray tile. The x axis shows the location of each microarray tile across the TAL1 locus and its flanking genes. (C) Organization of the human TAL1 locus. The locations of Csp6I sites are shown by black bars. The scale is genome co-ordinates (bp) for human chromosome 1 (hg17). Gene names are annotated in black. Exon-intron structures of genes are shown as joined blue bars. Directions of transcription of genes are shown as black arrows. The locations of all promoter, enhancer and CTCF/RAD21 elements at the TAL1 locus previously described2 are shown at the bottom of panel C. The location of the TAL1 promoter 1b “bait” is shown by the red line.

Supplementary Figure S15. Looping interactions at the TAL1 locus relevant to T-ALL biology. (A) Bar diagrams of interaction patterns between the TAL1 promoters and the -81/TALd breakpoint region (intron 1 of STIL) in K562, HPB-ALL and Jurkat cells determined by 3C. (B) Bar diagrams of interaction patterns across the TAL1 locus in Jurkat cells determined by 3C. In both A and B, interactions, measured as relative ligation frequencies (black bars), are shown with standard errors. Location of 3C “bait” region (TAL1 promoter 1b = PTAL1) is shown in each panel (vertical red arrows). p values are indicated for relative ligation frequencies which are significantly higher for test regions when compared to those of control regions (controls defined as regions located between the “bait” and test regions). Scales (in kb) are shown at the bottom of the panels. (C) Comparison of interaction patterns at the TAL1 locus in K562 and Jurkat cells normalized against ERCC3 ligation frequencies. The location of the Jurkat -7 enhancer4 approx. 500 bp downstream of -8 is shown. p values are indicated for interactions which are significantly higher in Jurkat cells. p < 0.01 = **; p < 0.001 = ***; p < 0.0001 = ****.

Supplementary Figure S16.
Spatial interactions between the TAL1 active hub and deletion breakpoints found in T-ALL. (A) Schematic diagram shows the location of a common breakpoint in intron 1 of the STIL gene (TALd) which becomes juxtaposed to one of four sites in the 5’-proximal portion of the TAL1 gene in T-ALL patients (TALd1 to TALd4).14 Breakpoints are shown as the red arrowheads. Black bar is the genomic region spanning the TAL1 and STIL genes while the dotted region represents the approximate size of T-ALL STIL/TAL1 deletions. The TALd1 breakpoint occurs close to the 5’ boundary of TAL1 promoter 1b. The schematic organization of the TAL1 gene is also shown. TAL1 exons lying adjacent to all three TAL1 promoters are shown as green boxes; other TAL1 exons are shown as red boxes. The STIL promoter (PSTIL) is also shown. (B) Looping interactions between TAL1 promoter 1b and intron 1 of the STIL gene occur at the sites of deletion breakpoints in T-ALL. The schematic diagram shows the organization of the TAL1 and STIL genes (scale and co-ordinates according to hg17) and the genomic location of the two microarray tiles which detected signals for the 4C “bait” sequence (containing TAL1 promoter 1b; denoted as Tile TAL1 P1B) and its interacting “prey” sequence within intron 1 of the STIL gene (denoted as Tile STIL +1). The location of each of the two T-ALL breakpoints (TALd1 and TALd) within each tile is also shown. The length of each microarray tile (in DNA bp) is also shown.

Supplementary Figure S17. Cis-regulatory remodelling of vertebrate TAL1 loci during evolution. The left of the schematic shows the evolutionary tree and divergence of TAL1 across more than 360 million years of vertebrate evolution. The right of the schematic shows the organization of the TAL1 (green) and PDZK1IP1 (black) genes with respect to the TAL1 promoters and the +51 erythroid and +19/+20/+21 stem cell enhancers.
Ets, GATA and E-box DNA sequence motifs which are evolutionarily conserved within these regulatory elements are also shown. The “switch”15 in the stem cell enhancer motif from a GATA/E-box (frog and chicken) to a GATA/Ets box (mammals) is shown. While the protein-coding content of the TAL1 regulon has remained unchanged throughout 360 million years of vertebrate evolution,15 the organization of its cis-regulatory circuitry has shown evidence of evolutionary remodelling. Despite this, vertebrate patterns of TAL1 expression have remained highly conserved,16-21 suggesting that mechanisms which circumvent remodelling may allow TAL1 function to be preserved across species. The TAL1 hubs we describe here may provide answers to this question. Given that loss of a single GATA factor is sufficient to abrogate chromatin looping and disassemble the TAL1 active hub, all that may be required for hub formation are GATA factors bound at evolutionarily conserved GATA motifs. Such motifs are present at TAL1 promoters and enhancers throughout evolution. Thus, alterations in the composition of other transcription factor motifs (e.g. Ets or E-box motifs) at TAL1 cis-regulatory elements through vertebrate evolution15 may not be problematic for hub assembly and co-ordination of TAL1 transcription.

Supplementary Figure S18. Models of STIL, CMPK1 and PDZK1IP1 transcription dependence on the TAL1 active hub. (A) Linear schematic diagram showing the organization of the human TAL1 locus. Details are as described in Figure 1. (B) The recruitment model. The proximity of the STIL, CMPK1 and PDZK1IP1 promoters to the TAL1 active hub favours the recruitment of Pol II (shown in blue) and other factors to their respective promoters (shown by blue arrows connecting the hub to the promoters) in a hub-dependent step (i). Transcription can then occur from these promoters in a hub-independent manner (ii).
In this model, loss of this proximity between the TAL1 promoters and the flanking genes would result in a decrease of recruitment of these factors to the relevant promoters (as we observed for STIL). However, our Pol II occupancy data do not support this model for either PDZK1IP1 or CMPK1. (C) Direct interaction model. The promoters of STIL, CMPK1 and PDZK1IP1 engage directly with the Pol II machinery within the TAL1 active hub, and transcription is entirely hub-dependent at all stages. Transcription is facilitated by the movement of chromatin through the hub, with loops becoming larger or smaller accordingly (steps i to iv shown in this figure with respect to STIL transcription; however, the same could apply for CMPK1 and PDZK1IP1 depending on the direction in which DNA within the loops traverses the hub). This model is compatible with the data that both PDZK1IP1 and STIL show contact with the hub at various points within their gene bodies. For both models presented in B and C, the production of a full-length STIL mRNA would require the transient removal of CTCF and RAD21 from the -31 element [B(ii) and C(iv)]. Consistent with this, we demonstrated that CTCF and RAD21 binding at this element is dynamic (Figure 5). Locations of promoters, enhancers, CTCF/RAD21 elements, direction of transcription of relevant genes (grey arrows), GATA1, TEC and Pol II recruitment, and CTCF/RAD21 occupancies are also shown as in Figures 3, 5 and 7. Note: for simplicity of the models shown in this figure, the TAL1 -10 enhancer has not been shown to be in contact with the hub. This interaction, however, is shown in the erythroid model presented in Figure 7.
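The significance notation used in the figure legends above (Supplementary Figures S4, S9, S10 and S15) maps p value thresholds to asterisks. A minimal sketch of that mapping follows, together with one plausible way of normalizing a relative ligation frequency against the ERCC3 control. The simple ratio is an assumption made for illustration, as the exact normalization formula is not given in this supplement, and all function names are hypothetical.

```python
def stars(p):
    """Map a p value to the asterisk notation used in the figure legends."""
    # Thresholds as listed in the legends, from most to least stringent.
    thresholds = [(1e-5, "*****"), (1e-4, "****"), (1e-3, "***"), (1e-2, "**")]
    for cutoff, label in thresholds:
        if p < cutoff:
            return label
    return ""  # the legends define no asterisk notation for p >= 0.01


def normalize(ligation_freq, ercc3_freq):
    """Assumed normalization: test ligation frequency divided by the ERCC3 control."""
    if ercc3_freq <= 0:
        raise ValueError("ERCC3 control frequency must be positive")
    return ligation_freq / ercc3_freq


# Made-up p values, used only to exercise the mapping.
for p in (0.05, 0.005, 0.0005, 0.00005, 0.000005):
    print(p, repr(stars(p)))
```

Expressing each library relative to its own ERCC3 control is what would allow ligation frequencies from different cell types to be compared on a common scale, if the ratio assumption holds.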
GENE      FORWARD 5' → 3'        REVERSE 5' → 3'
ACTB      AGAAGGAGATCACTGCCCTGG  CACATCTGCTGGAAGGTGGAC
TUBB      GCAGATGCTTAACGTGCAGA   CAATGAAGGTGACTGCCATC
GAPDH     AGGTCCACCACTGACACGTTG  AGCTGAACGGGAAGCTCACT
TAL1      TTTTGTGAAGACGGCACGG    TGAGAGCTGACAACCCCAGG
PDZK1IP1  TTGCAATCGCCTTTGCAGTC   TCCATCTGCCTTGTTTCCGA
STIL      ATGCACATAACGTGGATCACG  TCCATGCTCAAATCCACACC
CMPK1     TCTCATGAAGCCGCTGGT     TCCTGCAGAAAGGTGTGTGT
GATA1     CAAGCTACACCAGGTGAACCG  AGCTGGTCCTTCGGCTGC
LDB1      CCAGCTAGCACCTTCGCC     GTCGTCAATGCCGTTGGC
TCF3      AGGTGCTGTCCCTGGAGGAG   CCGACTTGAGGTGCATCTGG
GATA2     ATCAAGCCCAAGCGAAGACT   CATGGTCAGTGGCCTGTTAAC

Supplementary Table S1. Oligonucleotide primer pairs used to determine the expression levels of gene transcripts using SYBR Green-based quantitative PCR. First column shows the gene name. Second and third columns show the DNA sequences for the forward and reverse primers respectively.

Protein     Epitope          Source                    Catalogue No.
TCF3 (E47)  E47 (N-649)      Santa Cruz Biotechnology  sc-763
LDB1        CLIM-2 (N-18)    Santa Cruz Biotechnology  sc-11198
GATA1       GATA1 (M-20)     Santa Cruz Biotechnology  sc-1234
CTCF        CTCF (C-20)      Santa Cruz Biotechnology  sc-15914
RAD21       Anti-RAD21       Abcam                     ab992
RNA pol II  Anti-RNA pol II  Abcam                     ab5408

Supplementary Table S2. Antibodies used for chromatin immunoprecipitation (ChIP). First column shows the protein and isoform name. Second column is the epitope to which the antibody is raised. Third and fourth columns are the commercial source and catalogue number for each antibody respectively.

GENE   ASSAY      FEATURE                    FORWARD 5' → 3'            REVERSE 5' → 3'
TAL1   TAL1 +137  neg. control               TTTGCAGTGCCCTGTTCTTAG      TGTTGGCTACCTTGATCATGTG
TAL1   TAL1 +57   insulator                  CTGCAATATCTCGAGCAGCCAC     GAACAACACGGGCATGGAGATG
TAL1   TAL1 +51   erythroid enhancer         TGACCTTACAGCCCTTCACCC      AGCTCCCTGCTCCCAGCAC
TAL1   TAL1 +40   insulator                  GTCAATGTCCACCGTCCCTTTC     GGAGCCAGTTTGCTGCTGAAG
TAL1   TAL1 +32   neg. control               GGATTGAGGAGAGGGCATGTG      GCACGGCTGTGGAGCTATG
TAL1   TAL1 +20   stem cell enhancer         TTCGAACGGATCACATCCTG       TTGGTCCGAGCTCTGCCTC
TAL1   PTAL1      promoter 1a                CGCCGCAGAGATAAGGCACT       CCCACTCCCTCCGGTGAAAT
TAL1   TAL1 -28   neg. control               TGTCACGCAGGATATAGTGGCA     TTAGGAGGCTGAAGTAGGAGGAC
TAL1   TAL1 -30   neg. control               GTGCCCTTGAGAGCCTAGGG       CCTCAACAGCCTGTCTTATAATTG
TAL1   TAL1 -31   insulator                  CAACCAGGTGCTGCTTGAGTC      GAGAAGAGCTGCTGGGAAGG
TAL1   TAL1 -35   neg. control               TGGTAACCTGGGAACAAGGTGT     ACTGGCTCCTTCTCATCATTCAGG
TAL1   TAL1 -37   neg. control               CCACTGTGCCCAGCCTATTT       GTGAGCCAAGACAGTGCCATT
TAL1   TAL1 -94   neg. control               CAGGGTATATCTATGTTCCTAGCAC  GATTGATGAATGGTGACAAAGC
CMPK1  PCMPK1     promoter                   GCGCAGAGGTTAGCGTGTC        GCCTCTAACCCAAATCCGC
STIL   PSTIL      promoter                   GCTCCTACCCTGCAAACAGAC      GGAAACCAGGAGCACAAAGC
TBP    PTBP       promoter (pos. control)    GACCTATGCTCACACTTCTCATGG   CGTTGATAATGTCACTTCCGCCAG
HNF4A  HNF4A      CTCF/Rad21 (pos. control)  GATTATCACACCTTGAGGGTAGGG   ACTGTCCTGTACATTGTCCCTG

Supplementary Table S3. Oligonucleotide primer pairs used to determine chromatin immunoprecipitation (ChIP) enrichment levels across the TAL1 locus. First column shows the gene name. Second column shows the region assayed relative to the gene locus (numerical designations refer to distance in kb from the relevant gene promoter; - = upstream from promoter, + = downstream from promoter). Third column describes the function of the element assayed. Fourth and fifth columns show the DNA sequences for the forward and reverse primers respectively.
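Oligonucleotides such as those tabulated in this supplement can be sanity-checked with a few lines of code. The sketch below is illustrative and not part of the methods of this study: it reports length and GC content for the TAL1 +51 erythroid enhancer ChIP primer pair (Supplementary Table S3), and checks strand complementarity for the two 4C adapter oligonucleotides listed in Supplementary Table S5. The function names are hypothetical, and the 5' modifications (biot, p) are omitted from the sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_content(seq):
    """Return the fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# TAL1 +51 erythroid enhancer ChIP primer pair (Supplementary Table S3).
forward = "TGACCTTACAGCCCTTCACCC"
reverse = "AGCTCCCTGCTCCCAGCAC"
for name, seq in (("forward", forward), ("reverse", reverse)):
    print(name, len(seq), round(gc_content(seq), 2))

# 4C adapter strands (Supplementary Table S5), modifications omitted.
adapter_fwd = "ACAGGTTCAGAGTTCTACAGTCCGAC"
adapter_rev = "GTCGGACTGTAGAACTCTGAAC"
# The reverse strand pairs with the 3' portion of the forward strand,
# leaving the first four bases of the forward strand (ACAG) unpaired.
print(reverse_complement(adapter_rev) == adapter_fwd[4:])
```

With the sequences as tabulated, the final comparison evaluates to True, which is consistent with the two adapter strands annealing to form a duplex with one blunt end.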
BAIT P PREY TAL1 SPECIES BAIT PRIMER 5' → 3' human CTCTGTGTCCGAGTGTGGTG PREY PRIMER 5' → 3' TAL1 +64 TCTTCCTAGCCTCGATGGTC TAL 1 +51 CGCAGAAAAGCAAGGATAGG TAL 1 +46 GTGAGAACCAGGACCCAGAA TAL 1 +19 CCCACAATGGAGAGGATGAC TAL 1 +15 AGCCTGAGTGCTACAAAGGT TAL1 -8 GCGTGAAAGTCAACCATGTG TAL1 -10 CCTGAACCAGGAGTTTGTCAC TAL1 -25 TGGCAAGTAGGCTGGAACTT TAL1 -31 GTTACTGGCACCCCCTGTT TAL1 -41 AGTGGAAGAGCCTCCCTTTG TAL1 -72 GGTGATCCACCTGCCTCAT TAL1 -81 ATGCTCGCTCTTGCATTCCT TAL1 -85 TGCAAAGGCCCTGAGTTACA TAL1 +57 human GGCAACCATGGGTCTAAAGCAT TAL1 +46 GTGAGAACCAGGACCCAGAA TAL1 +40 GAAACCTGGGAGTCACCTGAA TAL1 +30 TTACAGACGCATGCCACCTC TAL1 -25 TGGCAAGTAGGCTGGAACTT TAL1 -31 GTTACTGGCACCCCCTGTT TAL1 -41 AGTGGAAGAGCCTCCCTTTG TAL1 +53 human TGGGAAGAAATGGCATCTACGC TAL1 +46 GTGAGAACCAGGACCCAGAA TAL1 +40 GAAACCTGGGAGTCACCTGAA TAL1 +30 TTACAGACGCATGCCACCTC TAL1 -25 TGGCAAGTAGGCTGGAACTT TAL1 -31 GTTACTGGCACCCCCTGTT TAL1 -41 AGTGGAAGAGCCTCCCTTTG TAL1 +40 human TAL1 +64 TCTTCCTAGCCTCGATGGTC TAL1 +53 TGGGAAGAAATGGCATCTACGC TAL1 +46 GTGAGAACCAGGACCCAGAA TAL1 -25 TGGCAAGTAGGCTGGAACTT TAL1 -31 GTTACTGGCACCCCCTGTT TAL1 -41 AGTGGAAGAGCCTCCCTTTG TAL1 -31 human TCTTCCTAGCCTCGATGGTC TAL1 +53 TGGGAAGAAATGGCATCTACGC TAL1 +46 GTGAGAACCAGGACCCAGAA TAL1 +40 GAAACCTGGGAGTCACCTGAA TAL1 +30 TTACAGACGCATGCCACCTC human CGCAGAAAAGCAAGGATAGG TAL1 +30 TTACAGACGCATGCCACCTC TAL1 +19 CCCACAATGGAGAGGATGAC TAL1 +15 AGCCTGAGTGCTACAAAGGT ERCC3 TAL1 Ercc3 GTTACTGGCACCCCCTGTT TAL1 +64 TAL1 +51 P GAAACCTGGGAGTCACCTGAA human CCCTGGACATGTCGGAAA mouse TGCCCCTTAAGCTTGGTTTC AGGGGTTTGCTCTTTGAGGT TAL1 +55 TGGGAACAGATTGTGGGACT TAL1 +40 TGCTGGCTTCCTCTCTTTTC TAL1 +30 AAAAGCCTCTCCCTCTCCAG TAL1 +18 CCTAGATGAGGGGTGAGAGC TAL1 +15 AGCCTTTCCCCTTGATGTTC TAL1 -5 CGACCTTCCCTACGTCTTTG TAL1 -9 GAGAACAGATGGGCTTGGTC mouse AACGGACAGCTTTAGGCAGA TGGCTGTAGTTGTGCCTTCTC Supplementary Table S4. Oligonucleotide primer pairs used for 3C analysis in human and mouse cells. First column shows the 3C “bait” region and gene locus from which it is derived. 
Naming system is as per Supplementary Table S3. Second column shows the “prey” region used in 3C primer combinations with the “bait”. Third column is the species in which the assays were performed. The fourth and fifth columns are the “bait” and “prey” primer sequences respectively.

Primer Name                      Sequence 5' → 3'
PTAL1-1b (primer extension)      biot-GGCGGCGTTGGCTGCTTCTAAGTG
PTAL1-1b (nested PCR primer)     GACAGGCTCTGTGTCCGAGT
Blunt-ended adapter (forward)    ACAGGTTCAGAGTTCTACAGTCCGAC
Blunt-ended adapter (reverse)    p-GTCGGACTGTAGAACTCTGAAC
Adapter PCR primer               GGTTCAGAGTTCTACAGTCCGAC

Supplementary Table S5. Oligonucleotide primers used for 4C sample preparation. First column shows the primer name and its use in constructing the 4C library. Second column shows the primer sequence. biot = biotinylation, p = 5' phosphate.

REFERENCES
1. Fujiwara T, O'Geen H, Keles S, et al. Discovering hematopoietic mechanisms through genome-wide analysis of GATA factor chromatin occupancy. Mol Cell. 2009;36(4):667-681.
2. Dhami P, Bruce AW, Jim JH, et al. Genomic approaches uncover increasing complexities in the regulatory landscape at the human SCL (TAL1) locus. PLoS One. 2010;5(2):e9059. doi:10.1371/journal.pone.0009059.
3. Dhami P. The SCL gene and transcriptional control of haematopoiesis. PhD thesis, University of Cambridge, United Kingdom. 2005.
4. Sanda T, Lawton LN, Barrasa MI, et al. Core transcriptional regulatory circuit controlled by the TAL1 complex in human T cell acute lymphoblastic leukemia. Cancer Cell. 2012;22(2):209-221. doi:10.1016/j.ccr.2012.06.007.
5. Gottgens B, Nastos A, Kinston S, et al. Establishing the transcriptional programme for blood: the SCL stem cell enhancer is regulated by a multiprotein complex containing Ets and GATA factors. EMBO J. 2002;21(12):3039-3050.
6. Gottgens B, Broccardo C, Sanchez MJ, et al. The scl +18/19 stem cell enhancer is not required for hematopoiesis: identification of a 5' bifunctional hematopoietic-endothelial enhancer bound by Fli-1 and Elf-1. Mol Cell Biol. 2004;24(5):1870-1883.
7. Ogilvy S, Ferreira R, Piltz SG, Bowen JM, Gottgens B, Green AR. The SCL +40 enhancer targets the midbrain together with primitive and definitive hematopoiesis and is regulated by SCL and GATA proteins. Mol Cell Biol. 2007;27(20):7206-7219. doi:10.1128/MCB.00931-07.
8. Follows GA, Ferreira R, Janes ME, et al. Mapping and functional characterisation of a CTCF-dependent insulator element at the 3' border of the murine Scl transcriptional domain. PLoS One. 2012;7(3):e31484. doi:10.1371/journal.pone.0031484.
9. Kurukuti S, Tiwari VK, Tavoosidana G, et al. CTCF binding at the H19 imprinting control region mediates maternally inherited higher-order chromatin conformation to restrict enhancer access to Igf2. Proc Natl Acad Sci U S A. 2006;103(28):10684-10689. doi:10.1073/pnas.0600326103.
10. Zhao H, Dean A. An insulator blocks spreading of histone acetylation and interferes with RNA polymerase II transfer between an enhancer and gene. Nucleic Acids Res. 2004;32(16):4903-4919.
11. Zhou Y. Transcriptional regulation of the stem cell leukaemia gene (SCL/TAL1) via chromatin looping. PhD thesis, University of Cambridge, United Kingdom. 2013.
12. A user's guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol. 2011;9(4):e1001046. doi:10.1371/journal.pbio.1001046.
13. Essien K, Vigneau S, Apreleva S, Singh LN, Bartolomei MS, Hannenhalli S. CTCF binding site classes exhibit distinct evolutionary, genomic, epigenomic and transcriptomic features. Genome Biol. 2009;10(11):R131. doi:10.1186/gb-2009-10-11-r131.
14. Breit TM, Mol EJ, Wolvers-Tettero IL, Ludwig WD, van Wering ER, van Dongen JJ. Site-specific deletions involving the tal-1 and sil genes are restricted to cells of the T cell receptor alpha/beta lineage: T cell receptor delta gene deletion mechanism affects multiple genes. J Exp Med. 1993;177(4):965-977.
15. Gottgens B, Ferreira R, Sanchez MJ, et al. cis-Regulatory remodeling of the SCL locus during vertebrate evolution. Mol Cell Biol. 2010;30(24):5741-5751. doi:10.1128/MCB.00870-10.
16. Gottgens B, Barton LM, Gilbert JG, et al. Analysis of vertebrate SCL loci identifies conserved enhancers. Nat Biotechnol. 2000;18(2):181-186.
17. Green AR, Lints T, Visvader J, Harvey R, Begley CG. SCL is coexpressed with GATA-1 in hemopoietic cells but is also expressed in developing brain. Oncogene. 1992;7(4):653-660.
18. Jaffredo T, Bollerot K, Sugiyama D, Gautier R, Drevon C. Tracing the hemangioblast during embryogenesis: developmental relationships between endothelial and hematopoietic cells. Int J Dev Biol. 2005;49(2-3):269-277. doi:10.1387/ijdb.041948tj.
19. Mead PE, Kelley CM, Hahn PS, Piedad O, Zon LI. SCL specifies hematopoietic mesoderm in Xenopus embryos. Development. 1998;125(14):2611-2620.
20. Sinclair AM, Gottgens B, Barton LM, et al. Distinct 5' SCL enhancers direct transcription to developing brain, spinal cord, and endothelium: neural expression is mediated by GATA factor binding sites. Dev Biol. 1999;209(1):128-142.
21. Zhang XY, Rodaway AR. SCL-GFP transgenic zebrafish: in vivo imaging of blood and endothelial development and identification of the initial site of definitive hematopoiesis. Dev Biol. 2007;307(2):179-194. doi:10.1016/j.ydbio.2007.04.002.