Integration Host Factor: Putting a Twist on Protein–DNA Recognition

doi:10.1016/S0022-2836(03)00529-1
J. Mol. Biol. (2003) 330, 493–502
Thomas W. Lynch¹, Erik K. Read², Aras N. Mattis³, Jeffrey F. Gardner² and Phoebe A. Rice¹*
¹Department of Biochemistry and Molecular Biology, The University of Chicago, 920 E 58th Street CLSC 221, Chicago, IL 60637, USA
²Department of Microbiology, University of Illinois, Urbana, IL 61801, USA
³Department of Biochemistry, University of Illinois, Urbana, IL 61801, USA
*Corresponding author
Integration host factor (IHF) is a DNA-bending protein that recognizes its cognate sites through indirect readout. Previous studies have shown that binding of wild-type (WT)-IHF is disrupted by a T to A mutation at the center position of a conserved TTR motif in its binding site, and that substitution of βGlu44 with Ala prevented IHF from discriminating between A and T at this position. We have determined the crystal structures and relative binding affinities for all combinations of WT-IHF and IHF-βGlu44Ala bound to the WT and mutant DNAs. Comparison of these structures reveals that DNA twist plays a major role in DNA recognition by IHF, and that this geometric parameter is dependent on the dinucleotide step and not on the bound IHF variant.
© 2003 Elsevier Science Ltd. All rights reserved
Keywords: indirect readout; X-ray crystallography; gel-shift assay; mutants; protein–DNA interactions
Introduction
Mechanisms of protein–DNA recognition can be divided into two general categories: direct readout, where sequences are distinguished through the unique functional groups of the DNA bases in the major groove; and indirect readout, where the protein relies on sequence-dependent structural features of the DNA, such as backbone conformation and flexibility. Although many protein–nucleic acid systems use indirect readout, at least in part, this type of recognition is not understood as clearly as direct readout.
Present address: E. K. Read, Laboratory of Molecular Genetics, National Institute of Child Health and Human Development, Bethesda, MD 20892, USA.
Abbreviations used: IHF, integration host factor; WT, wild-type; R, purine; Y, pyrimidine.
E-mail address of the corresponding author: [email protected]
One example of a protein that relies entirely on indirect readout during DNA recognition is integration host factor (IHF) from Escherichia coli. IHF is a small (~20 kDa) heterodimeric protein that binds DNA in a sequence-specific manner and induces a large bend (>160°). This bending aids in the formation of higher-order structures in such processes as recombination, transposition, replication, and transcription.1–3 Although IHF–DNA contacts extend over ~35 bp, only a subset of these bases are conserved significantly among known IHF binding sites. The most conserved bases cluster in two elements: a WATCAR element (where W = A/T and R = A/G) and a second element, TTR, 4 bp to the 3′ side. Some IHF binding sites also contain a 4–6 bp A/T-rich segment or poly(dA) tract approximately 8 bp to the 5′ side of the WATCAR element.3–6 The equilibrium dissociation constant (Kd) of IHF for the H′ site of bacteriophage λ, one of the best-characterized IHF binding sites, is ~10⁻⁹ M, and it prefers this site by a factor of 10³–10⁴ over random sequences.7,8
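The two conserved elements described above can be expressed as a simple pattern search. The sketch below is illustrative only: it assumes that "4 bp to the 3′ side" means exactly four unspecified bases between WATCAR and TTR, it scans a single strand, and the test sequences are hypothetical (they are not the H′ site). The function name `find_core_sites` is likewise an invention for this example.

```python
import re

# Consensus elements as described in the text: WATCAR, then TTR.
# ASSUMPTION: the spacing is modeled as exactly four unspecified bases.
# A lookahead is used so that overlapping hits are also reported.
IHF_CORE = re.compile(r"(?=([AT]ATCA[AG][ACGT]{4}TT[AG]))")


def find_core_sites(seq):
    """Return (start, matched_text) for each consensus-like hit on one strand."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in IHF_CORE.finditer(seq)]


# Hypothetical sequences for illustration only.
consensus_like = "CCTATCAAGCGCTTGCC"
center_t_to_a = "CCTATCAAGCGCTAGCC"  # TTG -> TAG in the TTR element

print(find_core_sites(consensus_like))  # [(2, 'TATCAAGCGCTTG')]
print(find_core_sites(center_t_to_a))   # []
```

A match of this kind says nothing about affinity; it only flags sequences that carry the conserved bases, which is why a T to A change at the center of TTR drops out of the search.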
The original crystal structure of IHF bound to a 35 bp duplex DNA containing the H′ site clearly illustrates that the protein–DNA contacts are made to the phosphate backbone and the minor groove only (Figure 1(a) and (b)).9 In fact, only three protein side-chains form hydrogen bonds with the DNA bases, and these are all in the minor groove at positions where all four bases display similar hydrogen bond acceptors. It is therefore clear that IHF recognizes its cognate sites through the sequence-dependent structure and flexibility of the DNA rather than through direct readout. Although several hypotheses have been proposed to explain the mechanisms of this recognition, they remain largely untested.
0022-2836/03/$ - see front matter © 2003 Elsevier Science Ltd. All rights reserved
Figure 1. Structure of the IHF–H′ DNA complex. (a) Ribbon view of the overall X-ray structure with the α subunit in grey, the β subunit in pink, the consensus sequence DNA bases in green and the less conserved bases in blue. (b) Stereo view of the contacts between IHF and the TTR element of the H′ site. (c) The duplex DNAs used for crystal growth and binding analysis. The numbering corresponds to bases 19–47 of bacteriophage λ, and the position of the nick needed for crystallization is marked by the arrow.
We have now examined in more detail the interactions between IHF and the TTR element of the H′ binding site (base-pairs 43–45 of phage λ),
shown in Figure 1(b) and (c). The side of the protein forms a "clamp" across the minor groove of this element, with the N termini of two α helices contacting the flanking phosphate groups. This clamp imposes a high overall twist and a narrow minor groove. βArg46 makes the lone direct contact to the bases of this element with a hydrogen bond to O2 of T44. βArg46 is held in place by a chain of salt-bridges (Figure 1(b)). The position of its guanidinium group would create a steric clash with the protruding amino group of a GC base-pair at this position, but would not by itself discriminate against a T to A substitution, since either
one can accept a hydrogen bond with similar geometry. Consistent with the finding that pyrimidine–purine (Y–R) steps are more flexible than other dinucleotide steps, the helical twist between TA 44 and GC 45 is 47.6°.9,10 It was proposed that this interaction along with the helical twist at base-pair step 44–45 are important factors for IHF to recognize the binding site.
Earlier work by Gardner and co-workers confirmed the validity of the consensus sequence in a genetic study wherein base-pair substitutions that disrupt IHF binding were isolated within each of the three elements of the H′ site.11 One of these mutations, a T to A switch at the center position of the TTR element (H′44A), was used in a subsequent selection to find variant IHF proteins that display relaxed specificity.12,13 Substitution of βGlu44 with Ala or several other amino acids prevented IHF from discriminating against the H′44A site, but not against other variations in the TTR element.14 Interestingly, βGlu44 does not contact the DNA itself, but is a critical part of the salt-bridge chain that positions βArg46 (Figure 1(b)).
With a combination of mutagenesis, detailed binding analyses, and crystallographic studies we sought to enhance our understanding of the indirect readout process using the IHF–DNA complex as a model system. To achieve this goal, we have determined the structures and relative binding affinities for all combinations of WT and βGlu44Ala IHF bound to WT and T44A H′ sites. This work clearly demonstrates the importance of sequence-dependent DNA structural variations in recognition of target sites by IHF.
Results and Discussion
Relative binding affinities of each complex
Apparent equilibrium dissociation constants (Kd) of the four protein–DNA complexes presented here were determined by electrophoretic mobility-shift assay (Table 1).15,16 These assays used DNA duplexes of the same length and sequence as those used for structure determinations. All binding experiments were repeated with and without the nick in the DNA that is required for crystal formation. This nick has been shown to have a minimal effect on the binding of WT-IHF to H′ DNA,17 but in vitro binding analyses of the suboptimal mutants had not been carried out. The Kd of 2.0(±0.5) nM we obtained for the WT-IHF-H′ complex is in good agreement with the results of previous studies (Table 1).18,19 Under these conditions, the WT protein binds the mutant H′44A sequence two orders of magnitude less tightly. IHF-βGlu44Ala binds H′ DNA with slightly diminished affinity, compared to WT-IHF (~5× lower), but fails to discriminate against the mutant DNA sequence. IHF-βGlu44Ala binds both H′ and H′44A DNA sequences with nearly equal affinity, which is in agreement with challenge phage assays.14 The presence of a nick in the substrate DNA had little effect on the relative binding affinity compared to an intact DNA substrate.

Table 1. Gel mobility shift analysis: Kd (app), nM
Site              WT-IHF         IHF-βGlu44Ala
H′                2.0 (±0.50)    4.6 (±1.1)
H′ nicked         1.5 (±0.50)    11 (±1.1)
H′44A             200 (±20)      15 (±0.9)
H′44A nicked      240 (±40)      10 (±1.0)

Table 2. Data collection and refinement statistics
Data set                      IHF-βGlu44Ala/H′    WT-IHF/H′44A    IHF-βGlu44Ala/H′44A
Data collection statistics
Space group                   P2₁2₁2₁             P2₁2₁2₁         P2₁2₁2₁
Unit cell dimensions
  a (Å)                       46.65               46.66           46.80
  b (Å)                       60.85               58.83           60.11
  c (Å)                       180.44              181.31          181.18
  α = β = γ (°)               90                  90              90
Beamline/wavelength (Å)       ID19/0.9            BMC-14/0.9      BMC-14/0.9
Completeness (%) (a)          96.9                93.2            90.5
Refinement statistics
Resolution
  a (Å)                       2.30                2.75            2.65
  b (Å)                       2.70                3.20            3.10
  c (Å)                       1.95                2.10            2.40
No. of water molecules        165                 83              203
Rcrys (b)/Rfree (c) (%)       23.6/27.5           23.0/26.2       24.1/27.1
Rmsd from ideal values
  Bond lengths (Å)            0.005               0.006           0.003
  Bond angles (°)             1.10                1.07            0.69
Ramachandran analysis (%) (d)
  Favored                     90.1                90.1            85.8
  Allowed                     9.9                 9.9             14.2
(a) Within ellipsoid.
(b) $R_{\mathrm{crys}} = \sum \bigl|\,|F_o| - |F_c|\,\bigr| \big/ \sum |F_o|$.
(c) Rfree calculated with 5% of reflections that were not used for refinement.
(d) As defined by PROCHECK.40

Figure 2. Simulated-annealing omit maps of the WT and three variant IHF/H′ complexes. Simulated-annealing omit maps are superimposed on the final model for each structure. Residues βArg42, βGlu44/βAla44, βArg46, and base-pair 44 were omitted. (a) WT-IHF/H′,9 contoured at 3.5σ. (b) IHF-βGlu44Ala/H′, contoured at 3.5σ. (c) WT-IHF/H′44A, contoured at 3.5σ. (d) IHF-βGlu44Ala/H′44A, contoured at 2.2σ.

Structures of the variant IHF-H′ complexes
The three variant complexes crystallized isomorphously with the original WT complex (Table 2). Significant differences among the four structures are confined to the vicinity of the TTR element (Figure 2). The variations appear to be driven by changes in the DNA sequence rather than changes in the protein sequence (Figure 3(a) and (b)). Regardless of which protein is bound, the T to A base substitution alters the DNA conformation (described below). In both complexes involving WT DNA, βArg46 contacts O2 of T44, and removal of the β44 carboxylate group makes little difference structurally. One might have expected little change in the hydrogen bonding to the mutant DNA, since N3 of adenine can accept a
hydrogen bond with nearly the same geometry as O2 of thymine. However, because of DNA conformational changes, if βArg46 did not move, its NH1 and NH2 atoms would be 3.7 Å and 3.6 Å from A44 N3, versus 3.0 Å to O2 of T44 in the WT DNA structures. In the mutant DNA structures, βArg46 therefore shifts to form a new hydrogen bond to T44′ of the opposite strand (Figure 3(c)). The new position of βArg46 could be influenced by a minor hydrogen bonding preference for the two lone pairs of O2 (versus the single lone pair of N3) or more favorable torsion angles. It should be noted that βArg46 could not occupy this position when bound to H′ DNA, as a steric clash would result from the proximity of the side-chain and A44 to each other (1.9 Å). The change of position of βArg46 when bound to mutant DNA disrupts the chain of salt-bridges whether or not the carboxylate group of βGlu44 is present. Thus, the salt-bridge involving βArg46 and βGlu44 is intact only when WT protein is bound to WT DNA.
In all three new structures, βGlu44 and βArg42, which with βArg46 form the chain of salt-bridges shown in Figure 1(b), move as well (Figure 3). In both structures involving the βGlu44Ala mutant, the side-chain of βArg42 shifts: in the original structure its tip is within 2.8 Å of the 5′ phosphate group of nucleotide 41′, whereas in the mutant protein it is closer (3.2–3.5 Å) to phosphate 40′. The reasons for the shift, and its consequences, are not entirely clear. The DNA backbone in this region does not vary appreciably. When the WT protein is bound to mutant DNA, the tip of βArg42 lies between these two phosphate groups, but is closer (3.2 Å) to phosphate 41′. It still forms a salt-bridge to βGlu44 but this glutamate residue adopts a different rotamer than when bound to WT DNA. This new rotamer brings the carboxylate oxygen atoms of βGlu44 to within 3.0 Å of the same phosphate group that βArg42 contacts.
Figure 3. Observed structural differences in the variant IHF/H′ complexes. (a) Stereo view of a difference electron density map showing the differences between the fully WT structure and that of WT-IHF bound to H′44A DNA. After rigid body refinement of the fully WT structure against the mutant data, a (Fo(mut) − Fc(wt))Φ(wt) map was contoured at 3.5σ (green) and −3.5σ (red), and superimposed on the fully WT model. The differences in DNA backbone between the two sequences and the repositioning of βArg42, βGlu44, and βArg46 are clearly shown. (b) Stereo representation of base-pairs 43–45 for all four complexes. The DNA structure is dependent on sequence and not on the protein bound. WT-IHF/H′ is colored in pink, IHF-βGlu44Ala/H′ in purple, WT-IHF/H′44A in green, and IHF-βGlu44Ala/H′44A in yellow. (c) An expanded view of the TTR element and residues βArg42, βGlu44/βAla44, and βArg46 for both the WT-IHF/H′ (pink) and IHF-βGlu44Ala/H′44A (yellow) complexes.
Figure 4. Graphs of the helical DNA parameters. (a) The inter-base-pair twist for each of the four IHF/DNA complexes, (B) WT-IHF/H′; (V) IHF-βGlu44Ala/H′; (X) WT-IHF/H′44A; (O) IHF-βGlu44Ala/H′44A. Large differences in twist value are observed near the TTR region (base-pairs 43–45). (b) The twist value for base-pair steps 42–46. (c) The roll value for base-pair steps 42–46. (d) The propeller twist value for base-pairs 41–47. The broken line in each of the graphs represents the average value of the helical parameter for B-form DNA.
Although a number of factors probably contribute to the two orders of magnitude loss in binding affinity of the WT protein for T44A H′ DNA, we propose that the primary factor is the disruption of the salt-bridge between βArg46 and βGlu44. This energetic penalty may reflect disruption of the salt-bridge itself and electrostatic repulsion between the glutamate residue and nearby phosphate groups when it loses one of its flanking arginine residues. The exact energetic cost of disrupting this salt-bridge is unclear, but, given that it is partially shielded from bulk solvent by close contact with the minor groove face of the DNA, it may be significant. That the IHF-βGlu44Ala protein loses the ability to discriminate between WT and mutant DNA primarily because this salt-bridge is unable to form is supported by the results of a challenge phage assay. IHF variants with Asp at position β44 bound both DNA sequences poorly but could still discriminate between the two.14
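To put the two orders of magnitude discussed above on an energy scale, the ratio of dissociation constants can be converted to a free-energy difference with ΔΔG = RT ln(Kd,mutant/Kd,WT). The sketch below is a back-of-the-envelope calculation, not part of the original analysis. It assumes the apparent Kd values of 2.0 nM and 200 nM for WT-IHF on H′ and H′44A DNA and the 25 °C assay temperature given in Materials and Methods; the function name `ddg_kcal` is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ddg_kcal(kd_ref_nM, kd_alt_nM, temp_c=25.0):
    """Free-energy penalty (kcal/mol) implied by the ratio of two Kd values."""
    temp_k = temp_c + 273.15
    return R_KCAL * temp_k * math.log(kd_alt_nM / kd_ref_nM)


# WT-IHF on H' (2.0 nM) versus H'44A (200 nM), apparent Kd values.
penalty = ddg_kcal(2.0, 200.0)
print(f"{penalty:.2f} kcal/mol")  # 2.73 kcal/mol
```

A penalty of this size is only an estimate from apparent Kd values, but it gives a sense of the energy that the disrupted salt-bridge and the associated electrostatic repulsion would have to account for.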
Sequence-dependent structure of the DNA
One of the factors that was proposed to play an important role in the recognition of the H′ site by IHF is the helical twist within the TTR element (base-pairs 43–45), and indeed, this parameter changes significantly when the DNA sequence is changed from TTG (WT) to TAG (mutant) (Figure 4(a)). The overall twist from base-pairs 43–45 is constrained by contacts from the flanking phosphate groups to the peptide backbone, and changes by only ~3°. However, this twist is apportioned quite differently; in the two structures involving WT DNA, the first step (TT) has a twist of just over 34°, close to the average for B-form DNA (36.1°),20 while the second step, TG, is highly overtwisted at ~48°. When the H′44A DNA is bound by either protein, these values shift to ~40° and ~38° for the first and second steps, respectively.
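The reapportioning of twist described above can be laid out as simple arithmetic. The sketch below uses the rounded per-step values quoted in the text, so the totals it prints differ by about 4° rather than the ~3° reported from the unrounded data; it is meant only to show that the sum is nearly conserved while the individual steps move in opposite directions. The dictionary keys and the `summarize` helper are illustrative names.

```python
B_FORM_TWIST = 36.1  # average B-form twist in degrees, as quoted in the text

# Approximate twists (degrees) for steps 43-44 and 44-45, rounded from the text.
steps = {
    "TTG (WT DNA)": (34.0, 48.0),
    "TAG (44A DNA)": (40.0, 38.0),
}


def summarize(first, second):
    """Total twist over both steps and each step's deviation from B-form."""
    return {
        "total": first + second,
        "dev_first": round(first - B_FORM_TWIST, 1),
        "dev_second": round(second - B_FORM_TWIST, 1),
    }


for name, (first, second) in steps.items():
    print(name, summarize(first, second))
```

With these inputs the WT sequence concentrates almost 12° of overtwisting in the second step, whereas the mutant sequence spreads a smaller excess over both steps.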
Other parameters such as roll and propeller twist follow the same trend as twist: they correlate with the DNA sequence rather than the sequence of the bound protein (Figure 4(b)–(d)). In agreement with previous observations,21 there is a negative correlation between roll and twist: the highly twisted TG step displays a negative roll value, whereas the less twisted AG step in the mutant DNAs has a positive roll. Interestingly, the top-strand bases remain nearly coplanar in all the DNA structures, while the bottom-strand bases do not. A change in roll between two base-pairs is thus coupled to the changes in propeller twist within those base-pairs. Since the packing of all four complexes in the crystals is similar, the differences in DNA structure cannot be artifacts of the crystallization process. The data suggest that the protein side-chains must shift to accommodate the variations in the geometric parameters of the DNA. While these geometric parameters are clearly coupled to one another, if their effects are considered separately, changes in twist have the largest effect on the relationship between the minor groove face of base-pair 44 and the guanidinium group of βArg46. Our discussion of recognition thus focuses on the changes in twist because, in the absence of other changes, they would still result in disruption of the base 44-βArg46 interaction, whereas the changes in other parameters, by themselves, would not.
IHF and indirect readout
Relatively rigid small molecules that bind in the minor groove can distinguish T from A nearly as well as WT-IHF but without heavy reliance on sequence context.22,23 These molecules rely on the asymmetry of the small cleft between the A and T (forming a contact to C2 of the A) and on the fact that O2 of T has two lone pairs to accept hydrogen bonds, whereas N3 of A has only one lone pair. IHF clearly does not exploit the adenine C2 cleft, but some specificity might be gained by the bidentate nature of the arginine–thymine interaction. However, in the engrailed homeodomain–DNA complex where similar Arg-T interactions are seen, the binding constants for T to A substitutions differ by only ~5–7-fold, significantly less than the two orders of magnitude reported here for IHF.24 In IHF, the arginine side-chain is clearly quite flexible, and the DNA feature recognized appears to be not O2 of T per se but rather a particular sequence-dependent DNA structure.
Several aspects of DNA structure appear to play a role in indirect readout of DNA sequence by IHF. Base-pairs 43–45 need to be able to adopt a structure with a narrow minor groove in order to fit into a clamp between the N termini of two α helices (Figure 1(a)). A/T-rich sequences tend to favor narrow minor grooves, and this feature appears to be recognized in many protein–DNA complexes such as 434 repressor25 and MetJ.26 In fact, many IHF binding sites carry a TTA rather than a TTG sequence at these positions. The presence of βArg46 may select against G/C base-pairs at position 44 by clashing with the amino group of the G that protrudes into the minor groove. A requirement for general A/T-richness, however, does not explain how WT-IHF discriminates between A and T at position 44, and the similar affinities for complexes of IHF-βGlu44Ala with both TTG- and TAG-carrying DNAs show that both sequences can comfortably fit into the protein clamp. A second DNA structural requirement is enforced by the chain of charge:charge and polar interactions extending from a DNA phosphate group through βArg42, βGlu44, and βArg46 and ending at O2 of T44 in the minor groove. The importance of these residues in sequence specificity is highlighted by the strong conservation of the motif RxExR in IHFβ subunits from different bacteria. This motif is not found in IHFα subunits, nor in HU, which is closely related but binds DNA without significant sequence specificity.
In order for this chain of salt-bridges and hydrogen bonds to form properly, the overtwisting associated with the narrow minor groove must be disproportionately accommodated at the second T–R step. When the twist is reapportioned in the TAG structures, the hydrogen bond acceptors displayed by base-pair 44 shift, and βArg46 moves to form an alternative hydrogen bond. This in turn disrupts the salt-bridge between βArg46 and βGlu44. Although the βGlu44-βArg42-phosphate portion of the salt-bridge chain can still form, since βArg42 contacts two negatively charged groups, the charge of the glutamate residue is no longer fully countered. We expect that both disruption of the βArg46–βGlu44 interaction, which is not fully solvent-exposed, and repulsion between glutamate and the phosphodiester backbone contribute to the WT protein's discrimination against the mutant TAG-containing binding site. These findings agree with the results of a recent computational study that found DNA deformation energy could only partly explain indirect readout by IHF.27
The preference of the WT protein for a T–R step here correlates with the unusual flexibility of Y–R steps suggested by both experimental and computational studies.21,28,29 It is interesting that, while several proteins use the flexibility of Y–R steps in recognizing their cognate sites, some, such as HincII,30 TBP,31 and CAP,32 rely on their propensity to adopt large roll angles, whereas others, including IHF and MetJ, exploit their propensity to adopt unusually high twists. Both properties, however, reflect an underlying weak stacking in Y–R steps.
In conclusion, our data suggest that sequence-dependent DNA structural parameters, especially twist, play a major role in sequence recognition by IHF and that rather than directly recognizing individual bases in this portion of its binding site, IHF recognizes A/T-richness followed by a Y–R step. While IHF may enforce a narrow minor groove width upon the TTR element of its binding site, we find that both the TTG and TAG substrates were able to satisfy this requirement. However, within this restraint, the structure reflected the DNA sequence rather than the protein sequence, and the energetic penalty for WT protein binding to the mutant TAG sequence appears to be borne by the protein rather than the DNA.
Materials and Methods
Materials
WT-IHF was a kind gift from Shu-wei Yang and Howard Nash (NIH). IHF-βGlu44Ala, cloned into pET27b (Novagen), was expressed in E. coli strain JG1246, a derivative of BL21(DE3) lacking functional IHF genes. Cells were grown in LB broth at 37 °C and induced by addition of 0.5 mM IPTG at an A600 of ~0.5. The cells were harvested three hours after induction, resuspended in 50 mM Tris–HCl (pH 7.5), 10% (w/v) sucrose, 12.5 mM EDTA, 2.5 mM DTT and lysed by the addition of 200 mg/ml of lysozyme. Cell debris was removed by centrifugation at 8000 rpm for ten minutes in a Sorvall SLA-3000 rotor.
IHF-βGlu44Ala was purified by a new protocol that we find more efficient than previously published methods: precipitation by ammonium sulfate followed by chromatography on heparin agarose and monoS columns. Solid (NH4)2SO4 was stirred slowly into the clarified cell lysate in two steps, each followed by centrifugation at 15,000 rpm for 40 minutes in a Sorvall SS-34 rotor. IHF remained in the supernatant at 50% (w/v) (NH4)2SO4 but precipitated at 80% (NH4)2SO4. The second pellet was redissolved and dialyzed into buffer consisting of 20 mM Hepes (pH 7.0), 1 mM EDTA, 400 mM NaCl, 10% (v/v) glycerol. The resulting solution was applied to a 5 ml Hi-Trap heparin HP column (Amersham), washed extensively with the same buffer, and eluted with a gradient from 0 M to 2 M NaCl. The final step, monoS chromatography, removed minor contaminants. IHF was first exchanged into buffer consisting of 10 mM Hepes (pH 7.0), 0.1 mM EDTA, 100 mM NaCl, 8% glycerol and applied to a monoS column (Amersham). After washing with the same buffer, IHF was eluted with a gradient from 0 M to 2 M NaCl. All purification steps were carried out at 4 °C. IHF concentrations were determined using a calculated extinction coefficient of 5800 M⁻¹ cm⁻¹ at 276 nm.
Protein preparations used in crystallization and binding assays were dialyzed into 10 mM Hepes (pH 7.0), 0.1 M NaCl, 0.1 mM EDTA, 8% glycerol and concentrated using Microcon-3 size-exclusion spin filters (Amicon). No other band was visible on overloaded SDS-PAGE, and no nuclease activity was detected when the sample was incubated with supercoiled DNA and 10 mM MgCl2 for 30 minutes. Oligonucleotides were purchased from the W.M. Keck Facility at Yale University (New Haven, CT) and purified by denaturing PAGE. [γ-³²P]ATP (6000 Ci mmol⁻¹) was purchased from Amersham. Each DNA oligonucleotide was 5′ end-labeled with [γ-³²P]ATP by phage T4 kinase (Invitrogen) as described.33 The labeled strands were extracted with phenol/chloroform/isoamyl alcohol (24:25:1 by vol.) and were passed through a Bio-Rad P6 spin column. The complementary oligonucleotides were annealed by mixing equimolar concentrations of the DNA strands, heating the mixture to 90 °C, and allowing it to cool to room temperature.
Gel mobility-shift analysis
Binding assays were performed in buffer (50 mM Tris–HCl (pH 7.5), 100 mM NaCl, 100 mg/ml of bovine serum albumin (New England Biolabs), 1 mg/ml of salmon sperm DNA, and 5% glycerol) at 25 °C by incubation of the ³²P-labeled DNA (5 pM) with various concentrations of IHF (WT or βGlu44Ala) for at least 15 minutes. Each binding reaction was then loaded onto 8% polyacrylamide gels (acrylamide to bis-acrylamide 29:1, w/w) in 0.5× TBE (45 mM Tris-borate, 1 mM EDTA) buffer and electrophoresed at 9 V/cm for 1.5 hours. Dried gels were visualized using PhosphorImager screens (Molecular Dynamics) scanned by a Molecular Dynamics PhosphorImager. The band intensities were quantified by using the volume measurement utility in the ImageQuant (Molecular Dynamics) software package. The equilibrium dissociation constant (Kd) was determined using the relationship:
$Q^{-1} = 1 + (K_d/[P_t])$
where Q is the fraction of bound DNA and Pt equals total protein concentration.34 Each binding assay was repeated two to four times.
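The relationship above can be turned into a simple estimator: because Q⁻¹ − 1 = Kd/[Pt], a plot of Q⁻¹ − 1 against 1/[Pt] is a line through the origin with slope Kd. The sketch below is one possible way to fit it, not necessarily the procedure used for Table 1. The titration points are simulated and hypothetical, and the function names are inventions for this example. The relation treats total protein as free protein, which is reasonable here because the labeled DNA (5 pM) is far below the Kd values being measured.

```python
def fit_kd(protein_nM, fraction_bound):
    """Least-squares Kd from 1/Q - 1 = Kd / [Pt] (line through the origin)."""
    xs = [1.0 / p for p in protein_nM]
    ys = [1.0 / q - 1.0 for q in fraction_bound]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def simulate_fraction_bound(protein_nM, kd_nM):
    """Ideal fraction of DNA bound, Q = [Pt] / ([Pt] + Kd)."""
    return [p / (p + kd_nM) for p in protein_nM]


# Hypothetical titration in nM; these are not data from the paper.
titration = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
q_values = simulate_fraction_bound(titration, kd_nM=2.0)
print(round(fit_kd(titration, q_values), 3))  # 2.0
```

With real gel data, points where Q is close to 0 or 1 carry large relative errors in Q⁻¹ − 1, so a weighted or nonlinear fit of Q against [Pt] may be preferable.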
Crystallographic analysis
Crystals were grown in hanging drops by vapor diffusion. A 1:1.5 complex of IHF (~6.7 mg/ml) and DNA in 10 mM Hepes (pH 7.0), 0.1 M NaCl, 0.1 mM EDTA, 8% glycerol, was mixed and left for 15 minutes. Well solution (15% glycerol, 25% PEG5000-MME, 50 mM Tris–HCl (pH 7.5), 10 mM MgCl2, 50 mM NaCl) was then added (1:1, v/v) to the protein–DNA mixture and incubated at 19 °C.
Crystals were mounted directly from the drop in nylon loops (Hampton Research) and flash-frozen in liquid propane prior to data collection at BioCARS beamline 14BM-C or SBC beamline ID19 at the Advanced Photon Source. Crystals were essentially isomorphous with those reported previously,9 in space group P2₁2₁2₁ with one DNA-bound heterodimer in the asymmetric unit. The HKL suite was used for data scaling and reduction.35† Nominal resolution limits for each direction reflect the point where ⟨I/σI⟩ falls below 2 in a small cone along that axis. Due to the high anisotropy, only data within an ellipsoid having these principal axes were used in refinement. Program O was used for model building,36 CNS for refinement,37 Ribbons for Figures,38 and 3DNA to calculate DNA geometric parameters.39 The DNA parameter file (dna-rna_rep.param) supplied with CNS was modified to remove explicit B-form restraints. The same set of reflections was reserved for Rfree as had been in the original WT structure refinement. In each case, Rfree dropped to 32.4% or lower after rigid body refinement of the original model (with ordered solvent removed) against the new mutant data set. Clear differences reflecting the effects of the mutations in the protein and/or DNA were visible in the resulting σA-weighted difference maps. Statistics of the data and the final models are shown in Table 2.
† http://www.hkl-xray.com
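The ellipsoidal resolution cutoff described above can be illustrated with a short test for whether a given reflection falls inside the ellipsoid. This is a sketch of the geometric idea only, not the script used for the actual data. It assumes an orthorhombic cell (so the reciprocal-space components are simply h/a, k/b and l/c), and it uses the cell edges and per-axis resolution limits listed in Table 2 for the IHF-βGlu44Ala/H′ data set. The function name `inside_ellipsoid` is hypothetical.

```python
def inside_ellipsoid(hkl, cell, limits):
    """True if reflection hkl lies within the anisotropic resolution ellipsoid.

    Assumes an orthorhombic cell, so the reciprocal-space components are
    h/a, k/b and l/c, and the ellipsoid semi-axes are 1/d_min along each axis.
    """
    return sum((idx / edge * d_min) ** 2
               for idx, edge, d_min in zip(hkl, cell, limits)) <= 1.0


CELL = (46.65, 60.85, 180.44)  # a, b, c in Å (Table 2, IHF-βGlu44Ala/H′)
LIMITS = (2.30, 2.70, 1.95)    # resolution limits along a, b, c in Å

for hkl in [(20, 0, 0), (21, 0, 0), (0, 0, 92), (14, 14, 50)]:
    print(hkl, inside_ellipsoid(hkl, CELL, LIMITS))
```

Reflections along a single axis are accepted up to that axis's own limit, while general reflections are judged by the combined, weighted distance from the origin of reciprocal space.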
Protein Data Bank accession numbers
Atomic coordinates have been deposited in the RCSB Protein Data Bank with ID codes 1OWF (IHF-βGlu44Ala/H′), 1OWG (WT-IHF/H′44A), and 1OUZ (IHF-βGlu44Ala/H′44A).
Acknowledgements
We thank Ying Zhang for help with initial crystallization trials and DNA purification, and Adam Conway, Kerren Swinger, and the staff at the BioCARS and SBC beamlines for help with data collection. Use of the Argonne National Laboratory Structural Biology Center beamlines and the BioCARS Sector 14 beamlines at Advanced Photon Source was supported by the U.S. Department of Energy, Basic Energy Sciences, Office of Science, under Contract No. W-31-109-Eng-38. Use of BioCARS Sector 14 was supported by the National Institutes of Health, National Center for Research Resources, under grant number RR07707. This study was supported by National Institutes of Health grant GM58827 (P.A.R.).
References
1. Nash, H. A. & Robertson, C. A. (1981). Purification and properties of the Escherichia coli protein factor required for lambda integrative recombination. J. Biol. Chem. 256, 9246–9253.
2. Freundlich, M., Ramani, N. E. M., Sirko, A. & Tsui, P. (1992). The role of integration host factor in gene expression in Escherichia coli. Mol. Microbiol. 6, 2557–2563.
3. Friedman, D. I. (1988). Integration host factor: a protein for all reasons. Cell, 55, 545–554.
4. Goodrich, J. A., Schwartz, M. L. & McClure, W. R. (1990). Searching for and predicting the activity of sites for DNA binding proteins: compilation and analysis of the binding sites for Escherichia coli integration host factor (IHF). Nucl. Acids Res. 18, 4993–5000.
5. Craig, N. L. & Nash, H. A. (1984). E. coli integration host factor binds to specific sites in DNA. Cell, 39, 707–716.
6. Yang, C. C. & Nash, H. A. (1989). The interaction of E. coli IHF protein with its specific binding sites. Cell, 57, 868–880.
7. Mengeritsky, G., Goldenberg, D., Mendelson, I., Giladi, H. & Oppenheim, A. B. (1993). Genetic and biochemical analysis of the integration host factor of Escherichia coli. J. Mol. Biol. 231, 646–657.
8. Yang, S. W. & Nash, H. A. (1995). Comparison of protein binding to DNA in vivo and in vitro: defining an effective intracellular target. EMBO J. 14, 6292–6300.
9. Rice, P. A., Yang, S.-W., Mizuuchi, K. & Nash, H. A. (1996). Crystal structure of an IHF–DNA complex: a protein-induced DNA U-turn. Cell, 87, 1295–1306.
10. Suzuki, M. & Yagi, N. (1995). Stereochemical basis of DNA bending by transcription factors. Nucl. Acids Res. 23, 2083–2091.
11. Lee, E. C., MacWilliams, M. P., Gumport, R. I. & Gardner, J. F. (1991). Genetic analysis of Escherichia coli integration host factor interactions with its bacteriophage lambda H′ recognition site. J. Bacteriol. 173, 609–617.
12. Lee, E. C., Hales, L. M., Gumport, R. I. & Gardner, J. F. (1992). The isolation and characterization of mutants of the integration host factor (IHF) of Escherichia coli with altered, expanded DNA-binding specificities. EMBO J. 11, 305–313.
13. Hales, L. M., Gumport, R. I. & Gardner, J. F. (1994). Mutants of Escherichia coli integration host factor: DNA-binding and recombination properties. Biochimie, 76, 1030–1040.
14. Read, E. K., Gumport, R. I. & Gardner, J. F. (2000). Specific recognition of DNA by integration host factor. Glutamic acid 44 of the beta-subunit specifies the discrimination of a T:A from an A:T base pair without directly contacting the DNA. J. Biol. Chem. 275, 33759–33764.
15. Fried, M. G. (1989). Measurement of protein–DNA interaction parameters by electrophoresis mobility shift assay. Electrophoresis, 10, 366–376.
16. Carey, J. (1991). Gel retardation. Methods Enzymol. 208, 103–117.
17. Lorenz, M., Hillisch, A., Goodman, S. D. & Diekmann, S. (1999). Global structure similarities of intact and nicked DNA complexed with IHF measured in solution by fluorescence resonance energy transfer. Nucl. Acids Res. 27, 4619–4625.
18. Yang, S. W. & Nash, H. A. (1994). Specific photocrosslinking of DNA–protein complexes: identification of contacts between integration host factor and its target DNA. Proc. Natl Acad. Sci. USA, 91, 12183–12187.
19. Wang, S., Cosstick, R., Gardner, J. F. & Gumport, R. I. (1995). The specific binding of Escherichia coli integration host factor involves both major and minor grooves of DNA. Biochemistry, 34, 13082–13090.
20. Chandrasekaran, R. & Arnott, S. (1996). The structure of B-DNA in oriented fibers. J. Biomol. Struct. Dyn. 13, 1015–1027.
21. Olson, W. K., Gorin, A. A., Lu, X.-J., Hock, L. M. & Zhurkin, V. B. (1998). DNA sequence-dependent deformability deduced from protein–DNA crystal complexes. Proc. Natl Acad. Sci. USA, 95, 11163–11168.
22. White, S., Szewczyk, J. W., Turner, J. M., Baird, E. E. & Dervan, P. B. (1998). Recognition of the four Watson–Crick base pairs in the DNA minor groove by synthetic ligands. Nature, 391, 468–471.
23. Kielkopf, C. L., White, S., Szewczyk, J. W., Turner, J. M., Baird, E. E., Dervan, P. B. & Rees, D. C. (1998). A structural basis for recognition of A·T and T·A base pairs in the minor groove of B-DNA. Science, 282, 111–115.
24. Ades, S. E. & Sauer, R. T. (1995). Specificity of minor-groove and major-groove interactions in a homeodomain–DNA complex. Biochemistry, 34, 14601–14608.
25. Koudelka, G. B. (1998). Recognition of DNA structure
by 434 repressor. Nucl. Acids Res. 26, 669–675.
26. Garvie, V. W. & Phillips, S. E. V. (2000). Direct and
indirect readout in mutant Met repressor–operator
complexes. Structure, 8, 905–914.
27. Steffen, N. R., Murphy, S. D., Tolleri, L., Hatfield,
G. W. & Lathrop, R. H. (2002). DNA sequence and
structure: direct and indirect recognition in
protein–DNA binding. Bioinformatics, 18, 22S–30S.
28. Mack, D. R., Chiu, T. K. & Dickerson, R. E. (2001).
Intrinsic bending and deformability at the T-A step
of CCTTTAAAGG: a comparative analysis of T-A
and A-T steps within A-tracts. J. Mol. Biol. 312,
1037–1049.
29. Widom, J. (2001). Role of DNA sequence in
nucleosome stability and dynamics. Q. Rev. Biophys. 34,
269–324.
30. Horton, N. C., Dorner, L. F. & Perona, J. J. (2002).
Sequence selectivity and degeneracy of a restriction
endonuclease mediated by DNA intercalation.
Nature Struct. Biol. 9, 42–47.
31. Kim, Y., Geiger, J. H., Hahn, S. & Sigler, P. B. (1993).
Crystal structure of a yeast TBP/TATA-box complex.
Nature, 365, 512–520.
32. Chen, S., Gunasekera, A., Zhang, X., Kunkel, T. A.,
Ebright, R. H. & Berman, H. M. (2001). Indirect
readout of DNA sequence at the primary-kink site in the
CAP–DNA complex: Alteration of DNA binding
specificity through alteration of DNA kinking. J. Mol.
Biol. 314, 75–82.
33. Brown, B. M. & Sauer, R. T. (1993). Assembly of the
Arc repressor–operator complex: cooperative
interactions between DNA-bound dimers. Biochemistry,
32, 1354–1363.
34. Robinson, C. R. & Sligar, S. G. (1998). Changes in
solvation during DNA binding and cleavage are critical
to altered specificity of the EcoRI endonuclease. Proc.
Natl Acad. Sci. USA, 95, 2186–2191.
35. Otwinowski, Z. (1993). Oscillation data reduction
program. In Data Collection and Processing (Sawyer,
L., Isaacs, N. & Bailey, S., eds), pp. 55–62, SERC,
Daresbury Laboratory, Warrington.
36. Jones, T. A., Zou, J. Y., Cowan, S. W. & Kjeldgaard, M.
(1991). Improved methods for building protein
models in electron density maps and the location of
errors in these models. Acta Crystallog. sect. A, 47,
110–119.
37. Brünger, A. T., Adams, P. D., Clore, G. M., DeLano,
W. L., Gros, P., Grosse-Kunstleve, R. W. et al. (1998).
Crystallography and NMR system: a new software
suite for macromolecular structure determination.
Acta Crystallog. sect. D, 54, 905–921.
38. Carson, M. (1991). Ribbons 2.0. J. Appl. Crystallog. 24,
958–961.
39. Lu, X.-J. & Olson, W. K. (2000). A-form
conformational motifs in ligand-bound DNA structures. J. Mol.
Biol. 300, 819–840.
40. Laskowski, R. A., MacArthur, M. W., Moss, D. S. &
Thornton, J. M. (1993). PROCHECK: a program to
check the stereochemical quality of protein
structures. J. Appl. Crystallog. 26, 283–291.
Edited by K. Morikawa
(Received 12 December 2002; received in revised form 1 April 2003; accepted 15 April 2003)