Supplementary Notes PCK110700
Codexis, Confidential

Mutations Present in the Final Population of Variants

Figure 3 Sequences

                 1        10        20        30        40        50
WT        (1)    MSTAIVTNVKHFGGMGSALRLSEAGHTVACHDESFKQKDELEAFAETYPQ
Round 3   (1)    MSTAIVTNVKHFGGMGSALRLSEAGHTVACHDESFKHKDELEAFAETYPQ
Round 9   (1)    MSTAIVTNVKHFGGMGSALRLSEAGHTVACHDESFKHQDELEAFAETYPQ
Round 17  (1)    MSTAIVTNVKHFGGMGSALRLSEAGHTVACHDESFKHQDELEAFAETYPQ
Round 18  (1)    MSTAIVTNVKHFGGMGSALRLSEAGHTVACHDESFKHQDELEAFAETYPQ

                 51       60        70        80        90        100
WT        (51)   LKPMSEQEPAELIEAVTSAYGQVDVLVSNDIFAPEFQPIDKYAVEDYRGA
Round 3   (51)   LKPMSEQEPAELIEAVTSAFGQVDVLVSNDIFALEFRPIDKYAVEDYRGA
Round 9   (51)   LIPMSEQEPAELIEAVTSALGHVDVLVSNDIAPVEWRPIDKYAVEDYRDT
Round 17  (51)   LIPMSEQEPAELIEAVNSALGHVDILVSNDIAPVEWRPIDKYAVEDYRDT
Round 18  (51)   LIPMSEQEPAELIEAVTSALGHVDILVSNDIAPVEWRPIDEYAVEDYRDM

                 101      110       120       130       140       150
WT        (101)  VEALQIRPFALVNAVASQMKKRKSGHIIFITSATPFGPWKELSTYTSARA
Round 3   (101)  VEALQIRPFALVNAVASQMKKRKSGHIIFITSAAPFGPWKELSTYSSARA
Round 9   (101)  VEALQIKPFALVNAVASQMKKRKSGHIIFITSAAPFGPWKELSTYSSARA
Round 17  (101)  VEALQIKPFALANAVATQMKRRKSGHIIFITSAASFGPWKELSTYASARA
Round 18  (101)  VEALQIKPFALANAVASQMKRRKSGHIIFITSAASFGPWKELSTYASARA

                 151      160       170       180       190       200
WT        (151)  GACTLANALSKELGEYNIPVFAIGPNYLHSEDSPYFYPTEPWKTNPEHVA
Round 3   (151)  GASALANALSKELGEYNIPVFAIGPNYLHSEDSPYYYPTEPWKINPEHVA
Round 9   (151)  GASALANALSKELGEYNIPVFAIAPNYLHSGDSPYYYPSEPWKTSPEHVA
Round 17  (151)  GASALANALSKELGEYNIPVFAIAPNAVDSGDSPYYYPSEPWKTSPEHVA
Round 18  (151)  GASALANALSKELGEYNIPVFAIAPNAMDSGDSPYYYPSEPWKTSPEHVA

                 201      210       220       230       240         254
WT        (201)  HVKKVTALQRLGTQKELGELVAFLASGSCDYLTGQVFWLAGGFPMIERWPGMPE
Round 3   (201)  HVKKVTALQRLGTQKELGELVAFLASGSCDYLTGQVFWLAGGFPVIERWPGMPE
Round 9   (201)  HVRKVTALQRLGTQKELGELVTFLASGSCDYLTGQVFWLAGGFPVIERWPGMPE
Round 17  (201)  WVRKYTALQRLGTQKELGELVTFLASGSCDYLTGQVFWLAGGFPVVERWPGMPE
Round 18  (201)  WVRKYTALQRLGTQKELGELVTFLASGSCDYLTGQVFWFAGGFPVVERWPGMPE

Detailed Description of a Round

In order to give a fuller picture of
the decision-making process used in the ProSAR-driven methodology, here we describe our 14th round of evolution. We had completed our ProSAR analysis on two previous libraries, 12-1 and 12-2 (round 13 was still under analysis). The best variant from these libraries, which showed a 1.2-fold improvement over the round 12 parent, was chosen as the parent for the next set of libraries. The ProSAR model from library 12-1 was of relatively low quality (r = 0.31, p = 9.37 × 10^-3, where r is the leave-one-out cross-validated correlation coefficient and p is the frequency of observing such a correlation by chance alone given the null hypothesis of no correlation), so regression coefficients were not weighted heavily in decision making, and all mutations that appeared potentially beneficial were included in the next round, giving seven mutations of interest. (It should be noted that the magnitudes of the regression coefficients are particular to each library and cannot be meaningfully compared across two models without normalization.) Two mutations from this library were in the chosen backbone and appeared potentially detrimental; these positions were allowed to mutate back to their original residue (flip-out) in the next library. Two other mutations in the backbone appeared positive, but because of the lack of confidence in this model they were also allowed to mutate back to the original residue in the next library. All told, 12 of the initial 15 mutations in 12-1 were tested in the next library. Library 12-2 gave a better model (r = 0.47, p = 2.51 × 10^-4) and revealed four mutations that were either neutral or beneficial. At this point in the evolution we were running low on mutations of interest, so we had completed multiple saturation mutagenesis libraries (sat. mut.) within the binding pocket and at positions that had previously shown influence on activity. These libraries gave us 18 mutations worth pursuing further.
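As a rough illustration of the two model-quality statistics quoted above, the sketch below computes a leave-one-out cross-validated correlation coefficient r and a chance frequency p for a small regression model. It is only a sketch under stated assumptions: the regression used inside ProSAR is not described in these notes, so ordinary least squares with a tiny ridge term stands in for it; p is estimated here by permuting the activities, which is one possible reading of "frequency of observing such a correlation by chance"; and the variant matrix and activities are made-up toy data, not results from any library.

```python
import random

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(rows, y, ridge=1e-6):
    """Least-squares coefficients, intercept first (stand-in for the ProSAR model)."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def pearson(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    if su == 0 or sv == 0:
        return 0.0
    return cov / (su * sv)

def loo_r(rows, y):
    """Predict each variant from a model fitted without it, then correlate."""
    preds = []
    for i in range(len(y)):
        coef = fit(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(coef, rows[i]))
    return pearson(preds, y)

def permutation_p(rows, y, n_perm=200, seed=0):
    """Fraction of shuffled data sets whose cross-validated r reaches the observed r."""
    rng = random.Random(seed)
    observed = loo_r(rows, y)
    hits = 0
    for _ in range(n_perm):
        shuffled = y[:]
        rng.shuffle(shuffled)
        if loo_r(rows, shuffled) >= observed:
            hits += 1
    return hits / n_perm

# Hypothetical toy library: 16 variants over 4 mutations (1 = mutation present).
rows = [[(i >> b) & 1 for b in range(4)] for i in range(16)]
y = [1.0 + 0.5 * r[0] - 0.3 * r[1] + 0.2 * r[2] + 0.01 * ((7 * i) % 5 - 2)
     for i, r in enumerate(rows)]

r_cv = loo_r(rows, y)
p_value = permutation_p(rows, y)
print(round(r_cv, 3), p_value)
```

The fitted coefficients play the role of the per-mutation regression coefficients discussed in the text; as noted there, their magnitudes depend on the library and are not comparable across models without normalization.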
We had also hit-shuffled three of our best variants and completed ProSAR analysis of this library (Hit Shuffle 15). This analysis provided an additional five mutations of interest. In total, these libraries provided 39 mutations to test in further combinatorial libraries, as shown in Tables 3 and 4. We split these mutations into two libraries: 14-1 with 19 mutations and 14-2 with 20 mutations. The sequence of the backbone and the oligonucleotides used to construct the libraries are listed after Tables 3 and 4. Both libraries were analyzed with ProSAR and gave relatively high quality models (14-1: r = 0.58, p = 7.9 × 10^-6; 14-2: r = 0.71, p = 4.55 × 10^-10). The next library's parent came from 14-1 and had three mutations with high regression coefficients. Four of the mutations in the parent had negative regression coefficients and so were allowed to mutate back to the original residue in the next library. Two more mutations were positive but not in the backbone, so they were included in the next library. Library 14-2 provided 14 mutations that were neutral to beneficial. All told, this resulted in three positive mutations fixed in the new backbone, 20 mutations to be tested in the next set of libraries, and 16 mutations removed from consideration.

Library 14-1
Mutation | Previous Library | Previous Regression Coefficient | Fold improved of the highest activity variant with the mutation | Notes | Regression Coefficient | In Next Backbone? | In Next Library?
L10K 12-1 - L, 1.20 mutated back 0.257 yes
D121K T152A 12-1 12-1 + D, 1.20 T, 1.20 mutated back mutated back -0.27 -0.33
F177Y Q38L 12-1 sat. mut. + F, 1.20 1.25 mutated back 0.152 no, A was better than Y or F -0.12 yes flip-out
S78N T100M sat. mut. sat. mut. 1.25 1.04 -0.010 0.169
V101I F177A sat. mut. sat. mut. 1.70 1.70 -0.31 0.59 yes
W238R T67N sat. mut. sat. mut. 1.25 1.20 -0.13 0.00 yes flip-out
G181W V205Y sat. mut. sat. mut. 1.17 1.16 -0.24 -0.11 yes flip-out
A114Q D99G sat. mut. Hit Shuffle 15 0.07 1.15 0.98 -0.21 0.003
V112A W139D Hit Shuffle 15 0.05 Hit Shuffle 15 0.08 0.98 0.98 0.033 yes -0.44
N176R W238C Hit Shuffle 15 0.06 Hit Shuffle 15 0.03 0.98 0.96 -0.02 -0.12 yes
yes yes flip-out

Table 3 – 14-1 Library Design. The source of each mutation is given by the previous library it was observed in along with any regression coefficient information from ProSAR analysis. In some cases mutations present in the backbone were allowed to vary back to the previous residue (mutated back) because we were unsure about their impact on function or believed the mutation may be deleterious. The regression coefficient for the mutation in the context of the new library is given along with an indication of its presence in the new backbone and whether it is part of the next round library design.

Library 14-2
Mutation | Previous Library | Previous Regression Coefficient | Fold improved of the highest activity variant with the mutation | Notes | Regression Coefficient | In Next Backbone? | In Next Library?
T152A E95G 12-1 12-1 + + T, 1.2 0.92 mutated back -0.030 -0.120 yes
D121E V202L 12-1 12-1 + + 0.89 1.17 0.323 -0.500 yes
V245A P135S 12-1 12-2 + 0.001 1.01 1.14 -0.990 0.651
M252V E40V 12-2 12-2 -0.001 0.066 1.00 1.12 -0.050 -0.260 yes
A60V R87Q 12-2 12-1 0.090 + 1.02 0.93 random mutation 0.051 -0.320 yes
S146A T100A 12-1 12-1 + + 0.93 1.12 0.132 random mutation 0.068 yes yes
S180T T144S sat. mut. sat. mut. 1.29 1.09 0.499 0.166 yes yes
G251E M54I sat. mut. sat. mut. 1.04 1.03 0.159 -0.020 yes yes
D121R G251S sat. mut. sat. mut. 1.03 1.18 0.119 0.087 yes yes
W238T sat. mut. 1.01 1.259 yes
I52T sat. mut. 1.01 -0.260 yes

Table 4 – 14-2 Library Design. The source of each mutation is given by the previous library it was observed in along with any regression coefficient information from ProSAR analysis.
In some cases mutations present in the backbone were allowed to vary back to the previous residue (mutated back) because we were unsure about their impact on function or believed the mutation may be deleterious. In some cases random mutations appeared in the combinatorial library and were included in the next library design when they appeared potentially beneficial. The regression coefficient for the mutation in the context of the new library is given along with an indication of its presence in the new backbone and whether it is part of the next round library design.

Round 14 Backbone and Oligonucleotides Used in Library Constructions

Each oligonucleotide listed covers a defined region of the backbone and the set of mutations desired in that region. In some cases, multiple oligonucleotides were required to allow for all combinations of mutations in a targeted region; e.g., V112A and A114Q are collectively coded by two oligos (aagccatttgctctagyaaatgccgtcgcttcgcaaatg and aagccatttgctctagyaaatcaggtcgcttcgcaaatg). We do not indicate which mutation is carried by a particular oligonucleotide, although this information can be deduced by inspection.
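The "inspection" mentioned above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used: it assumes the degenerate bases in the oligos follow the standard IUPAC nucleotide codes (e.g. y = c or t) and the standard genetic code, and it uses a short fragment of the Round 14 backbone (codons 106 to 120, a numbering inferred from the backbone listing in this section). The helper names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide codes, lower case to match the oligo listing.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}

# Standard genetic code, built from the usual TCAG-ordered amino-acid string.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate an in-frame DNA string, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, usable, 3))

def expand(oligo):
    """List every concrete sequence encoded by a degenerate oligo."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in oligo))]

def best_offset(oligo, template):
    """Ungapped placement of the oligo on the template with the fewest mismatches."""
    def mismatches(off):
        window = template[off:off + len(oligo)]
        return sum(t not in IUPAC[o] for o, t in zip(oligo, window))
    return min(range(len(template) - len(oligo) + 1), key=mismatches)

def encoded_mutations(oligo, template, first_codon=1):
    """Set of amino-acid changes (as tuples) carried by each expansion of the oligo."""
    off = best_offset(oligo, template)
    parent = translate(template)
    variants = set()
    for seq in expand(oligo):
        mutant = translate(template[:off] + seq + template[off + len(oligo):])
        variants.add(tuple(
            f"{p}{first_codon + i}{m}"
            for i, (p, m) in enumerate(zip(parent, mutant)) if p != m
        ))
    return variants

# Backbone fragment covering codons 106-120 (assumed numbering, see lead-in).
FRAGMENT = "atcaagccatttgctctagtgaatgctgtcgcttcgcaaatgaag"
oligo_a = "aagccatttgctctagyaaatgccgtcgcttcgcaaatg"
oligo_b = "aagccatttgctctagyaaatcaggtcgcttcgcaaatg"

muts_a = encoded_mutations(oligo_a, FRAGMENT, first_codon=106)
muts_b = encoded_mutations(oligo_b, FRAGMENT, first_codon=106)
print(sorted(muts_a))
print(sorted(muts_b))
```

Under these assumptions the first oligo yields either the parent residues or V112A, and the second yields A114Q with or without V112A, so the pair together covers all four combinations at positions 112 and 114.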
Round 14 Backbone:
atgagcaccgctattgtcaccaacgtcctgcattttggaggtatgggtagcgctctgcgtctgagcgaagctggtcata
ccgtcgcttgccatgatgaaagctttaagcatcaggatgaactagaagcttttgctgaaacctacccacagctgatacc
aatgagcgaacaggaaccagctgaactgattgaagctgtcaccagcgcccttggtcatgtcgatatcctggtcagcaac
gatatcgcgcctgtggaatggcggccaatcgataaatacgctgtcgaggattacagggatactgtcgaagctctgcaga
tcaagccatttgctctagtgaatgctgtcgcttcgcaaatgaaggatcgaaagtcggggcacatcatcttcatcacttc
ggctgccccgttcgggccatggaaggagctatcgacttactcttcggctcgagctgggaccagtgcactagctaatgct
ctatcgaaggagctaggagagtacaatatcccggtgttcgctatcgctccgaattttctagactcgggggattcgccgt
actattacccctctgagccgtggaagacttctccggagcacgtggctcacgtgcgtaaggtgactgctctacaacgact
agggactcaaaaagagttgggggaattggtgacgtttttggcatctggctcttgtgattatttgactggccaggtgttt
tggttggcaggcggctttcccgttgtagagcgttggcccggcatgcccgaataatga

14-1 Oligos:
attgtcaccaacgtcaagcattttggaggtatg (L10K)
gaaagctttaagcatctggatgaactagaagct (Q38L)
ctgattgaagctgtcaatagcgcccttggtcat (T67N)
gtcgatatcctggtcaacaacgatatcgcgcct (S78N)
gtcgaggattacagggrcaygrtcgaagctctgcagatc (D99G, T100M, V101I)
aagccatttgctctagyaaatgccgtcgcttcgcaaatg (V112A, A114Q)
aagccatttgctctagyaaatcaggtcgcttcgcaaatg (V112A, A114Q)
gcttcgcaaatgaagaaacgaaagtcggggcac (D121K)
gccccgttcgggccagataaggagctatcgact (W139D)
tcggctcgagctggggcgagtgcactagctaat (T152A)
ttcgctatcgctccgcgttwtctagactcgkgggattcgccgtactat (N176R, F177YA, G181W)
ttcgctatcgctccgcgtgccctagactcgkgggattcgccgtactat (N176R, F177YA, G181W)
ttcgctatcgctccgaactwtctagactcgkgggattcgccgtactat (N176R, F177YA, G181W)
ttcgctatcgctccgaacgccctagactcgkgggattcgccgtactat (N176R, F177YA, G181W)
gctcacgtgcgtaagtacactgctctacaacga (V205Y)
actggccaggtgtttygtttggcaggcggcttt (W238CR)

14-2 Oligos:
tttaagcatcaggatgtgctagaagcttttgct (E40V)
acctacccacagctgaytccaatkagcgaacaggaacca (I52T, M54I)
agcgaacaggaaccagttgaactgattgaagct (A60V)
gcgcctgtggaatggcaaccaatcgataaatac (R87Q)
atcgataaatacgctgtcggcgattacagggat (E95G)
gattacagggatgccgtcgaagctctgcagatc (T100A)
gcttcgcaaatgaaggaacgaaagtcggggcac (D121RE)
gcttcgcaaatgaagcgccgaaagtcggggcac (D121RE)
atcacttcggctgccagcttcgggccatggaag (P135S)
tggaaggagctatcgasttackcttcggctcgagctggg (T144S, S146A)
tcggctcgagctggggccagtgcactagctaat (T152A)
gagcacgtggctcacctgcgtaaggtgactgct (V202L)
actggccaggtgtttactttggcaggcggcttt (W238T)
gcaggcggctttcccgcggtagagcgttggccc (V245A)
gtagagcgttggcccrgcrtgcccgaataa (G251SE, M252V)
gtagagcgttggcccgaartgcccgaataa (G251SE, M252V)