SUPPLEMENTARY INFORMATION

Automated Parameterization of Predictive Kinetic Metabolic Models from Sparse Datasets for Efficient Optimization of Many-Enzyme Heterologous Pathways

Sean M. Halper1, Iman Farasat1,3, and Howard M. Salis1,2,†

1 Department of Chemical Engineering and 2 Department of Biological Engineering, Pennsylvania State University, University Park, PA 16802
† Corresponding author: [email protected]
3 Current address: Merck Pharmaceuticals, Raritan, NJ

Supplementary Figures

Figure S1: The measured enzyme expression levels from 23 characterized variants of a 9-enzyme limonene biosynthesis pathway. Enzymes IDI and GGPS were not varied across these 23 variants and are therefore excluded here. Enzyme levels were measured directly using proteomics, as described in Alonso-Gutierrez et al. (2015).

Figure S2: The measured limonene productivities from 23 characterized variants of a 9-enzyme limonene biosynthesis pathway. Limonene titers were measured using GC/MS and calibration curves, as described in Alonso-Gutierrez et al. (2015). Limonene productivities were determined by dividing the final titer by the fermentation time.

Figure S3: The entire Pathway Map for the limonene biosynthesis pathway, including all 21 two-dimensional slices of the 7-dimensional enzyme expression space. Enzymes IDI and GGPS were not varied in the training dataset, and therefore the Pathway Map does not attempt to predict their optimal expression levels.

Figure S4: The first half of the FCC Map for the limonene biosynthesis pathway, including FCCs for selected enzymes across 21 two-dimensional slices of the 7-dimensional enzyme expression space. Enzymes IDI and GGPS were not varied in the training dataset, and therefore the FCC Map does not attempt to calculate the FCCs for these enzymes.
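The count of 21 two-dimensional slices quoted in the captions for Figures S3 and S4 follows from choosing 2 of the 7 varied enzymes. A minimal sketch of that enumeration is given below; the labels E1 to E7 are hypothetical placeholders and are not the enzyme names used in the pathway.

```python
from itertools import combinations

# Hypothetical placeholder labels for the 7 enzymes whose expression
# levels were varied (IDI and GGPS are excluded, as in the captions).
varied_enzymes = [f"E{i}" for i in range(1, 8)]

# Each two-dimensional slice corresponds to one unordered pair of enzymes.
slices = list(combinations(varied_enzymes, 2))

print(len(slices))  # number of two-dimensional slices: C(7, 2)
for first, second in slices[:3]:
    print(f"slice: {first} vs {second}")
```

Each pair corresponds to one panel in which two expression levels are varied, so 7 varied enzymes yield 7 × 6 / 2 = 21 panels.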
Figure S5: The second half of the FCC Map for the limonene biosynthesis pathway, including FCCs for selected enzymes across 21 two-dimensional slices of the 7-dimensional enzyme expression space. Enzymes IDI and GGPS were not varied in the training dataset, and therefore the FCC Map does not attempt to calculate the FCCs for these enzymes.

Figure S6: Testing the Pathway Map’s accuracy on a test set of 25 in silico pathway variants with enzyme expression levels that fell within the training set’s expression level space. These tests determine whether the Pathway Map can correctly predict intermediate productivities by interpolating the pathway’s expression-productivity relationship. Pathway examples are the same as in Figure 3. Error bars indicate the standard deviation of three in silico productivity simulations, including 10% simulated measurement noise.

Figure S7: Testing the Pathway Map’s accuracy on a test set of 25 in silico pathway variants with enzyme expression levels that fell outside the training set’s expression level space. These tests determine whether the Pathway Map can correctly predict the enzyme expression levels needed for maximal productivities by extrapolating the pathway’s expression-productivity relationship. Pathway examples are the same as in Figure 3. Error bars indicate the standard deviation of three in silico productivity simulations, including 10% simulated measurement noise.

Figure S8: The resulting Pathway Maps of a 9-enzyme linear pathway when parameterized using an increasing number of characterized pathway variants. The rows indicate the number of characterized pathway variants used to parameterize the Pathway Map. The columns show two-dimensional plots in which the pathway’s productivity is predicted while varying the expression levels of selected pairs of enzymes (enzyme labels correspond to the Y and X axes, respectively). The Pathway Maps shown are the best ones (lowest fitting error) from three independent runs of the Pathway Map Calculator.
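The best-of-three selection described for Figure S8 can be sketched as follows. This is only an illustration under stated assumptions: each run is assumed to report a single scalar fitting error, and the record layout and the numeric values are hypothetical, not output from the Pathway Map Calculator.

```python
def select_best_run(runs):
    """Return the run record with the lowest fitting error."""
    if not runs:
        raise ValueError("at least one run is required")
    return min(runs, key=lambda run: run["fitting_error"])


# Hypothetical records for three independent parameterization runs;
# the fitting errors are made-up illustrative numbers.
runs = [
    {"run": 1, "fitting_error": 0.42},
    {"run": 2, "fitting_error": 0.17},
    {"run": 3, "fitting_error": 0.29},
]

best = select_best_run(runs)
print(f"best run: {best['run']}")
```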