Interdisciplinary Center for Scientific Computing (IWR), University of Heidelberg,
Im Neuenheimer Feld 368, 69120 Heidelberg, Germany
Received: 20th April 2004 / Published: 1st October 2004
This paper deals with the identification of kinetic parameters in enzyme catalytic processes. Experience shows that the experiments performed do no deliver measurement data sufficient for the identification of parameters. New optimal experiments are needed. We suggest effective algorithms and software for parameter estimation and design of optimal experiments, based on multiple shooting and specially tailored, structure-exploiting reduced Gauss-Newton and SQP methods. The methods are applied to optimal experimental design and parameter estimation in enzyme catalytic processes.
Parameter estimation problems in dynamic processes
The problem of identification of unknown parameters in dynamic models is among the most important tasks in mathematical modelling of dynamic processes. It can be described as follows. Let the dynamics of the model be described by a system of ordinary differential equations

where the right-hand side f depends on an vector of unknown parameters
, given control functions u :
and given control variables
. It is assumed, that at the given times tj, j = 1,..., Μ, measurements ηij, i = 1,..., Mj, j = 1,..., M, of the observation functions bij are available



As an objective functional in the optimization problem, typically a norm of the measurement errors is used. The type of the norm is motivated by the statistical distribution of the measurement errors. If the errors are independent, normally distributed with zero mean and known variances
, minimizing a weighted least squares function

yields a maximum likelihood estimate.
Summing up, mathematically the problems of parameter estimation can be written as follows:
Minimize the deviation of model response to measurement values such that the dynamics and the inital conditions of the dynamic process are fulfilled and possible further constraints are satisfied:

Model for enzyme catalytic processes
We consider parameter estimation for the catalytic reaction of enzymes [3]. The native (N), unfolded (U), and deactivated (D) enzymes are involved in the reaction which takes place in a reactor with a continuous inflow and a corresponding outflow. The reaction scheme is the following

where kNU, kUN, kUD, kND denote rates of the corresponding reactions. Additionally a substrate S is involved, which is contained in the inflow and is degraded depending on the concentration of native enzymes.
Our mathematical model of this reaction scheme is based on the following assumptions:

where CE0 denotes an initial amount of active enzymes.

with some constant KU.

where CS is the concentration of the substrate, km is the Michaelis-Menten constant, rmax is the maximal rate of the biochemical reaction, kr is a proportionality coefficient depending on the temperature T.

where R is the gas constant.
In particular:

where

where V ist the volume of the reactor,
is the rate of the substrate inflow, C0S is the inflow concentration of the substrate.
Taking into account these assumptions one can derive the mathematical model for the enzyme catalytic reaction

with

Summing up, the catalytic reaction is modelled by a system of two differential equations (3) where

The process may be controlled by the temperature, this means that the control function u is a temperature profile T.
For numerical stability and better scaling we reformulate the Arrhenius kinetic terms (2) as follows:

where T1 and T0 are some temperature values to be chosen by the use, e.g. maximal and minimal temperature used in the experiments. Physically the coefficients k1new and k2new describe the rate of the corresponding reaction at T1 and T0 respectively.
We use this transformation for parameters describing reactions
and
:


For the reader's convenience, we give the explicit relation between "old" and "new" parameters:
![]() |
![]() |
Half-life
The important stability feature of enzymes is characterized by half-life [2].
Half-life (HL) is a time required to reduce the amount of a native enzyme to a half of the initial amount at a constant temperature.
Mathematically half-life is given by

Measurement function
The observation function - velocity of consumption of base necessary to neutralize the acidic reaction product - is given by the dosage of base

where B0 is the given concentration of base. Note that in the model under consideration we measure one quantity at a time.
Multiple shooting
The scheme of multiple shooting consists of the following. First one chooses a suitable grid of multiple shooting nodes τj

covering the interval where measurements are given.

Figure 1. Multiple shooting approach.
At each grid point the values of the state variables sj are chosen as additional unknowns and m ODE initial value problems

are solved on each subinterval Ij:=[τj, τj+1] to yield a solution y(t; sj, p, q, u) for
. Solutions of dynamic systems, generated by this procedure, are usually not continuous at τj. This has to be enforced by additional matching conditions. Inserting the computed values y(t, sj, p, q, u),
into problem (1) one obtains a constrained optimization problem in the variables (s, p) := (s0,...,sm, p)

Multiple shooting possesses several advantages:
1. It is possible to include a priori information about the state variables, e.g. from the measurements or from expert knowledge, by a proper choice of initial guesses for the additional variables sj. Thus, it can be ensured that the initial solutions y(t;sj, p, u) remain close to the observed data. It can be shown that this damps the influence of poor parameter guesses.
2. The adequate choice of initial guesses for the state variables (and the application of a Gauss-Newton method for the solution of the constrained least squares problem) typically avoids convergence to local minima with large residuals.
3. The scheme is numerically stable. The splitting of the integration interval limits error propagation and makes it possible to solve parameter estimation problems even for unstable or chaotic systems.
4. The BVP discretization induces a very specific structure in the problem equations, which can be exploited in particular for parallelization.
Gauss-Newton method
Parametrization of the dynamics yields a finite dimensional, possibly large-scale, nonlinear constrained least squares problem which can be formally written as

Note, that the equalities F2(χ) = 0 include the matching conditions induced by multiple shooting. We assume that the functions Fi :
, i = 1, 2, are twice continuously differentiable. The number of variables in problem (5) is equal to number of differential equations multiplied by the number of multiple shooting nodes plus the number of parameters. To solve problem (5) we use a generalized Gauss-Newton method according to which a new iterate is (basically) generated by

where the increment Δχ is the solution of the following linear constraint l2 problem at χ = χk

Here, J1(χ) and J2(χ) denote the Jacobians of F1(χ) and F2(χ) respectively.

then a linearized problem (7) has a unique solution Δχk and a unique Lagrange vector λk satisfying the following optimality conditions

Using (8) one can easily show that Δχk can be formally written with the help of a solution operator J+

The solution operator J+ is a generalized inverse, that is it satisfies J+J J+ = I, and is explicitly given by

Choosing the step lenght tk by means of classical line search methods based on the exact penalty function

with sufficiently large weights αi < 0, i = 1,..., m2, ensures global convergence. However, it is well known that already in mildly ill-conditioned problems such a step-size strategy may be very inefficient since it may produce very small step-sizes. Therefore we use the "restrictive monotonicity test", see [5,6], that has proved to be very effective in practical applications.
Note, that the generalized Gauss-Newton method (6) - (7) has several advantages. First it does not use second order derivative information and the local linearized problems are linear constrained least squares problems. Under certain regularity assumptions at the solution, the method shows a good linear rate of local convergence. There are problems, however, for which the Gauss-Newton method may have a rather slow local convergence rate or may even fail. The reason is that the linearized model (7), which forms the basis of the Gauss-Newton method, is an inadequate representation of such nonlinear problems, since the second-order information cannot be ignored. These problems are called problems with large residuals.
Using SQP-type methods for the nonlinear constrained l2 parameter estimation problem one could force convergence to a solution even in such a case. However, such solutions are undesirable in a certain sense. We can show that a solution of this type, even if it is a strict minimum of the nonlinear constrained l2 problem (5), cannot be expected to be a continuous deformation of the "true" parameter values under perturbations caused by the measurement errors. Thus, slow local convergence of the full-step
Gauss-Newton indicates deficiencies in the model or lack of data and can be considered as an advantage of the method. For a detailed analysis see [5].
Solving the linear l2 problem
At each iteration of a Gauss-Newton method a linear least squares problem (7) has to be solved.
First, we make use of the special block structure of the Jacobian J(χ) which is induced by multiple shooting:

Every block column corresponds to the derivatives with respect to the discretization variables and parameters in one subinterval. The block rows with G-matrices are the derivatives of the continuity conditions

The block rows with D-matrices correspond to the derivatives of the functions Fi of the cost functional and the constraints of the nonlinear problem (5) excluding the continuity conditions. We use a fast, stable and efficient structure exploiting decomposition (see [4, 5]) to reduce the large linear least squares problem to a linear least squares problem with smaller dimension.
The number of variables in the resulting problem is equal to the number of parameters plus the number of differential equations. This so-called condensed problem may be solved using the methods described in [7].
Statistical sensitivity analysis for the estimates
An important question in parameter estimation is how good the computed estimates are. The answer is provided by sensitivity analysis. If the experimental data is normally distributed then the estimated solution χ* of the parameter estimation problem is also a random variable which is normally distributed in the first order χ* ~ N(χtrue, C) with the (unknown) true value χtrue as expected value and the variance-covariance matrix C given by

The variance-covariance matrix describes the confidence ellipsoid which is an approximation of the nonlinear confidence region of the estimated variables.
The 100(1- a)% linearized confidence ellipsoid
can be described by (see [5])

Here, the probability factor
where
is the quantile of the χ2 distribution. The linearized confidence ellipsoid GL(α,χtrue,q,u) is contained exactly in a box determined by the confidence intervals

where
. Here, Cii denotes the diagonal elements of the covariance matrix C. The values
are known as standard deviations of the variables χi.
Results of parameter estimation for Candida antarctica lipase on ionic resin ("Novozym")
Applying the methods of parameter estimation described earlier, "Novozym" shows that the data derived from the initial experiment does not deliver enough information to identify all parameters, see Table 1. The settings of the initial experiment are defined by the temperature shown in Fig. 2. The parameter estimation results show that the information received in the experiment is not at all enough to estimate parameters reliably. Thus we need to design additional experiments in order to provide sufficiently good data. The theoretical justification and methods for design of optimal experiments are briefly described in the next section.
Table 1. Estimated values of parameters ± standard deviation after parameter estimation.
Initial temperature profile
p1
27.86 ± 4.42
p2
48.98 ± 10.92
p3
1.73 ± 2.39 x 105
p4
634.20 ± 806.00 x 106
p5
-1.43 ± 1.50 x 107
p6
-7.50 ± 4.16 x 107
p7
-4.15 ± 0.091
p8
-8.63 ± 2.00

Figure 2. Initial temperature profile
In this section we consider the problem of nonlinear optimum experimental design and discuss a solution approach to this problem.
Experimental design optimization problem
Since the experimental data is randomly distributed, the estimated parameters are also random variables. Data evaluated under different experimental conditions leads to estimations of parameters with good or poor confidence regions depending on the experimental conditions. We would like to find those experiments that result in the best statistical quality for the estimated parameters and at the same time satisfy additional constraints e.g. experimental costs, safety, feasibility of experiments, validity of the model etc.
The aim of optimum experimental design is to construct Nex experiments by choosing appropriate experimental variables
q = (q1,...,qNex), and experimental controls u = (u1,..., uNex) in order to maximize the statistical reliability of the unknown variables under estimation. For this purpose we optimize a design criterion depending on the variance-covariance matrix C (9).
The optimum experimental design approach leads to an optimal control problem in ODE systems of the following form

The constraint (11) describes control and path constraints for each experiment of Nex experiments. Free variables of the optimization problem are the control profiles ui(t) (e.g. temperature profiles of cooling/heating) and the time-independent control variables qi (e.g. initial concentrations, properties of the experimental device) for all experiments.
Since the "size" of a confidence region is described by variance-covariance matrix C any suitable function of matrix C has to be taken as a cost functional. Typical choices are

Numerical methods
The experimental design optimization problem is a nonlinear constrained optimal control problem. The main difficulty lies in the non-standard objective function which is nonseparable and implicitly defined on the sensitivities of the underlying parameter estimation problem, i.e. on the derivatives of the solution of the ODE system with respect to the parameters and initial values.
The numerical methods are based on the direct approach, according to which the control functions are parametrized on an appropriate grid by support functions, the solution of the ODE systems and the state constraints are discretized. As a result we obtain a finite-dimensional constrained nonlinear optimization problem which is solved by an SQP method. The main effort for the solution of the optimization problem by the SQP method is spent on the calculation of the values of the objective function and the constraints as well as its gradients.
Efficient methods for derivative computations combining internal numerical differentiation [5] of the integration method and automatic differentiation of the model functions [8] have been developed, see [9,10,11]. For more detailed discussion of the numerical methods for nonlinear optimum experimental design see [11,12].
Results of parameter estimation for Candida antarctica lipase on ionic resin ("Novozym") with data from additional experiments
Using the methods of optimum experimental design we have optimized 5 additional temperature profiles. The following set-up has been chosen for the experiments: the duration of each experiment is 20 h; measurements are taken every half-hour; the temperature profile is parametrized as follows

Additionally, there are the following restrictions on the design parameters Ti and slopei.

Results of parameter estimation for "Novozym'' with data from the initial and additional experiments are presented in Tables 2 and 3. Table 3 shows the estimated values of half-life over an interesting temperature range. The optimal temperature profiles are presented in Fig. 3. Figure 4 presents the fits before and after parameter estimation. The results of parameter estimation show that now we can estimate all half-lives with standard deviation below 30 %.
Table 2. Estimated values of parameters ± standard deviation after parameter estimation.
|
|
Initial profile |
5 designed + 1 initial profile |
|
p1 |
27.86 ± 4.42 |
27.69 ± 0.81 |
|
p2 |
48.98 ± 10.92 |
49.96 ± 1.97 |
|
p3 |
1.73 ± 2.39 x 105 |
0.548 ± 0.075 |
|
p4 |
634.20 ± 806.00 x 106 |
184.88 ± 26.10 |
|
p5 |
-1.43 ± 1.50 x 107 |
-4.32 ± 0.26 |
|
p6 |
-7.50 ± 4.16 x 107 |
-6.20 ± 2.20 |
|
p7 |
-4.15 ± 0.091 |
-8.88 ± 1.59 |
|
p8 |
-8.63 ± 2.00 |
-11.34 ± 7.54 |
Table 3. Estimated values of half-life ± standard deviation after parameter estimation.
|
Temperature [°C] |
5 optimal + 1 initial profiles |
|
48.0 |
1912.01 ± 574.2 |
|
50.0 |
1239.43 ± 266.8 |
|
52.0 |
800.68 ± 121.8 |
|
54.0 |
520.17 ± 57.42 |
|
56.0 |
342.81 ± 30.2 |
|
58.0 |
231.09 ± 17.4 |
|
60.0 |
160.53 ± 10.4 |

Figure 3. Optimal additional temperature profiles.
Figure 4A. Model (straigth line) vs. measurements (dashed line) before parameter estimation.

Figure 4B. Model (straigth line) vs. measurements (dashed line) after parameter estimation.
As we can see the experimental design optimization problem (10), (11), (12) is formulated for the assumed parameter values which are, however, only known to lie in a possibly large confidence region. In this section we discuss how to construct robust experiments, that is experiments that are less sensitive to parameter uncertainty. We assume that the parameters are lying in the following ellipsoid around the nominal values of parameters p0

Here Σ is a positive definite matrix. To get a robust experimental design we formulate a worst-case problem, minimizing over the design variables ξ, the maximal value of φ(C) over the ellipsoid E

In this problem, we first consider constraints which do not depend on the parameters. The remaining (control) constraints are summarized by
. The optimization problem (13) is a semi-infinite programming problem. The solution methods for such problems require the determination of global optima of nonlinear subproblems [13] which may be computationally too expensive. In order to compute robust designs we suggest a modified approach. We apply Taylor expansion w.r.t. p to the min-max objective function:

The inner problem is the maximization of a linear function subject to a convex quadratic constraint which can be solved explicitly:

This leads to the robust experimental design optimization problem

The second term in the cost function can be interpreted as a penalty for uncertainty in the parameters.
Parameter dependent constraints are treated in an analogous way. We substitute the nonrobust constraints

with the following constraints for the robust experimental design optimization problem.

Numerical results
In this section we present numerical results on comparing robust and nonrobust designs for enzyme reaction kinetics. To illustrate the performance of the method, experiments for identification of 4 out of 8 parameters were designed in the sequential mode. The idea of the sequential approach is schematized in Fig. 5.

Figure 5. Sequential approach to experiment design.
For the "true" values of the parameter p1true = 27.77, p2true = 50.15, p3true = 0.55, p4true = 185.25 the normally distributed data were simulated using initial temperature profile as shown in Fig. 2.
The results of parameter estimation based on these data
p1 = 48.61 ± 20.59
p2 = 103.58 ± 52.77
p3 = 0.58 ± 0.10
p4 = 189.69 ± 32.50
were used for the computation of a nonrobust and a robust designs. The diagonal matrix with diagonal elements of the covariance matrix corresponding to the solution of parameter estimation problem was chosen as the matrix Σ in the robust design. Using the new optimized temperature profiles the data were simulated, parameter estimation was repeated with the new data and further new designs were constructed using the new estimates for the parameters. This procedure was repeated until we obtained trustworthy parameter estimates, see Table 4.
Table 4. Results of parameter estimation for initial design and 5 non-robust experiments
(second column) and for initial desgin and 2 robust experiments (third column).
|
"true" values |
|
pi ± |
|
pi ± |
|
p1 = 27.77 |
|
27.44 ± 0.743 |
|
27.63 ± 1.266 |
|
p2 = 50.15 |
|
49,33 ± 1.980 |
|
49.71 ± 3.330 |
|
p3 = 0.55 |
|
0.620 ± 0.068 |
|
0.512 ± 0.072 |
|
p4 = 185.25 |
|
209.85 ± 24.35 |
|
172.26 ± 24.41 |
The sequential design of robust experiments after 2 designs provide us with reliable parameter estimates, while we need 5 nonrobust experiments to get similar quality of the estimates. Although a computational time for computing a robust experiment is significantly greater than the time to compute a nonrobust experiment, clearly application of the robust sequential design makes it possible to reduce the number of real experiments and thus to reduce drastically experimental costs and the time necessary to identify the parameters.
Application of advanced methods for parameter estimation in dynamic processes governed by ODE and for design of robust optimal experiments allows reliable estimation of enzyme operating stability and the reduction of the number of experiments as well as experimental costs.
We thank the Deutsche Forschungsgemeinschaft (DFG) for financial support through SFB 359. The part of the research was made in cooperation with Degussa AG.
[1] Gupta, M.N. (1993) Thermostability of Enzymes. Springer, New York.
[2] Boy, M., Dominik, A., Vos, H. (1998) A method for fast determinationof biocatalyst process stability. Chem. Engng Technol. 21: 570-575.
[3] Bommarius, A., Estler, M., Kluge, A., Werner, H., Vollmer, H., Bock, H.G., Schlöder, J.P., Kostina, E. (2001) Method to determine the process stability of enzymes. Patent Application EP 1 067 198 A1, European Patent Office, Patentblatt 2.
[4] Bock, H.G. (1981) Numerical treatment of inverse problems in chemical reaction kinetics. In: Modelling of Chemical Reaction Systems.(Ebert, K.H. Deuflhard, P., Jäger, W. Eds) Springer Series in Chemical Physics 18, Heidelberg.
[5] Bock, H.G. (1987) Randwertproblemmethoden zur Parameteridentifizierung in Systemen nichtlinearer Differentialgleichungen. Bonner Mathematische Schriften 183: 264.
[6] Bock, H.G., Kostina, E., Schlöder, J.P. (2000) On the role of natural level functions to achieve global convergence for damped Newton methods. In: System Modelling and Optimization: Methods, Theory and Applications. (Powell, M.J.D., Scholtes, S. Eds) pp. 51-74. Kluwer, Boston.
[7] Stoer, J. (1971) On the numerical solution of constrained least squares problems. Siam J. Numer. Anal. 8(2): 382-411.
[8] Griewank, A. (2000) Evaluation Derivatives. Principles and Techniques of Algorithmic Differentiation. Frontiers in Applied Mathematics. SIAM
[9] Bauer, I., Bock, H.G., Körkel, S., Schlöder , J.P.(1999) Numerical methods for initial value problems and derivative generation for DAE models with application to optimum experimental design of chemical processes. In: Scientific Computing in Chemical Engineering II. (Keil, F., Mackens, W., Voss, H., Werther , J.Eds) 2, pp. 282-289, Springer, Berlin.
[10] Bauer, I., Bock, H.G. Körkel, S., Schlöder, J.P. (2000) Numerical methods for optimum experimental design in DAE systems. J. Comput. Appl. Math. 120:1-25.
[11] Körkel, S.(2002) Numerische Methoden für Optimale Versuchsplanungsprobleme bei nichtlinearen DAE-Modellen. PhD thesis, Universität Heidelberg.
[12] Körkel, S., Bauer,I., Bock, H.G., Schlöder, J.P. (1999) A sequential approach for nonlinear optimum experimental design in DAE systems. In: Scientific Computing in Chemical Engineering II. (Keil, F., Mackens,W., Voss, H., Werther, J.,Eds) 2, pp. 338-345, Springer, Berlin.
[13] R. Reemtsen, R., Rückmann, J.-J. (Eds.) (1998) Semi-Infinite Programming. Nonconvex Optimization and its Applications. Kluwer, Boston.