Nonlinear regression is an extended linear regression technique in which a nonlinear mathematical model is used to describe the relationship between the response variable and the predictor variables (Bates and Watts 1988). A nonlinear regression model is a model that contains at least one of the parameters in a nonlinear form. An example of a nonlinear model is Y=αXβ+e.
The general form of a nonlinear regression model is Y=η(x,β)+e, where β is a p-component vector of unknown parameters and e is a Normal (0,σ2) error term; that is, like simple linear regression, the residual errors are assumed to have a normal distribution with mean 0 and variance σ2.
Some nonlinear models have linear forms, termed intrinsically linear, which use a transformation of the original data. However, if a nonlinear model is assumed to be appropriate for the data, the nonlinear model should be used for the analysis instead of the linearized form since the normality assumption for the error term will not be appropriate for the transformed data.
The process of fitting nonlinear regression in R is similar to that for fitting linear models except that there is no explicit formula for estimation, so iterative procedures are needed that may also require the user to supply initial estimates of parameters. For more information on nonlinear regression readers can refer to Ratkowsky (1989) and Bates and Watts (1988).
Performing nonlinear regression in R
The command used for nonlinear regression in R is as follows. Please note, this R-code is a summary of how the function works and and will not execute if entered in an R command line!
nls(
formula,
data = parent.frame(),
start, control = nls.control(),
algorithm = "default",
trace = FALSE,
subset,
weights,
na.action,
model = FALSE
)
For full documentation of the nls
function in R, type ?nls
or help(nls)
.
Logistic growth of plant disease
## Input the data that include the variables time,
# plant ID, and severity
time <- c(seq(0,10),seq(0,10),seq(0,10))
plant <- c(rep(1,11),rep(2,11),rep(3,11))
## Severity represents the number of
## lesions on the leaf surface, standardized
## as a proportion of the maximum
severity <- c(
42,51,59,64,76,93,106,125,149,171,199,
40,49,58,72,84,103,122,138,162,187,209,
41,49,57,71,89,112,146,174,218,250,288)/288
data1 <- data.frame(
cbind(
time,
plant,
severity
)
)
## Plot severity versus time
## to see the relationship between
## the two variables for each plant
plot(
data1$time,
data1$severity,
xlab="Time",
ylab="Severity",
type="n"
)
text(
data1$time,
data1$severity,
data1$plant
)
title(main="Graph of severity vs time")
Output
Click on the image for larger version.
Here is an example of obtaining the initial value for the parameters using getInitial
and SSlogis
. SSlogis uses a different parameterization for the logistic model than we have used here, so as a first step we use the parameterization from SSlogis and then convert to our parameterization. (Use help(SSlogis) for more information.)
getInitial(
severity ~ SSlogis(time, alpha, xmid, scale),
data = data1
)
Output
alpha xmid scale
2.212 12.507 4.572
Now fit a logistic curve to the data, converting to our parameterization.
## Using the initial parameters above,
## fit the data with a logistic curve.
para0.st <- c(
alpha=2.212,
beta=12.507/4.572, # beta in our model is xmid/scale
gama=1/4.572 # gamma (or r) is 1/scale
)
fit0 <- nls(
severity~alpha/(1+exp(beta-gamma*time)),
data1,
start=para0.st,
trace=T
)
Output
0.1621433 : 2.2120000 2.7355643 0.2187227
0.1621427 : 2.2124095 2.7352979 0.2187056
Now check the fit of the data with the logistic curve graphed on a scatter plot.
## Plot to see how the model fits the data; plot the
## logistic curve on a scatter plot
plot(
data1$time,
data1$severity,
type="n"
)
text(
data1$time,
data1$severity,
data1$plant
)
title(main="Graph of severity vs time")
curve(
2.21/(1+exp(2.74-0.22*x)),
from=time[1],
to=time[11],
add=TRUE
)
Output
Click on the image for larger version.
From this plot of the logistic curve, it looks like the logistic curve fits the data nicely. There are many nonlinear models that can be used to analyze disease progress data. Some familiar models are discussed in the next section. So, after a scatter plot of the data over time has been constructed and a particular nonlinear model is determined to fit the data, you can use a similar procedure as in this example of fitting the logistic model to a dataset.
Next, modeling disease progress with growth curves