How to Draw Polynomial Linear Regression in R

In 1981, n = 78 bluegills were randomly sampled from Lake Mary in Minnesota. The researchers (Cook and Weisberg, 1999) measured and recorded the following data (Bluegills dataset):

Response \(\left(y \right) \colon\) length (in mm) of the fish
Potential predictor \(\left(x_1 \right) \colon \) age (in years) of the fish

The researchers were primarily interested in learning how the length of a bluegill fish is related to its age.

A scatter plot of the data:

[Figure: scatter plot of length versus age]

suggests that there is a positive trend in the data. That is, not surprisingly, as the age of bluegill fish increases, the length of the fish tends to increase. The trend, however, doesn't appear to be quite linear. It appears as if the relationship is slightly curved.

One way of modeling the curvature in these data is to formulate a "second-order polynomial model" with one quantitative predictor:

\(y_i=(\beta_0+\beta_1x_{i}+\beta_{11}x_{i}^2)+\epsilon_i\)

where:

\(y_i\) is length of bluegill (fish) \(i\) (in mm)
\(x_i\) is age of bluegill (fish) \(i\) (in years)

and the independent error terms \(\epsilon_i\) follow a normal distribution with mean 0 and equal variance \(\sigma^{2}\).
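Although the model is quadratic in \(x_i\), it is linear in the coefficients \(\beta_0\), \(\beta_1\), and \(\beta_{11}\), so ordinary least squares still applies. The following is a minimal Python sketch of that idea, not the procedure used to produce any output in this lesson. The function name `fit_quadratic` and the variables `toy_age` and `toy_length` are hypothetical, and the toy values are made up to lie exactly on a known parabola; they are not the Bluegills data.

```python
def fit_quadratic(x, y):
    """Return least-squares estimates (b0, b1, b11) for y = b0 + b1*x + b11*x**2."""
    # Design-matrix columns: intercept, x, and x squared.
    cols = [[1.0] * len(x), [float(v) for v in x], [float(v) ** 2 for v in x]]
    # Augmented normal equations [X'X | X'y].
    m = [
        [sum(p * q for p, q in zip(ci, cj)) for cj in cols]
        + [sum(p * q for p, q in zip(ci, y))]
        for ci in cols
    ]
    # Gauss-Jordan elimination with partial pivoting on the 3x3 system.
    for k in range(3):
        pivot = max(range(k, 3), key=lambda r: abs(m[r][k]))
        m[k], m[pivot] = m[pivot], m[k]
        for r in range(3):
            if r != k:
                factor = m[r][k] / m[k][k]
                m[r] = [a - factor * b for a, b in zip(m[r], m[k])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


# Made-up toy data lying exactly on y = 10 + 50x - 5x^2 (NOT the bluegill data).
toy_age = [1, 2, 3, 4, 5, 6]
toy_length = [10 + 50 * a - 5 * a**2 for a in toy_age]

b0, b1, b11 = fit_quadratic(toy_age, toy_length)
print(b0, b1, b11)
```

Because the toy points have no noise, the estimates should come back as approximately 10, 50, and -5. With real data the same normal equations apply, but a statistics package is the more practical choice because it also reports standard errors and diagnostics.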

You may recall from your previous studies that "quadratic function" is another name for our formulated regression function. Nonetheless, you'll often hear statisticians referring to this quadratic model as a second-order model, because the highest power on the \(x_i\) term is 2.

Incidentally, observe the notation used. Because there is only one predictor variable to keep track of, the 1 in the subscript of \(x_{i1}\) has been dropped. That is, we use our original notation of just \(x_i\). Also note the double subscript used on the slope term, \(\beta_{11}\), of the quadratic term, as a way of denoting that it is associated with the squared term of the one and only predictor.

The estimated quadratic regression function looks like it does a pretty good job of fitting the data:

[Figure: estimated quadratic regression function]

To answer the following potential research questions, do the procedures identified in parentheses seem reasonable?

How is the length of a bluegill fish related to its age? (Describe the nature — "quadratic" — of the regression function.)
What is the length of a randomly selected five-year-old bluegill fish? (Calculate and interpret a prediction interval for the response.)

Among other things, the Minitab output:

Analysis of Variance

Source	DF	Adj SS	Adj MS	F-Value	P-Value
Regression	2	35938.0	17969.0	151.07	0.000
age	1	8252.5	8252.5	69.38	0.000
age^2	1	2972.1	2972.1	24.99	0.000
Error	75	8920.7	118.9
Lack-of-Fit	3	108.0	36.0	0.29	0.829
Pure Error	72	8812.7	122.4
Total	77	44858.7

Model Summary

S	R-sq	R-sq(adj)	R-sq(pred)
10.9061	80.11%	79.58%	78.72%

Coefficients

Term	Coef	SE Coef	T-Value	P-Value	VIF
Constant	13.6	11.0	1.24	0.220
age	54.05	6.49	8.33	0.000	23.44
age^2	-4.719	0.944	-5.00	0.000	23.44

Regression Equation

length = 13.6 + 54.05 age - 4.719 age^2

Predictions for length

Variable	Setting
age	5
age^2	25

Fit	SE Fit	95% CI	95% PI
165.902	2.76901	(160.386, 171.418)	(143.487, 188.318)

tells us that:

80.1% of the variation in the length of bluegill fish is explained by a quadratic function of the age of the fish.
We can be 95% confident that the length of a randomly selected five-year-old bluegill fish is between 143.5 and 188.3 mm.
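Several of the reported figures can be cross-checked by hand from the other numbers in the output. The sketch below, in Python, recomputes R-sq, the F-value, S, the fitted length at age 5, and the 95% prediction interval using \(\hat{y} \pm t \sqrt{MSE + SE(\hat{y})^2}\). All inputs are copied from the Minitab output; the variable names are hypothetical. As an assumption made to stay within the standard library, the t multiplier is backed out of the reported confidence interval rather than looked up from the t distribution with 75 degrees of freedom.

```python
import math

# Values copied from the Minitab output.
ss_regression, ss_error, ss_total = 35938.0, 8920.7, 44858.7
df_regression, df_error = 2, 75
fit, se_fit = 165.902, 2.76901
ci_low, ci_high = 160.386, 171.418

# Model summary quantities.
r_sq = ss_regression / ss_total                       # proportion of variation explained
ms_error = ss_error / df_error                        # mean squared error
f_value = (ss_regression / df_regression) / ms_error  # overall F statistic
s = math.sqrt(ms_error)                               # residual standard error

# Fitted length at age 5 from the rounded coefficients in the regression equation.
fit_from_equation = 13.6 + 54.05 * 5 - 4.719 * 5**2

# Assumption: recover the t multiplier from the reported CI half-width.
t_mult = (ci_high - ci_low) / 2 / se_fit

# Prediction interval: adds the error variance to the variance of the fit.
pi_half = t_mult * math.sqrt(ms_error + se_fit**2)
pi_low, pi_high = fit - pi_half, fit + pi_half

print(f"R-sq = {r_sq:.4f}, F = {f_value:.2f}, S = {s:.4f}")
print(f"fit from rounded equation = {fit_from_equation:.3f}")
print(f"95% PI = ({pi_low:.3f}, {pi_high:.3f})")
```

The fitted value computed from the printed equation differs slightly from the reported Fit of 165.902 because the displayed coefficients are rounded. The recomputed prediction interval should agree with the reported one to within rounding.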

Source: https://online.stat.psu.edu/stat501/lesson/9/9.8
