Regression Equations
The regression equations are the algebraic expressions of the regression lines. Since there are two regression lines, there are two regression equations. The regression equation of X on Y describes the variation in the values of X for given changes in Y and is used for estimating the value of X for a given value of Y. Similarly, the regression equation of Y on X describes the variation in the values of Y for given changes in X and is used for estimating the value of Y for a given value of X.
Regression Equation of Y on X
The regression equation of Y on X is expressed as follows:
Y = a + bX
It may be noted that in this equation Y is the dependent variable, i.e., its value depends on X. X is the independent variable, i.e., we can take a given value of X and compute the value of Y.
a is the Y-intercept because its value is the point at which the regression line crosses the Y-axis, that is, the vertical axis. b is the slope of the line. It represents the change in the Y variable for a unit change in the X variable.
a and b in the equation are called numerical constants because for any given straight line their values do not change.
If the values of the constants a and b are obtained, the line is completely determined. But the question is how to obtain these values. The answer is provided by the method of least squares, which states that the line should be drawn through the plotted points in such a manner that the sum of the squares of the deviations of the actual Y values from the computed Y values is the least. In other words, in order to obtain the line which fits the points best, ∑(Y - Yc)² should be a minimum, where Yc denotes the value of Y computed from the line. Such a line is known as the line of best fit.
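As a rough illustration of the least squares criterion, the sketch below computes ∑(Y - Yc)² for a few candidate lines. The data set and the function name sum_squared_deviations are hypothetical and chosen only for this example; the first candidate (a = 2.2, b = 0.6) happens to be the least squares line for this particular data, so its total should come out smallest.

```python
def sum_squared_deviations(a, b, xs, ys):
    """Return the sum of (Y - Yc)^2, where Yc = a + b*X is the computed value."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical observations, used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Candidate lines as (a, b) pairs; the first is the least squares line for this data.
candidates = [(2.2, 0.6), (1.0, 1.0), (4.0, 0.0)]

for a, b in candidates:
    total = sum_squared_deviations(a, b, xs, ys)
    print(f"a={a}, b={b}: sum of squared deviations = {total:.2f}")
```

Any other straight line tried on the same data should give a larger total than the first candidate.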
A straight line fitted by least squares has the following characteristics:
a. It gives the best fit to the data in the sense that it makes the sum of the squared deviations from the line, ∑(Y - Yc)², smaller than it would be from any other straight line. This property accounts for the name least squares.
b. The deviations above the line equal those below the line on the average. This means that the total of the positive and negative deviations is zero, or ∑(Y - Yc) = 0.
c. The straight line goes through the overall mean of the data (X̄, Ȳ).
d. When the data represent a sample from a large population, the least squares line is a best estimate of the population regression line.
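Properties (b) and (c) can be checked numerically. The following is a minimal sketch on a small hypothetical data set; the values a = 2.2 and b = 0.6 are taken as given here and are the least squares constants for this particular data. Comparisons use a small tolerance because of floating-point rounding.

```python
# Hypothetical observations, used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Least squares constants for this data set (taken as given in this sketch).
a, b = 2.2, 0.6

# Property (b): the deviations Y - Yc add up to zero.
deviations = [y - (a + b * x) for x, y in zip(xs, ys)]
total_deviation = sum(deviations)

# Property (c): the line passes through the point (mean of X, mean of Y).
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
line_at_mean_x = a + b * mean_x

print("deviations sum to zero:", abs(total_deviation) < 1e-9)
print("line passes through the means:", abs(line_at_mean_x - mean_y) < 1e-9)
```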
With a little algebra and differential calculus it can be shown that the following two equations, if solved simultaneously, will yield values of the parameters a and b such that the least squares requirement is fulfilled:
∑Y = Na + b ∑X
∑XY = a ∑X + b ∑X²
These equations are usually called the normal equations. In the equations, ∑X, ∑Y, ∑XY and ∑X² indicate totals which are computed from the observed pairs of values of the two variables X and Y to which the least squares estimating line is to be fitted, and N is the number of observed pairs of values.
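To show how the normal equations are used in practice, here is a minimal sketch that computes the required totals and solves the two equations for a and b. Eliminating a from the pair gives b = (N∑XY - ∑X∑Y) / (N∑X² - (∑X)²), after which a = (∑Y - b∑X) / N. The function name fit_line and the data are hypothetical and serve only as an illustration.

```python
def fit_line(xs, ys):
    """Solve the normal equations for a and b in Y = a + bX.

    sum(Y)  = N*a + b*sum(X)
    sum(XY) = a*sum(X) + b*sum(X^2)
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")

    # Totals computed from the observed pairs of values.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # The denominator is zero only when every X value is identical.
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all X values are identical; the slope is undefined")

    b = (n * sum_xy - sum_x * sum_y) / denominator
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical observations, used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
print(f"Y = {a:.2f} + {b:.2f}X")

# Estimate Y for a given value of X, here X = 6.
print(f"estimated Y at X = 6: {a + b * 6:.2f}")
```

For this data the totals are ∑X = 15, ∑Y = 20, ∑XY = 66 and ∑X² = 55 with N = 5, so the normal equations read 20 = 5a + 15b and 66 = 15a + 55b, which give b = 0.6 and a = 2.2.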