Let me make your life easier before I go into all the mumbo jumbo. Go to Excel or any other spreadsheet program and do this:
1. Plug the observed X values (the heights) into the formula:
0.1827(height) + 12.4932
Whatever value you get is the Predicted Y.
2. Place those values in a separate column.
3. The difference between the Actual Y (your Head Circumference values) and that column (the Predicted Head Circumference) is the residual.
4. In a separate column, square the residual.
5. Finally, add up the squared residuals. This is your SUM OF SQUARED ERRORS. The linear regression minimizes this value, the sum of squared errors, hence why it is called least squares regression.
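If you prefer code to spreadsheet columns, here is a minimal Python sketch of those five steps; the height and head circumference lists are just hypothetical placeholders you would swap for your 11 observations:

# Coefficients from the fitted model quoted above
slope, intercept = 0.1827, 12.4932

# Hypothetical placeholder data: replace with your 11 observed values
heights = [160.0, 165.0, 170.0, 158.0]
head_circumferences = [41.5, 42.3, 43.6, 41.0]

predicted = [intercept + slope * x for x in heights]                 # steps 1-2: Predicted Y
residuals = [y - p for y, p in zip(head_circumferences, predicted)]  # step 3: Actual - Predicted
squared_residuals = [e ** 2 for e in residuals]                      # step 4
sse = sum(squared_residuals)                                         # step 5: sum of squared errors
print("SSE =", sse)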
You asked about m = r (sy/sx)
r here is the CORRELATION BETWEEN X AND Y!
sy and sx are the STANDARD DEVIATIONS of Y and X, respectively.
The Standard Deviation is simply:
Square Root of [ (SUM of (Actual Value - Mean Value)^2) / (N - 1) ]
where N is the number of observations you have, I think 11.
So you sum up the squared deviations from the mean, divide by N-1, and then take the square root. That is the Standard Deviation. Do this for both Y and X.
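As a rough Python sketch of that formula (with made-up example values; Python's statistics.stdev already divides by N-1, so it should agree with the manual version):

import statistics

def sample_sd(values):
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(v - mean) ** 2 for v in values]
    return (sum(squared_deviations) / (n - 1)) ** 0.5    # divide by N-1, then take the square root

example = [41.5, 42.3, 43.6, 41.0]                       # hypothetical values
print(sample_sd(example), statistics.stdev(example))     # the two numbers should match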
Step by step:
1. Find the Mean Y and Mean X (simply add up all X's or Y's, and divide by the number you have, to get the arithmetic mean).
2. For each Y or X, find the difference between that value and the mean you just calculated.
3. In a separate column, input the squared difference (answer to step 2 squared).
4. Add up the entire column of squared differences and divide by N - 1, where N = the number of observations you have (11).
5. Take the square root of that. Do this for both the X and Y values. These are sx (the standard deviation of X) and sy (the standard deviation of Y).
6. Do a correlation procedure, which I think Minitab should have as part of its features.
7. Multiply the correlation by (sy/sx), using the standard deviations you calculated in step 5 (see the sketch after this list).
8. This is your Beta coefficient, which means how much Y is expected to change for each 1 unit change in X.
9. The Intercept tells you where the regression line crosses the Y axis (that is, when x=0). The Intercept is 12.4932.
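Here is one way steps 1 through 8 could look in Python, again with hypothetical placeholder data standing in for your 11 observations:

# Hypothetical placeholder data: replace with your 11 observations
x = [160.0, 165.0, 170.0, 158.0]        # heights
y = [41.5, 42.3, 43.6, 41.0]            # head circumferences
n = len(x)

mean_x, mean_y = sum(x) / n, sum(y) / n                                  # step 1
sx = (sum((xi - mean_x) ** 2 for xi in x) / (n - 1)) ** 0.5              # steps 2-5 for X
sy = (sum((yi - mean_y) ** 2 for yi in y) / (n - 1)) ** 0.5              # steps 2-5 for Y
r = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / ((n - 1) * sx * sy))                                              # step 6: correlation
beta = r * (sy / sx)                                                     # steps 7-8: Beta coefficient
print("beta =", beta)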
How do you manually calculate the intercept? The least squares line always passes through the point (Mean X, Mean Y), so you plug in the Mean X as your X, multiply it by the Beta Coefficient you just calculated, and set the whole thing equal to the Mean Y:
Mean Y = Intercept + Beta Coefficient*(Mean X)
Then simply rearrange:
Intercept = Y(mean) - Beta Coefficient*X(mean)
Voila, you have your regression formula!
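A quick sketch of that last step, using Python's statistics module (statistics.correlation needs Python 3.10 or newer) and hypothetical example data, just to show how the intercept falls out and how the finished formula reads:

import statistics

x = [160.0, 165.0, 170.0, 158.0]        # hypothetical heights
y = [41.5, 42.3, 43.6, 41.0]            # hypothetical head circumferences

beta = statistics.correlation(x, y) * statistics.stdev(y) / statistics.stdev(x)   # m = r*(sy/sx)
intercept = statistics.mean(y) - beta * statistics.mean(x)                        # Y(mean) - Beta*X(mean)
print(f"Predicted head circumference = {intercept:.4f} + {beta:.4f} * height")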
Keep in mind these procedures only apply to simple regression, where you have only one x. When you have more than one x, you cannot use the formula for the Beta coefficient from above.
You already have all the data for this: you know the Y and X mean values, and you just calculated the Beta Coefficient in step 7.
This is all you need!
More Mumbo Jumbo (Optional)
The simple regression model is a regression of head circumference (y) on height (x). What this really means is that we want to find out what effect, if any, height has on head circumference.
The "m" here is like the slope. As you may know, the slope defines the "rise over run". In regression models, the m=beta coefficient, and it states that for a 1 unit change in x, there will be a certain "beta" change in Y.
Do you have Excel? I am not too familiar with Minitab, but Excel does have a regression option. In some regressions we assume that the Y intercept is 0, but in this case we don't make that assumption. Simply place all values for x and y in columns, and do a regression of the y on x.
The regression line here will be the line that minimizes the squared deviations of the predicted y values from the observed y values (that is, it minimizes the squared errors, "sum of e^2"). The computer program will create its own predicted Y values: you don't have to do that part yourself. The entire purpose of the model is to get the predicted Y values as close as possible to the actual observed values, hence "least squares regression". Again, the computer does this part for you (it uses some tricky matrix algebra which most people would rack their brains doing by hand).
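For the curious, here is roughly what the computer is doing behind the scenes; this numpy sketch (with made-up example numbers) solves the least squares problem in one call and also reports the minimized sum of squared errors:

import numpy as np

heights = np.array([160.0, 165.0, 170.0, 158.0])        # hypothetical example data
circumferences = np.array([41.5, 42.3, 43.6, 41.0])

# Design matrix: a column of ones (for the intercept) next to the heights
X = np.column_stack([np.ones_like(heights), heights])
coeffs, sse, rank, _ = np.linalg.lstsq(X, circumferences, rcond=None)
intercept, beta = coeffs
print(intercept, beta, sse)    # sse is the minimized "sum of e^2"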
So, for example, we model Y on X and get:
Y-actual = 12.4932 + 0.1827x + e (in regressions we tend to place the Y intercept first)
Ok, so that is the Y-actual, what we really have. Notice that above I added an "e" term, which is the error. That is because the regression almost never predicts the Y value exactly as observed. Does this make sense? If the line from the regression passed through each and every observed y value (the values you actually measure), there would be no error.
error=0.
Because:
[Y-observed = Intercept + Beta*X + error]
- [Y-predicted = Intercept + Beta*X]
= error
But in reality the errors are not all zero. Your residual is just the difference between the Observed Y and the Predicted Y. How do you get the residual? Very EASY. Just plug the X's into the model and get a Predicted Y. The difference between Actual and Predicted is the residual (it's the error for each observation).