Name(s):
Math 203 B Linear Regression Activity
Tom Linton, Feb 9, 2000


  1. Select one member of your group (play paper-scissors-rock if needed) and use the ruler at the bottom of this page to make two measurements of that person's hand. First, measure the length of the index finger (X): on the palm side of the hand, measure from the tip of the finger to the "line" where the index finger joins the palm. Second, measure the width of all four fingers together (Y): with the fingers held together, measure (roughly across the middle knuckles) from the outside edge of the pinkie to the thumb-side edge of the index finger. Record these measurements below and add your data to the class data set on the board.
  2. Record the class data in the table below.
Single Person Data:  X =              Y =

 
 
  3. Make a good scatter plot of the class data below. Include scales and labels on both axes.


     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     


     
     
     

  4. On your scatterplot above, draw the line that seems to fit the data best (the line that comes closest to passing through all the data points). Select two points on your line, preferably far apart. They do NOT need to be data points; any two points on the line will do. Record their coordinates below and calculate the slope (change in Y divided by change in X) between them.


     

    Point 1: X =             Y =
     

    Point 2: X =              Y =
     
     

    Slope =
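
The slope calculation can also be checked with a few lines of Python. This is only a sketch: the function name slope and the two points are made up for illustration and are not class data.

```python
def slope(p1, p2):
    """Slope of the line through p1 = (x1, y1) and p2 = (x2, y2)."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        # Change in X is zero, so change in Y / change in X is undefined.
        raise ValueError("vertical line: slope is undefined")
    return (y2 - y1) / (x2 - x1)

# Two hypothetical points read off a hand-drawn line.
print(slope((6.0, 6.5), (9.0, 8.0)))  # (8.0 - 6.5) / (9.0 - 6.0) = 0.5
```

Choosing points that are far apart makes the change in X large, so a small error in reading the coordinates has less effect on the slope.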
     
     

  5. If you know the slope, say b, of a line, and one point, say (c, d), on the line, then an equation for the line is y = b(x - c) + d. For example, if a line has slope 5 and goes through the point (3, 2), then an equation is y = 5(x - 3) + 2, or, after simplifying, y = 5x - 13. Find an equation for the line you drew above.

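
As a rough check on the algebra, the simplification from y = b(x - c) + d to slope-intercept form can be sketched in Python. The function name line_through and the sample slope and point are illustrative assumptions, not part of the activity.

```python
def line_through(b, c, d):
    """Rewrite y = b(x - c) + d as y = b*x + a and return (b, a)."""
    a = d - b * c  # expanding gives b*x + (d - b*c)
    return b, a

# Hypothetical example: slope 2 through the point (1, 4).
b, a = line_through(2, 1, 4)
print(f"y = {b}x + {a}")  # y = 2x + 2
```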

     
     
     
     
     
     
     
     
     
     
     

  6. Now we'll use the calculator to find the "least squares regression line" for this data. This is the line that minimizes the sum of the squared vertical distances from the data points to the line, and in that sense it is the line that "best fits" the data. Enter the X data (length of index finger) in L1 and the Y data (width of four fingers) in L2. Press [2nd][QUIT] to return to the home-screen. Press [STAT] [right-arrow] to select the [CALC] submenu of the statistics menu, and then press [8] to select the LinReg(a + bx) command. This pastes the command to your home-screen (do not press [ENTER] yet).
    By default, the TI-83 assumes that the x-values are in L1 and the y-values are in L2. Since this is true in our case, we don't need to give the LinReg(a + bx) command any list names this time. In general, if the x-data is in the list XList and the y-data is in the list YList, the command
LinReg(a + bx) XList, YList [ENTER]
    will find the least squares regression equation for that data. For example, if you have x-values in L2 and y-values in L4, use LinReg(a + bx) L2, L4.
    Normally, you calculate a linear regression because you want to use the linear equation (for predictions, etc.) after you find it, so it is handy to save the equation in a Y-variable. To do this, press [VARS] [right-arrow] to select the Y-VARS menu, then [ENTER] (to select Function) and [ENTER] again (to select Y1). Your home-screen should now show the command LinReg(a + bx) Y1; if it does, press [ENTER] to run the linear regression.
    This command asks the TI-83 to find the linear regression equation for x data in L1 and y data in L2 (the default locations) and to store the equation in Y1. You can store the equation in a different Y-variable by choosing Y2 or Y3 instead of Y1. Record the linear equation that the TI-83 produces. (The equation is also stored in Y1, but you should be able to write it down from the values of a and b displayed on the home-screen.)
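
For readers who want to see the arithmetic behind a least squares fit, here is a minimal Python sketch using the standard formulas b = Sxy / Sxx and a = y-bar - b * x-bar. It is not a description of how the TI-83 computes its answer internally. The function name least_squares and the four data pairs are hypothetical, chosen only to keep the numbers simple.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # y-intercept
    return a, b

# Hypothetical measurements (not real class data).
xs = [6.0, 7.0, 8.0, 9.0]   # index finger lengths
ys = [6.0, 7.0, 7.5, 8.5]   # four-finger widths
a, b = least_squares(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")  # y = 1.25 + 0.80x
```

The result is written in the same a + bx form that the LinReg(a + bx) command reports, so a is the y-intercept and b is the slope.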
     
     
     
     
     
     
     
     
     
     
     
  7. How close is the equation of the line you drew above to the one reported by the TI-83? Compare both the slope (b) and the y-intercept (a) values.


     
     
     
     
     
     
     
     
     
     
     
     
     

  8. The book claims that a least squares regression line always goes through the point (x-bar, y-bar). Run the 2-Var Stats command on L1 and L2 and record the values reported for x-bar and y-bar.


     
     
     
     
     
     
     
     
     
     
     
     

  9. Since the linear regression equation is stored in Y1, we can check whether the calculator's line goes through the point (x-bar, y-bar) by asking the calculator for the value Y1(x-bar) and seeing whether it equals y-bar. To evaluate Y1 at a point, say X = 4.32, we get the calculator to "type out" Y1(4.32) and press [ENTER]. Press [VARS][right-arrow][ENTER][ENTER] (which should paste Y1 to your home-screen), then type a left parenthesis [ ( ], the value you found for x-bar above, and a right parenthesis [ ) ], and press [ENTER]. Record the calculator's value for Y1(x-bar):

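
The same check can be sketched in Python: fit the line, then evaluate it at x-bar and compare with y-bar. The data below are hypothetical, and the function y1 is only a stand-in for the calculator's Y1.

```python
from statistics import mean

# Hypothetical measurements (not real class data).
xs = [6.0, 7.0, 8.0, 9.0]
ys = [6.0, 7.0, 7.5, 8.5]

x_bar, y_bar = mean(xs), mean(ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
b = sxy / sxx          # least squares slope
a = y_bar - b * x_bar  # least squares y-intercept

def y1(x):
    """Stand-in for the calculator's Y1: the fitted line a + b*x."""
    return a + b * x

# Compare with a small tolerance, since decimal arithmetic can round.
print(abs(y1(x_bar) - y_bar) < 1e-9)  # True
```

The check succeeds for any data set because the intercept is defined as a = y-bar - b * x-bar, so a + b * x-bar simplifies to y-bar. On the calculator, a small difference in the last displayed digit can appear if you type in a rounded value of x-bar.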

     
     
     
     
     
     
     
     

  10. To see a scatterplot of the actual data along with a plot of the calculator's best fit line, set up stat-plot 1 as a scatterplot of L1 (Xlist) against L2 (Ylist), make sure the plot is turned on, and then press [ZOOM][9] (for ZoomStat). Because we stored the equation for the least squares regression line in Y1, the calculator will graph both the data and the line!
Look for the one point that is most likely to be an influential point (either an outlier that is far from the line, or a data point with an unusually large or small x-value and an atypical y-value). Record this point's coordinates below, then go back to your statistical editor and delete the point (both the x and y values). Re-run the linear regression on the data with this point removed (this time store the equation in Y2 instead of Y1) and record the equation of the new least squares regression line below.

Potential influential data point: X =         Y =
 

New linear regression equation:
 
 
 
 

    If the slope and y-intercept both remain "about the same", the point is NOT influential. If either the slope or the y-intercept changes by a fair amount, the point is influential. Do you think the point you selected is influential or not?
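
The delete-and-refit comparison can be sketched in Python as follows. The data are invented for illustration: the last point was deliberately given a large x-value and an atypical y-value. The helper name fit is also an assumption.

```python
def fit(points):
    """Least squares fit; returns (a, b) for the line y = a + b*x."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Hypothetical data; the last point is the suspected influential point.
data = [(6.0, 6.0), (7.0, 7.0), (8.0, 7.5), (9.0, 8.5), (12.0, 5.0)]
suspect = (12.0, 5.0)

a_all, b_all = fit(data)                                  # like Y1
a_trim, b_trim = fit([p for p in data if p != suspect])   # like Y2

print(f"with point:    y = {a_all:.2f} + {b_all:.2f}x")
print(f"without point: y = {a_trim:.2f} + {b_trim:.2f}x")
```

In this made-up data set the slope is negative with the suspect point and positive without it, so the point would count as influential. How large a change counts as "a fair amount" for real data is a judgment call.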