
Fitting a Line to (Almost) Linear Data

Given a list of pairs of real numbers (data points), we often want to see if the data points are close to lying on a line, and find a line that is a good "fit" for the points.

(1)  First, carefully plot the points on graph paper, or use a graphing calculator or a computer program with graphics.

Graphing a set of data points on the TI-83:

Press STAT then ENTER.  You should see three columns, headed by the names L1, L2, L3.  If your columns do not display these lists, press STAT and then 5:SetUpEditor and ENTER, to load the default lists L1, L2, etc. into the statistical editor, and finally press STAT then ENTER to access the statistical editor. If the columns have numbers in them, you want to clear them.  To clear column 1, use the up cursor to highlight L1, press CLEAR, then ENTER.  Similarly clear L2 and L3.

To enter data into L1 (these numbers should be the x-values of the points you wish to plot), place the cursor on the first position in the column (see the left screen below), type in the first number, and press ENTER; this moves the cursor to the next position in that column.  Enter the next number, press ENTER, and continue the process until you have all the values entered in column L1.  Then use the right cursor to go to column L2 and enter the y-values in this column.  The x and y values of each data point should be side by side in these two columns, so you must have the same number of entries in each column.  Once the data points have been entered, you are ready to plot them with a scatter plot.

[Screenshots: stat editor (left), stat plot setup (right)]

Displaying a Scatter Plot:

Press 2nd then Y= (this is STAT PLOT).  Press ENTER.  On the screen that appears, you navigate with the arrow or cursor keys (left, right, up and down) and make selections by pressing ENTER. Select Plot1 (if it is not already highlighted), then highlight On and press ENTER.  On the next line, Type:, the picture on the upper left should be highlighted (it represents a scatterplot; see the right screen above), the Xlist should be L1, and the Ylist should be L2. The Mark can be the leftmost (small box) or middle (+) selection.  If these settings are already in place, do nothing; otherwise, highlight what needs to be changed and press ENTER for each change.

Before plotting the data points, do two things:

  1. Press Y=  and either clear or turn off any functions listed in that window.  (To turn off a function, use the cursor to highlight the = sign and press ENTER.)
  2. Set the window for your plot of data points based on the values in the lists L1 and L2.  Press WINDOW, and set Xmin and Xmax so that Xmin is less than your smallest x-value and Xmax is larger than your largest x-value (these are the values in L1), and similarly set Ymin and Ymax based on your y-values (the values in L2). Usually pressing ZOOM then 9:ZoomStat will pick a good window as well.

To graph the scatterplot, press GRAPH.
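
If you are plotting with a computer program instead of the calculator, the window rule in step 2 above can be sketched in a few lines of Python. This is only an illustration: the function name stat_window and the 10% padding are assumptions made for this example, not the rule that ZoomStat itself uses.

```python
def stat_window(xs, ys, margin=0.1):
    """Return (xmin, xmax, ymin, ymax) slightly larger than the data range.

    `margin` is the fraction of each range added on both sides; the 10%
    default is an arbitrary choice for this sketch.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("need two non-empty lists of equal length")

    def padded(values):
        lo, hi = min(values), max(values)
        # If all values are equal the range is zero, so pad by 1 instead.
        pad = margin * (hi - lo) if hi > lo else 1.0
        return lo - pad, hi + pad

    xmin, xmax = padded(xs)
    ymin, ymax = padded(ys)
    return xmin, xmax, ymin, ymax


# Made-up sample data standing in for lists L1 and L2.
L1 = [1, 2, 3, 4, 5]
L2 = [2, 4, 5, 4, 5]
print(stat_window(L1, L2))
```

The returned values play the roles of Xmin, Xmax, Ymin and Ymax: each minimum is below the smallest data value and each maximum is above the largest.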

(2)  Observe the scatterplot.  Do the points appear to lie almost on a line?
If so, then we want to find a line that is a good "fit." There are many ways to do this.

If your points are on graph paper, you can just use a clear plastic straightedge and manipulate it until it seems to best fit the points, then draw the line.  Once the line is drawn, you can calculate the slope of the line, and find a point on the line, and then give the equation of the line using the point-slope formula (slope = m, (a, b) on line, equation is y = m(x - a) + b). You can also use the calculator to find the "least-squares" regression line.
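
The point-slope calculation can be checked with a short Python sketch. It assumes you have read two points off your hand-drawn line; the function names and the sample points are made up for this illustration.

```python
def slope_between(p, q):
    """Slope of the line through points p = (x1, y1) and q = (x2, y2)."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("vertical line: slope is undefined")
    return (y2 - y1) / (x2 - x1)


def point_slope(m, a, b):
    """Return the function y = m(x - a) + b for slope m and point (a, b)."""
    return lambda x: m * (x - a) + b


# Two hypothetical points read off a drawn line.
p, q = (1, 2), (3, 6)
m = slope_between(p, q)
line = point_slope(m, *p)
print(m, line(0), line(3))
```

Here the slope is (6 - 2)/(3 - 1) = 2, so the equation is y = 2(x - 1) + 2, and the line passes through both chosen points.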
 

(3)  Regression line (least squares fit).

Statisticians often use a standard method to find a "fit" line, called a least squares fit (the "fit" line is called a regression line).  For each data point, picture a square whose vertical edge runs from the point to the line. This method finds the line that minimizes the sum of the areas of these squares, that is, the sum of the squared vertical distances from the points to the line. The TI-83 can calculate the equation of this regression line with a few keystrokes. The animation below depicts several possible regression lines. It displays the "squared errors" for each data point (green boxes) as well as the total squared error (yellow box). A least squares regression on the TI-83 will select the line which makes the area of the yellow square as small as possible.
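
To make the idea of a total squared error concrete, here is a minimal Python sketch. It uses the standard closed-form formulas for the least squares slope and intercept, written in the form y = a + bx; the function names and the sample data are made up for this illustration, and the calculator's internal method is not documented here.

```python
def sum_squared_error(xs, ys, a, b):
    """Total area of the squares for the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def least_squares(xs, ys):
    """Return (a, b) minimizing the sum of squared vertical distances."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two points, same count in each list")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x-values are equal: no unique fit line")
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # y-intercept
    return a, b


# Made-up sample data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares(xs, ys)
print(a, b, sum_squared_error(xs, ys, a, b))
print(sum_squared_error(xs, ys, a + 0.5, b))  # a shifted line does worse
```

For this sample data the fit works out by hand to y = 2.2 + 0.6x with a total squared error of 2.4, and any other choice of a and b gives a larger total.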

[Animation: squared errors for several candidate lines]

Linear Regression on the TI-83:

Press STAT, use the cursor to highlight CALC, then press 8:LinReg(a+bx). Next, tell the calculator to store the resulting regression equation in Y1 by pressing VARS, right-arrow (to select the Y-VARS menu), ENTER (for FUNCTION), then ENTER (for Y1). Finally, press ENTER to run the command. (Your screens may look different, and the last two lines on the right screen below may not appear.)

[Screenshots: linear regression output]

The screen displays the slope b and the y-intercept a of the regression line, and this equation is automatically entered as Y1 on the Y= screen.  You can press GRAPH to see how well it fits the data (assuming stat plot 1 is still turned on and you have chosen a good window).

You can access the value of the correlation, r, by pressing [VARS] [5] (to access the statistical variables), then scrolling to the [EQ] menu (press the right arrow key twice), and finally pressing [7:r] and [ENTER]. If you'd like your calculator to always display the correlation r (and also r²) whenever it does a linear regression (like the right screen above), press [2nd][CATALOG], then [D] to jump to the commands which begin with the letter D, scroll down a few lines to the command DiagnosticOn, and press [ENTER] twice. Now r will automatically be displayed whenever you execute a linear regression command.

The residuals of the least squares regression line (the values found by subtracting the regression-calculated y-values from the true y-values) will be stored in a list named RESID (or perhaps LRESID), which can be accessed by pressing [2nd][LIST], scrolling down to the line with RESID on it, and pressing [ENTER]. You can see a residual plot by setting stat plot 2 as a scatterplot, with Xlist = L1 and Ylist = RESID.
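
If you want to check the calculator's numbers by hand, the following Python sketch computes the same kinds of quantities from the standard textbook formulas: the intercept a, the slope b, the correlation r, and the residuals, taken here as observed y minus fitted y. The function name linreg_diagnostics and the sample data are made up for this illustration.

```python
import math


def linreg_diagnostics(xs, ys):
    """Return (a, b, r, residuals) for the least squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two points, same count in each list")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x-values or y-values are all equal")
    b = sxy / sxx                    # slope
    a = mean_y - b * mean_x          # y-intercept
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
    # Residual = observed y minus the y-value predicted by the line.
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    return a, b, r, residuals


# Made-up sample data standing in for lists L1 and L2.
L1 = [1, 2, 3, 4, 5]
L2 = [2, 4, 5, 4, 5]
a, b, r, resid = linreg_diagnostics(L1, L2)
print(a, b, r, r * r)
print(resid)
```

Plotting the residuals against the x-values gives the same picture as the residual plot described above. For a least squares line the residuals always sum to zero (up to rounding), which is a quick sanity check.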

A Note on Restoring Your Calculator:

Once you have finished your work with scatter plots and do not want them to appear on the screen when you are graphing other functions, you should turn off the scatterplot (you may also wish to clear the lists of data).  To turn off the scatterplot, press Y=, use the up cursor to highlight Plot1, then press ENTER.  If Plot2 or Plot3 is highlighted, turn it off in the same way.  The directions to clear the data from the lists L1 and L2 are given under (1) above.