Interactive Fitting

The Basic Fitting UI

The MATLAB® Basic Fitting UI allows you to interactively:

  • Model data using a spline interpolant, a shape-preserving interpolant, or a polynomial up to the tenth degree

  • Plot one or more fits together with data

  • Plot the residuals of the fits

  • Compute model coefficients

  • Compute the norm of the residuals (a statistic you can use to analyze how well a model fits your data)

  • Use the model to interpolate within the data or extrapolate outside of it

  • Save coefficients and computed values to the MATLAB workspace for use outside of the dialog box

  • Generate MATLAB code to recompute fits and reproduce plots with new data

Note

The Basic Fitting UI is only available for 2-D plots. For more advanced fitting and regression analysis, see the Curve Fitting Toolbox™ documentation and the Statistics and Machine Learning Toolbox™ documentation.

Preparing for Basic Fitting

The Basic Fitting UI sorts your data in ascending order before fitting. If your data set is large and the values are not sorted in ascending order, it will take longer for the Basic Fitting UI to preprocess your data before fitting.

You can speed up the Basic Fitting UI by first sorting your data. To create sorted vectors x_sorted and y_sorted from data vectors x and y, use the MATLAB sort function:

[x_sorted, i] = sort(x);
y_sorted = y(i);

Opening the Basic Fitting UI

To use the Basic Fitting UI, you must first plot your data in a figure window, using any MATLAB plotting command that produces (only) x and y data.

To open the Basic Fitting UI, select Tools > Basic Fitting from the menus at the top of the figure window.

When you fully expand it by clicking the arrow button in the lower right corner twice, the window displays three panels. Use these panels to:

  • Select a model and plotting options

  • Examine and export model coefficients and norms of residuals

  • Examine and export interpolated and extrapolated values

To expand or collapse panels one-by-one, click the arrow button in the lower right corner of the interface.

Example: Using Basic Fitting UI

This example shows how to use the Basic Fitting UI to fit, visualize, analyze, save, and generate code for polynomial regressions.

Load and Plot Census Data

The file census.mat contains U.S. population data for the years 1790 through 1990 at 10-year intervals.

To load and plot the data, type the following commands at the MATLAB prompt:

load census
plot(cdate,pop,'ro')

The load command adds the following variables to the MATLAB workspace:

  • cdate: A column vector containing the years from 1790 to 1990 in increments of 10. It is the predictor variable.

  • pop: A column vector with the U.S. population for each year in cdate. It is the response variable.

The data vectors are sorted in ascending order, by year. The plot shows the population as a function of year.

Now you are ready to fit an equation to the data to model population growth over time.

Predict the Census Data with a Cubic Polynomial Fit

  1. Open the Basic Fitting dialog box by selecting Tools > Basic Fitting in the Figure window.

  2. In the Plot fits area of the Basic Fitting dialog box, select the cubic check box to fit a cubic polynomial to the data.

    MATLAB uses your selection to fit the data, and adds the cubic regression line to the graph as follows.

    In computing the fit, MATLAB encounters problems and issues the following warning:

    Polynomial is badly conditioned. Add points with distinct X values, select a polynomial with a lower degree, or select "Center and scale X data."

    This warning indicates that the computed coefficients for the model are sensitive to random errors in the response (the measured population). It also suggests some things you can do to get a better fit.

  3. Continue to use a cubic fit. As you cannot add new observations to the census data, improve the fit by transforming the values you have to z-scores before recomputing a fit. Select the Center and scale X data check box in the dialog box to make the Basic Fitting tool perform the transformation.

    To learn how centering and scaling data works, see Learn How the Basic Fitting Tool Computes Fits.

  4. Now view the equations and display residuals. In addition to selecting the Center and scale X data and cubic check boxes, select the following options:

    • Show equations

    • Plot residuals

    • Show norm of residuals

Selecting Plot residuals creates a subplot of them as a bar graph. The following figure displays the results of the Basic Fitting UI options you selected.

The cubic fit is a poor predictor before the year 1790, where it indicates a decreasing population. The model seems to approximate the data reasonably well after 1790. However, a pattern in the residuals shows that the model does not meet the assumption of normal error, which is a basis for least-squares fitting. The data 1 line identified in the legend shows the observed x (cdate) and y (pop) data values. The cubic regression line presents the fit after centering and scaling data values. Notice that the figure shows the original data units, even though the tool computes the fit using transformed z-scores.

For comparison, try fitting another polynomial equation to the census data by selecting it in the Plot fits area.

Tip

You can change the default plot settings and rename data series. For more information, see Customize Graph Using Plot Tools.

View and Save the Cubic Fit Parameters

In the Basic Fitting dialog box, click the arrow button to display the estimated coefficients and the norm of the residuals in the Numerical results panel.

To view a specific fit, select it from the Fit list. This displays the coefficients in the Basic Fitting dialog box, but does not plot the fit in the figure window.

Note

If you also want to display a fit on the plot, you must select the corresponding Plot fits check box.

Save the fit data to the MATLAB workspace by clicking the Save to workspace button on the Numerical results panel. The Save Fit to Workspace dialog box opens.

With all check boxes selected, click OK to save the fit parameters as a MATLAB structure:

fit

fit =
     type: 'polynomial degree 3'
    coeff: [0.9210 25.1834 73.8598 61.7444]

Now, you can use the fit results in MATLAB programming, outside of the Basic Fitting UI.

Derive R², the Coefficient of Determination

You can get an indication of how well a polynomial regression predicts your observed data by computing the coefficient of determination, or R-square (written as R²). The R² statistic, which ranges from 0 to 1, measures how useful the independent variable is in predicting values of the dependent variable:

  • An R² value near 0 indicates that the fit is not much better than the model y = constant.

  • An R² value near 1 indicates that the independent variable explains most of the variability in the dependent variable.

To compute R², first compute a fit, and then obtain residuals from it. A residual is the signed difference between an observed dependent value and the value your fit predicts for it.

residuals = y_observed - y_fitted

The Basic Fitting tool can generate residuals for any fit it calculates. To view a graph of residuals, select the Plot residuals check box. You can view residuals as a bar, line, or scatter plot.
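For reference, the relationship between the residuals and R² can be written compactly. The symbols here are notation assumed for this sketch rather than taken from the Basic Fitting UI: r is the vector of residuals, y is the vector of observed values with mean \bar{y}, and n is the number of observations.

```latex
\[
R^2 = 1 - \frac{\lVert r \rVert^2}{SS_{\mathrm{total}}},
\qquad
SS_{\mathrm{total}} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = (n-1)\,\operatorname{var}(y)
\]
```

Here \lVert r \rVert is the norm of the residuals, so its square is the sum of squared residuals.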

After you have residual values, you can save them to the workspace, where you can compute R². Complete the preceding part of this example to fit a cubic polynomial to the census data, and then perform these steps:

Compute Residual Data and R² for a Cubic Fit

  1. Click the arrow button at the lower right to open the Numerical results panel if it is not already visible.

  2. From the Fit drop-down menu, select cubic if it does not already show.

  3. Save the fit coefficients, norm of residuals, and residuals by clicking Save to Workspace.

    The Save Fit to Workspace dialog box opens with three check boxes and three text fields.

  4. Select all three check boxes to save the fit coefficients, norm of residuals, and residual values.

  5. Identify the saved variables as belonging to a cubic fit. Change the variable names by adding a 3 to each default name (for example, fit3, normresid3, and resids3). The dialog box should look like this figure.

  6. Click OK. Basic Fitting saves residuals as a column vector of numbers, fit coefficients as a struct, and the norm of residuals as a scalar.

    Notice that the value that Basic Fitting computes for norm of residuals is 12.2380. This number is the square root of the sum of squared residuals of the cubic fit.

  7. Optionally, you can verify the norm-of-residuals value that the Basic Fitting tool provided. Compute the norm-of-residuals yourself from the resids3 array that you just saved:

    mynormresid3 = sum(resids3.^2)^(1/2)
    mynormresid3 =
       12.2380

  8. Compute the total sum of squares of the dependent variable, pop, to compute R². Total sum of squares is the sum of the squared differences of each value from the mean of the variable. For example, use this code:

    SSpop = (length(pop)-1) * var(pop)
    SSpop =
      1.2356e+005

    var(pop) computes the variance of the population vector. You multiply it by the number of observations minus 1 to account for degrees of freedom. Both the total sum of squares and the norm of residuals are positive scalars.

  9. Now, compute R², using the square of normresid3 and SSpop:

    rsqcubic = 1 - normresid3^2 / SSpop
    rsqcubic =
        0.9988

  10. Finally, compute R² for a linear fit and compare it with the cubic R² value that you just derived. The Basic Fitting UI also provides you with the linear fit results. To obtain the linear results, repeat steps 2-6, modifying your actions as follows:

    • To calculate least-squares linear regression coefficients and statistics, in the Fit drop-down on the Numerical results pane, select linear instead of cubic.

    • In the Save to Workspace dialog, append 1 to each variable name to identify it as deriving from a linear fit, and click OK. The variables fit1, normresid1, and resids1 now exist in the workspace.

    • Use the variable normresid1 (98.778) to compute R² for the linear fit, as you did in step 9 for the cubic fit:

      rsqlinear = 1 - normresid1^2 / SSpop
      rsqlinear =
          0.9210

    This result indicates that a linear least-squares fit of the population data explains 92.1% of its variance. As the cubic fit of this data explains 99.9% of that variance, the latter seems to be a better predictor. However, because a cubic fit predicts using three variables (x, x², and x³), a basic R² value does not fully reflect how robust the fit is. A more appropriate measure for evaluating the goodness of multivariate fits is adjusted R². For information about computing and using adjusted R², see Residuals and Goodness of Fit.
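The R² comparison in the steps above can also be sketched at the command line, without the dialog box. This is a minimal sketch, not the code that Basic Fitting generates: it assumes that calling polyfit with three outputs reproduces the Center and scale X data option, and it uses the common adjusted R² formula 1 - (SSresid/SStotal)*(n-1)/(n-d-1), where d is the polynomial degree. The variable names are illustrative.

```matlab
load census                      % provides cdate (years) and pop (population)
n = length(pop);                 % number of observations
SSpop = (n-1) * var(pop);        % total sum of squares of the response

degrees = [1 3];                 % linear and cubic fits
for d = degrees
    % Requesting three outputs makes polyfit center and scale cdate before fitting
    [p, ~, mu] = polyfit(cdate, pop, d);
    resids = pop - polyval(p, cdate, [], mu);   % observed minus fitted values
    normresid = norm(resids);                   % square root of the sum of squared residuals
    rsq = 1 - normresid^2 / SSpop;
    % Adjusted R^2 (assumed formula) penalizes the d terms beyond the constant
    rsqadj = 1 - (normresid^2 / SSpop) * (n-1) / (n-d-1);
    fprintf('degree %d: R2 = %.4f, adjusted R2 = %.4f\n', d, rsq, rsqadj);
end
```

The R² values printed for degrees 1 and 3 should agree with the rsqlinear and rsqcubic values derived in the steps above. The adjusted values are slightly lower, and the gap grows with the polynomial degree.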

Caution

R² measures how well your polynomial equation predicts the dependent variable, not how appropriate the polynomial model is for your data. When you analyze inherently unpredictable data, a small value of R² indicates that the independent variable does not predict the dependent variable precisely. However, it does not necessarily mean that there is something wrong with the fit.

Compute Residual Data and R² for a Linear Fit

In this next example, use the Basic Fitting UI to perform a linear fit, save the results to the workspace, and compute R² for the linear fit. You can then compare the linear R² with the cubic R² value that you derive in the example Compute Residual Data and R² for a Cubic Fit.

  1. Click the arrow button at the lower right to open the Numerical results panel if it is not already visible.

  2. Select the linear check box in the Plot fits area.

  3. From the Fit drop-down menu, select linear if it does not already show. The Coefficients and norm of residuals area displays statistics for the linear fit.

  4. Save the fit coefficients, norm of residuals, and residuals by clicking Save to Workspace.

    The Save Fit to Workspace dialog box opens with three check boxes and three text fields.

  5. Select all three check boxes to save the fit coefficients, norm of residuals, and residual values.

  6. Identify the saved variables as belonging to a linear fit. Change the variable names by adding a 1 to each default name (for example, fit1, normresid1, and resids1).

  7. Click OK. Basic Fitting saves residuals as a column vector of numbers, fit coefficients as a struct, and the norm of residuals as a scalar.

    Notice that the value that Basic Fitting computes for norm of residuals is 98.778. This number is the square root of the sum of squared residuals of the linear fit.

  8. Optionally, you can verify the norm-of-residuals value that the Basic Fitting tool provided. Compute the norm-of-residuals yourself from the resids1 array that you just saved:

    mynormresid1 = sum(resids1.^2)^(1/2)
    mynormresid1 =
       98.7783

  9. Compute the total sum of squares of the dependent variable, pop, to compute R². Total sum of squares is the sum of the squared differences of each value from the mean of the variable. For example, use this code:

    SSpop = (length(pop)-1) * var(pop)
    SSpop =
      1.2356e+005

    var(pop) computes the variance of the population vector. You multiply it by the number of observations minus 1 to account for degrees of freedom. Both the total sum of squares and the norm of residuals are positive scalars.

  10. Now, compute R², using the square of normresid1 and SSpop:

    rsqlinear = 1 - normresid1^2 / SSpop
    rsqlinear =
        0.9210

    This result indicates that a linear least-squares fit of the population data explains 92.1% of its variance. As the cubic fit of this data explains 99.9% of that variance, the latter seems to be a better predictor. However, a cubic fit has four coefficients (x, x², x³, and a constant), while a linear fit has two coefficients (x and a constant). A simple R² statistic does not account for the different degrees of freedom. A more appropriate measure for evaluating polynomial fits is adjusted R². For information about computing and using adjusted R², see Residuals and Goodness of Fit.


Interpolate and Extrapolate Population Values

Suppose you want to use the cubic model to interpolate the U.S. population in 1965 (a date not provided in the original data).

  1. In the Basic Fitting dialog box, click the arrow button to specify a vector of x values at which to evaluate the current fit.

  2. In the Enter value(s)... field, type the following value:

    1965

    Note

    Use unscaled and uncentered x values. You do not need to center and scale first, even though you selected to scale x values to obtain the coefficients in Predict the Census Data with a Cubic Polynomial Fit. The Basic Fitting tool makes the necessary adjustments behind the scenes.

  3. Click Evaluate.

    The x values and the corresponding values for f(x) are computed from the fit and displayed in a table, as shown below:

  4. Select the Plot evaluated results check box to display the interpolated value as a diamond marker:

  5. Save the interpolated population in 1965 to the MATLAB workspace by clicking Save to workspace.

    This opens the following dialog box, where you specify the variable names:

  6. Click OK, but keep the Figure window open if you intend to follow the steps in the next section, Generate a Code File to Reproduce the Result.
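The interpolation steps above have a rough command-line equivalent. The following is a sketch under the assumption that three-output polyfit followed by polyval with the same mu mirrors what the dialog box does when Center and scale X data is selected. The names xeval and pop1965 are illustrative.

```matlab
load census
[p, S, mu] = polyfit(cdate, pop, 3);     % cubic fit on centered and scaled years
xeval = 1965;                            % unscaled, uncentered year
pop1965 = polyval(p, xeval, S, mu);      % polyval applies the same centering and scaling

plot(cdate, pop, 'ro')                   % original data as red circles
hold on
plot(xeval, pop1965, 'bd')               % diamond marker for the evaluated value
hold off
```

Because mu is passed to polyval, you supply the year in original units, just as you do in the Enter value(s)... field.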

Generate a Code File to Reproduce the Result

After completing a Basic Fitting session, you can generate MATLAB code that recomputes fits and reproduces plots with new data.

  1. In the Figure window, select File > Generate Code.

    This creates a function and displays it in the MATLAB Editor. The code shows you how to programmatically reproduce what you did interactively with the Basic Fitting dialog box.

  2. Change the name of the function on the first line from createfigure to something more specific, like censusplot. Save the code file to your current folder with the file name censusplot.m. The function begins with:

    function censusplot(X1, Y1, valuesToEvaluate1)

  3. Generate some new, randomly perturbed census data:

    randpop = pop + 10*randn(size(pop));

  4. Reproduce the plot with the new data and recompute the fit:

    censusplot(cdate,randpop,1965)

    You need three input arguments: the x and y values (data 1) plotted in the original graph, plus an x-value for a marker.

    The following figure displays the plot that the generated code produces. The new plot matches the appearance of the figure from which you generated code except for theydata values, the equation for the cubic fit, and the residual values in the bar graph, as expected.

Learn How the Basic Fitting Tool Computes Fits

The Basic Fitting tool calls the polyfit function to compute polynomial fits. It calls the polyval function to evaluate the fits. polyfit analyzes its inputs to determine if the data is well conditioned for the requested degree of fit.

When it finds badly conditioned data, polyfit computes a regression as well as it can, but it also returns a warning that the fit could be improved. The Basic Fitting example section Predict the Census Data with a Cubic Polynomial Fit displays this warning.

One way to improve model reliability is to add data points. However, adding observations to a data set is not always feasible. An alternative strategy is to transform the predictor variable to normalize its center and scale. (In the example, the predictor is the vector of census dates.)

The polyfit function normalizes by computing z-scores:

$$z = \frac{x - \mu}{\sigma}$$

where x is the predictor data, μ is the mean of x, and σ is the standard deviation of x. The z-scores give the data a mean of 0 and a standard deviation of 1. In the Basic Fitting UI, you transform the predictor data to z-scores by selecting the Center and scale X data check box.

After centering and scaling, model coefficients are computed for the y data as a function of z. These are different from (and more robust than) the coefficients computed for y as a function of x. The form of the model and the norm of the residuals do not change. The Basic Fitting UI automatically rescales the z-scores so that the fit plots on the same scale as the original x data.
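One way to see why the centered and scaled coefficients are more robust is to compare the condition numbers of the cubic design matrices built from x and from z. This sketch is illustrative only. The matrices Ax and Az are constructed here by hand and are an assumption about how to expose the conditioning, not a description of polyfit internals.

```matlab
load census
x = cdate;
z = (x - mean(x)) / std(x);              % z-scores of the predictor

% Design matrices for a cubic fit: columns are the cube, square, first power, and 1
Ax = [x.^3, x.^2, x, ones(size(x))];
Az = [z.^3, z.^2, z, ones(size(z))];

condx = cond(Ax);                        % conditioning with raw census years
condz = cond(Az);                        % conditioning with z-scores
fprintf('cond with raw x:    %.3g\n', condx);
fprintf('cond with z-scores: %.3g\n', condz);
```

A much larger condition number for the raw years means that small perturbations in the data can cause large changes in the computed coefficients.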

To understand the way in which the centered and scaled data is used as an intermediary to create the final plot, run the following code in the Command Window:

close
load census
x = cdate;
y = pop;
z = (x-mean(x))/std(x); % Compute z-scores of x data
plot(x,y,'ro') % Plot data as red markers
hold on % Prepare axes to accept new graph on top
zfit = linspace(z(1),z(end),100);
pz = polyfit(z,y,3); % Compute conditioned fit
yfit = polyval(pz,zfit);
xfit = linspace(x(1),x(end),100);
plot(xfit,yfit,'b-') % Plot conditioned fit vs. x data

The centered and scaled cubic polynomial plots as a blue line, as shown here:

In the code, computation of z illustrates how to normalize data. The polyfit function performs the transformation itself if you provide three return arguments when calling it:

[p,S,mu] = polyfit(x,y,n)

The returned regression parameters, p, are now based on normalized x. The returned vector, mu, contains the mean and standard deviation of x. For more information, see the polyfit reference page.
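As a quick check of this equivalence, the following sketch compares the coefficients from a fit on manually computed z-scores with those returned by the three-output form of polyfit. It reuses the census data from the example. The name coeffdiff is illustrative.

```matlab
load census
x = cdate;
y = pop;

z = (x - mean(x)) / std(x);      % manual z-scores
pz = polyfit(z, y, 3);           % fit y as a function of z

[p, S, mu] = polyfit(x, y, 3);   % polyfit centers and scales x itself

disp(mu')                        % mean and standard deviation of x
coeffdiff = max(abs(p - pz));    % largest difference between the coefficient vectors
disp(coeffdiff)
```

The two coefficient vectors should agree to within floating-point round-off. To evaluate such a fit at new x values in original units, pass mu to polyval, as in polyval(p,xnew,S,mu).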
