4.2.2.23 Distribution Fit with the Probability Density Function and Cumulative Distribution Function


Summary

To estimate the location and scale parameters of a sample distribution, you can perform a distribution fit on the raw data. Alternatively, you can fit a probability density function (PDF) or a cumulative distribution function (CDF) to the binned data to obtain these values. This tutorial shows how to estimate the parameters by curve fitting.

Minimum Origin Version Required: Origin 2016 SR2

What you will learn

  • Generating Normally Distributed Data for Fitting
  • Fitting with Probability Density Function (PDF)
  • Fitting with Cumulative Distribution Function (CDF)

Example and Steps

Generating Normally Distributed Data for Fitting

  1. Run the following script to create a sample dataset:
    newbook;
    col(2) = normal(1000) * 2 + 5;
    This script generates 1000 normally distributed points where mean ≈ 5 and σ ≈ 2.
  2. We can first perform simple descriptive statistics on this column to see the corresponding Moments output.

    Highlight the data column and select Statistics: Descriptive Statistics: Statistics on Column to open the dialog. Select the Quantities tab and make sure the Mean and Standard Deviation check boxes are selected. Then click OK to generate the report.

    In the report worksheet, the Mean and Standard Deviation are very close to the values set in the script (5 and 2).
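
For readers who want to check these numbers outside Origin, the following is a minimal Python sketch of the same experiment. It is not Origin's implementation: it assumes Python's random.gauss as a stand-in for the LabTalk normal() function, and the seed value is an arbitrary choice made only so that the run is repeatable.

```python
import random
import statistics

random.seed(0)  # arbitrary seed (an assumption) so the run is repeatable

# Same recipe as the script: standard normal noise, scaled by 2 and shifted by 5.
data = [random.gauss(0.0, 1.0) * 2 + 5 for _ in range(1000)]

mean = statistics.mean(data)
sd = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)

print(f"mean = {mean:.3f}, sd = {sd:.3f}")
```

With 1000 points, the printed mean and standard deviation should land near 5 and 2, but not exactly on them, because the values are estimates from a random sample.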

Fitting with Probability Density Function (PDF)

  1. To fit the data with the PDF, first compute the binned data with the Frequency Counts tool. Highlight the source data column and select Statistics: Descriptive Statistics: Frequency Counts from the menu. This tool counts the number of data points that fall in each specified bin.
    • Expand the Computation Control branch and make sure the Bin Size radio button is selected beside Step by. Clear the Auto check box, then set the Bin Size to 0.5.
    • Make sure the Bin Center, Count and Cumulative Count check boxes under the Quantities to Compute branch are selected. Then click OK to count the data.
  2. Highlight the Count column on the Frequency Counts result worksheet, then select Plot>2D: Bar: Column to create a column graph. This graph is the histogram of the source data.
  3. With the graph active, select Analysis: Fitting: Nonlinear Curve Fit from the menu to open the NLFit dialog. Then select the Gauss function from the Statistics category. Leave the other options at their defaults and click the Fit button to output the fitting report.

    In the fitting report, the fitted xc and sigma are close to 5 and 2, respectively.
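
The binning and PDF fit described above can be approximated outside Origin with a short Python sketch. It is an illustration under stated assumptions, not Origin's NLFit algorithm: the first bin is assumed to start at a multiple of the bin size, the baseline is fixed at 0, the area A is solved in closed form for each trial pair (xc, sigma), and a simple compass search written here as a hypothetical helper stands in for Origin's fitting iteration. All function names are illustrative.

```python
import math
import random

random.seed(0)  # arbitrary seed (an assumption) so the sketch is repeatable
data = [random.gauss(0.0, 1.0) * 2 + 5 for _ in range(1000)]

BIN = 0.5  # bin size used in the tutorial

# Frequency counts; assumption: the first bin starts at a multiple of BIN.
start = math.floor(min(data) / BIN) * BIN
nbins = int((max(data) - start) / BIN) + 1
counts = [0] * nbins
for v in data:
    counts[int((v - start) / BIN)] += 1
centers = [start + (i + 0.5) * BIN for i in range(nbins)]


def gauss(x, xc, sigma):
    """Normal density with unit area."""
    z = (x - xc) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def best_area(xc, sigma):
    """Least-squares area A for fixed xc and sigma (the model is linear in A)."""
    g = [gauss(x, xc, sigma) for x in centers]
    den = sum(gi * gi for gi in g)
    num = sum(c * gi for c, gi in zip(counts, g))
    return num / den if den > 0 else 0.0


def sse(params):
    """Sum of squared residuals between the counts and A * gauss(x)."""
    xc, sigma = params
    if sigma <= 0:
        return math.inf
    area = best_area(xc, sigma)
    return sum((c - area * gauss(x, xc, sigma)) ** 2
               for x, c in zip(centers, counts))


def compass_search(f, x0, step=0.5, tol=1e-6):
    """Try +/- step on each coordinate; halve the step when nothing improves."""
    x, best = list(x0), f(x0)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                trial = x[:]
                trial[i] += d
                val = f(trial)
                if val < best:
                    x, best, improved = trial, val, True
        if not improved:
            step /= 2
    return x


# Initial guess: center of the tallest bin, and a unit width.
xc0 = centers[counts.index(max(counts))]
xc_fit, sigma_fit = compass_search(sse, [xc0, 1.0])
area_fit = best_area(xc_fit, sigma_fit)
print(f"xc = {xc_fit:.3f}, sigma = {sigma_fit:.3f}, A = {area_fit:.1f}")
```

Because the counts are not normalized, the fitted area A in this sketch should come out near the number of points times the bin size (1000 × 0.5), while xc and sigma estimate the mean and standard deviation.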

Fitting with Cumulative Distribution Function (CDF)

  1. To fit the data with the CDF, start from the cumulative binned data. Select the FreqCounts1 sheet created in the previous section and highlight the Cumulative Count column. Select Plot>2D: Scatter: Scatter from the menu to plot the CDF points.
  2. With the graph active, select Analysis: Fitting: Nonlinear Curve Fit from the menu to open the NLFit dialog. Select the NormalCDF function from the Statistics category. Leave the other options at their defaults and click the Fit button to output the fitting report.

    In the fitting report, the fitted xc and w are close to 5 and 2, respectively.
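
The CDF fit can be sketched in Python in a similar way. Again, this is an illustration rather than Origin's NormalCDF fit: the baseline is assumed to be 0, the amplitude is solved in closed form for each trial pair (xc, sigma), and the compass search is a hypothetical helper written for this example. The sketch pairs each cumulative count with the upper edge of its bin, since a cumulative count includes every point up to that edge. Pairing it with the bin center instead would shift the fitted center lower by roughly half a bin width.

```python
import math
import random
from itertools import accumulate

random.seed(0)  # arbitrary seed (an assumption) so the sketch is repeatable
data = [random.gauss(0.0, 1.0) * 2 + 5 for _ in range(1000)]

BIN = 0.5  # bin size used in the tutorial

# Frequency counts; assumption: the first bin starts at a multiple of BIN.
start = math.floor(min(data) / BIN) * BIN
nbins = int((max(data) - start) / BIN) + 1
counts = [0] * nbins
for v in data:
    counts[int((v - start) / BIN)] += 1

cumulative = list(accumulate(counts))                   # running total of counts
edges = [start + (i + 1) * BIN for i in range(nbins)]   # upper edge of each bin


def norm_cdf(x, xc, sigma):
    """Normal cumulative distribution function."""
    return 0.5 * (1 + math.erf((x - xc) / (sigma * math.sqrt(2))))


def best_amplitude(xc, sigma):
    """Least-squares amplitude for fixed xc and sigma (model is linear in it)."""
    p = [norm_cdf(x, xc, sigma) for x in edges]
    den = sum(pi * pi for pi in p)
    num = sum(c * pi for c, pi in zip(cumulative, p))
    return num / den if den > 0 else 0.0


def sse(params):
    """Sum of squared residuals between cumulative counts and amp * CDF."""
    xc, sigma = params
    if sigma <= 0:
        return math.inf
    amp = best_amplitude(xc, sigma)
    return sum((c - amp * norm_cdf(x, xc, sigma)) ** 2
               for x, c in zip(edges, cumulative))


def compass_search(f, x0, step=0.5, tol=1e-6):
    """Try +/- step on each coordinate; halve the step when nothing improves."""
    x, best = list(x0), f(x0)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                trial = x[:]
                trial[i] += d
                val = f(trial)
                if val < best:
                    x, best, improved = trial, val, True
        if not improved:
            step /= 2
    return x


# Initial guess: first bin edge at which half of the points have been counted.
xc0 = next(e for e, c in zip(edges, cumulative) if c >= cumulative[-1] / 2)
xc_fit, sigma_fit = compass_search(sse, [xc0, 1.0])
amp_fit = best_amplitude(xc_fit, sigma_fit)
print(f"xc = {xc_fit:.3f}, sigma = {sigma_fit:.3f}, amplitude = {amp_fit:.1f}")
```

Here the fitted amplitude should be close to the total number of points, because the cumulative counts are not normalized to 1.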

Once you have binned data from the Frequency Counts tool, you can also fit it with a user-defined probability density function or cumulative distribution function. View this page for details on defining and fitting with a user-defined function.