2.13.1.6 diststats

Additional Information

Minimum Origin Version Required: 8.0 SR6

Command Line Usage

diststats iy:=col(3) percent:=col(4) quantile:=col(5);

X-Function Execution Options

Please refer to the page for additional option switches when accessing the X-Function from script.

Variables

| Display Name | Variable Name | I/O and Type | Default Value | Description |
|---|---|---|---|---|
| Input | iy | Input XYRange | <active> | Specify the input data. |
| Percent of Integral Area | percent | Input vector | <unassigned> | Related to the quantile variable. Assign a vector to this variable; for each value k in the vector, the kth percentile is output. |
| Peak Direction | dir | Input int | 1 | This X-Function assumes that the input data contains a peak and locates it. Specify the direction of the peak. Option list: neg:Negative (the peak is a negative one; the data point with the minimum Y value is regarded as the peak); pos:Positive (the peak is a positive one; the data point with the maximum Y value is regarded as the peak). |
| Index of Peak | ipeak | Output int | <unassigned> | Specify the output for the index of the data point that is the peak in the input data. |
| X Peak | xpeak | Output double | <unassigned> | Specify the output for the X value of the data point that is the peak in the input data. |
| Y Peak | ypeak | Output double | <unassigned> | Specify the output for the Y value of the data point that is the peak in the input data. |
| Mean | mean | Output double | <unassigned> | Specify the output for the mean of the input dataset. See its computation in the Algorithms section. |
| Median | median | Output double | <unassigned> | Specify the output for the median of the input dataset. See its computation in the Algorithms section. |
| Quantile | quantile | Output vector | <unassigned> | Specify the output for the quantiles corresponding to the values in the vector assigned to the percent variable. |
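
The Peak Direction option and the three peak outputs can be illustrated with a short Python sketch. This is not Origin code: `locate_peak` is a hypothetical helper, the direction strings mirror the option names, and the returned index is 0-based, which may differ from the index the X-Function reports.

```python
def locate_peak(x, y, direction="pos"):
    """Return (index, x_value, y_value) of the peak.

    direction="pos": the point with the maximum Y value is the peak.
    direction="neg": the point with the minimum Y value is the peak.
    The index is 0-based (an assumption of this sketch).
    """
    if len(x) != len(y) or len(y) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    if direction == "pos":
        i = max(range(len(y)), key=lambda j: y[j])
    elif direction == "neg":
        i = min(range(len(y)), key=lambda j: y[j])
    else:
        raise ValueError("direction must be 'pos' or 'neg'")
    return i, x[i], y[i]
```

For example, `locate_peak([0, 1, 2, 3], [1, 5, 2, 0.5], "pos")` picks the second point, while `"neg"` picks the last one.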

Description

This X-Function computes distribution statistics on an XY range. The X values of the input data must be monotonically increasing and the Y values must be greater than 0; otherwise, the X-Function cannot be used.
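
The input requirements above can be checked before running the X-Function. The following is a minimal Python sketch, not Origin code: `is_valid_input` is a hypothetical helper, and it assumes that "monotonically increasing" means strictly increasing, which the text does not state.

```python
def is_valid_input(x, y):
    """Return True if X is strictly increasing and every Y is greater than 0."""
    if len(x) != len(y) or len(x) == 0:
        return False
    # Each X value must be larger than the one before it (assumed strict).
    increasing = all(a < b for a, b in zip(x, x[1:]))
    # Every Y value must be positive.
    positive = all(v > 0 for v in y)
    return increasing and positive
```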

Example

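fname$ = system.path.program$ + "Samples\Curve Fitting\Gaussian.dat"; // sample data file
newbook; // create a new workbook
impasc; // import the file named in fname$
dataset aa={0,0.01,25,50,99.99,100}; // percentages to evaluate
col(4)=aa; // store the percentages in column 4
diststats iy:=col(2) percent:=aa quantile:=col(5); // write the quantiles to column 5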

Algorithms

Mean

The weighted arithmetic mean is calculated as

mean = \frac{\sum_{i=1}^{n} y_i \cdot x_i}{\sum_{i=1}^{n} y_i}

where x_i and y_i are the X and Y values of the ith data point in the input dataset.
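
As an illustration of the formula, here is a minimal Python sketch of the Y-weighted mean of X. It is not Origin code; `weighted_mean` is a hypothetical name used only here.

```python
def weighted_mean(x, y):
    """Mean of the X values, with each x_i weighted by its Y value y_i."""
    total = sum(y)  # denominator: sum of all Y values
    return sum(xi * yi for xi, yi in zip(x, y)) / total
```

For x = [1, 2, 3] and y = [1, 2, 1], the result is (1 + 4 + 3) / 4 = 2.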

Median

First, the sum of the Y values of all data points is calculated:

S = \sum_{i=1}^{n} y_i

Let M = 0.5 \cdot S. For the input data, there is always an index k such that:

\sum_{i=1}^{k} y_i \le M < \sum_{i=1}^{k+1} y_i

The median is then the X value of the kth data point: Median = x_k.
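
The median rule above can be sketched in Python as follows. This is an illustration, not Origin code: `weighted_median` is a hypothetical name, and the handling of the case where the first Y value alone exceeds M (no valid k of at least 1) is an assumption of this sketch, since the text does not cover it.

```python
def weighted_median(x, y):
    """Return x_k, where k is the last 1-based index whose cumulative Y sum is <= M."""
    half = 0.5 * sum(y)  # M = 0.5 * S
    running = 0.0        # cumulative sum of Y over the first k points
    k = 0                # 1-based index from the formula
    for yi in y:
        if running + yi > half:
            break        # adding point k+1 would exceed M
        running += yi
        k += 1
    if k == 0:
        return x[0]      # assumption: fall back to the first point
    return x[k - 1]      # convert the 1-based k to a 0-based list index
```

With x = [10, 20, 30, 40] and equal weights y = [1, 1, 1, 1], S = 4 and M = 2, so k = 2 and the sketch returns 20.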

Quantile

The whole input curve is integrated to obtain the total absolute area. Then, for each value k in the vector assigned to the percent variable, the function finds a value m such that integrating the input curve from the first X value to m gives an absolute area equal to k percent of the total area.
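
A rough Python sketch of this search is shown below. It is not Origin code, and two choices are assumptions of the sketch rather than documented behavior: the area is accumulated with the trapezoidal rule, and m is found by linear interpolation on the cumulative area within a segment. `quantiles_by_area` is a hypothetical name.

```python
def quantiles_by_area(x, y, percents):
    """For each k in percents, find m so that the area from x[0] to m is k% of the total."""
    # Cumulative absolute area at each data point (trapezoidal rule, assumed).
    cum = [0.0]
    for i in range(1, len(x)):
        step = 0.5 * (abs(y[i - 1]) + abs(y[i])) * (x[i] - x[i - 1])
        cum.append(cum[-1] + step)
    total = cum[-1]

    result = []
    for k in percents:
        target = total * k / 100.0
        for i in range(1, len(x)):
            if target <= cum[i]:
                seg = cum[i] - cum[i - 1]
                # Linear interpolation on cumulative area (an approximation).
                frac = (target - cum[i - 1]) / seg if seg > 0 else 0.0
                result.append(x[i - 1] + frac * (x[i] - x[i - 1]))
                break
        else:
            result.append(x[-1])  # target beyond the total area: clamp to the last X
    return result
```

For a flat curve with x = [0, 1, 2] and y = [1, 1, 1], the percentages 0, 25, 50, and 100 map to 0, 0.5, 1, and 2.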

Related X-Functions

stats