1.12.2 Statistics

Statistics are often computed on selected worksheet data: a single column, a single row, or the entire worksheet. The Working with Data: Numeric Data: DataRange chapter shows how to construct a data range object by column and row index and then copy the raw data into a vector.

Descriptive Statistics on Columns and Rows

The ocmath_basic_summary_stats function computes basic descriptive statistics on raw data, such as the number of points, mean, standard deviation, and skewness. For more details, refer to the Origin C help. The following Origin C code calculates and prints the number of points, the mean, and the standard error of the mean for the data in the vector object named vData.

int N;
double Mean, SE;
ocmath_basic_summary_stats(vData.GetSize(), vData, &N, &Mean, NULL, &SE);
printf("N=%d\nMean=%g\nSE=%g\n", N, Mean, SE);

Frequency Count

The ocmath_frequency_count function is used to calculate the frequency count, according to the options in the FreqCountOptions structure.

// Source data to do frequency count
vector vData = {0.11, 0.39, 0.43, 0.54, 0.68, 0.71, 0.86};

// Set options, including bin size, from, to and border settings.
int nBinSize = 5;	
FreqCountOptions fcoOptions;    
fcoOptions.FromMin = 0;
fcoOptions.ToMax = 1;
fcoOptions.StepSize = nBinSize;
fcoOptions.IncludeLTMin = 0;
fcoOptions.IncludeGEMax = 0;

vector vBinCenters(nBinSize);
vector vAbsoluteCounts(nBinSize);
vector vCumulativeCounts(nBinSize);
int nOption = FC_NUMINTERVALS; // to extend last bin if not a full bin

int nRet = ocmath_frequency_count(
    vData, vData.GetSize(), &fcoOptions,
    vBinCenters, nBinSize, vAbsoluteCounts, nBinSize,
    vCumulativeCounts, nBinSize, nOption);

if( STATS_NO_ERROR == nRet )
    out_str("Done");
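
As an illustration of what a frequency count produces, here is a minimal sketch in standard C that bins values into equal-width intervals and fills bin centers, absolute counts, and cumulative counts. It is not the Origin routine: frequency_count is a hypothetical helper, and the border rule (each bin includes its lower edge and excludes its upper edge, and values outside the range are skipped) is an assumption chosen to mirror the IncludeLTMin = 0 and IncludeGEMax = 0 settings by name only.

```c
#include <stddef.h>

/* Hypothetical helper for illustration; not part of Origin C.
   Counts values into nbins equal-width bins covering [from, to).
   Each bin includes its lower edge and excludes its upper edge (assumed rule).
   Values outside [from, to) are skipped. */
void frequency_count(const double *data, size_t n,
                     double from, double to, size_t nbins,
                     double *centers, double *counts, double *cumulative)
{
    if (nbins == 0)
        return;

    double width = (to - from) / (double)nbins;

    /* Initialize bin centers and zero the counts. */
    for (size_t b = 0; b < nbins; ++b) {
        centers[b] = from + ((double)b + 0.5) * width;
        counts[b] = 0.0;
    }

    /* Assign each in-range value to its bin. */
    for (size_t i = 0; i < n; ++i) {
        if (data[i] < from || data[i] >= to)
            continue;
        size_t b = (size_t)((data[i] - from) / width);
        if (b >= nbins)
            b = nbins - 1;  /* guard against rounding at the upper edge */
        counts[b] += 1.0;
    }

    /* Running total of the absolute counts. */
    double running = 0.0;
    for (size_t b = 0; b < nbins; ++b) {
        running += counts[b];
        cumulative[b] = running;
    }
}
```

Called with the seven sample values above, a range of 0 to 1, and five bins, this sketch yields absolute counts of 1, 1, 2, 2, 1 and cumulative counts of 1, 2, 4, 6, 7.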

In addition, two functions compute frequency counts for discrete or categorical data: ocu_discrete_frequencies for text data and ocmath_discrete_frequencies for numeric data. Two further functions compute frequency counts in two dimensions: ocmath_2d_binning_stats and ocmath_2d_binning.

Correlation Coefficient

The ocmath_corr_coeff function calculates the Pearson, Spearman rank, and Kendall rank correlation coefficients. Each result is returned as a square matrix with one row and one column per column of the input data.

matrix mData = {{10,12,13,11}, {13,10,11,12}, {9,12,10,11}}; 
int nRows = mData.GetNumRows();
int nCols = mData.GetNumCols();

matrix mPeaCorr(nCols, nCols);
matrix mPeaSig(nCols, nCols);

matrix mSpeCorr(nCols, nCols);
matrix mSpeSig(nCols, nCols);

matrix mKenCorr(nCols, nCols);
matrix mKenSig(nCols, nCols);

if(STATS_NO_ERROR == ocmath_corr_coeff(nRows, nCols, mData, mPeaCorr, mPeaSig, 
	mSpeCorr, mSpeSig, mKenCorr, mKenSig))
{ 
	out_str("Done");
}

Normality Test

Use the ocmath_shapiro_wilk_test function to perform a Shapiro-Wilk normality test, the ocmath_lilliefors_test function to perform a Lilliefors normality test, and the ocmath_kolmogorov_smirnov_test function to perform a Kolmogorov-Smirnov normality test.

vector vTestData = {0.11, 0.39, 0.43, 0.54, 0.68, 0.71, 0.86};

NormTestResults SWRes;
if( STATS_NO_ERROR == ocmath_shapiro_wilk_test(vTestData.GetSize(), vTestData, 
		&SWRes, 1) )
{
	printf("DOF=%d, TestStat=%g, Prob=%g\n", SWRes.DOF, SWRes.TestStat, SWRes.Prob);
}