2.4.1 findBase


Brief Information

Find a region of XY data suitable for baseline

Additional Information

Minimum Origin Version Required: 8.0 SR5

Command Line Usage

1. findBase iy:=(1,2) pts:=4 max:=0.5;

X-Function Execution Options

Please refer to the X-Function Execution Options page for additional option switches when accessing the X-Function from script.

Variables

| Display Name | Variable Name | I/O and Type | Default Value | Description |
|---|---|---|---|---|
| Input Data | iy | Input XYRange | <active> | Specify the input data. |
| Minimum Continuous Flat Points (0.5 = half, 5 = 5 points) | flat | Input double | 15 | Used only when dir is set to 0. Defines the minimum length of the baseline: a flat section longer than this value is regarded as the baseline. A value smaller than 1 is interpreted as the ratio of the baseline length to the length of the whole input curve; otherwise it is interpreted as the minimum baseline length in points. |
| Region Options (0 = any, >0 from beginning, <0 from end) | dir | Input int | 0 | Determines the position of the baseline. If the value is 0, the baseline can appear anywhere on the curve; if it is positive, the baseline is at the beginning of the curve; if it is negative, the baseline is at the end of the curve. For a nonzero value, the absolute value is used as the distance between the first point searched for the baseline (the first point with which the piecewise linear fit begins) and the beginning (positive value) or the end (negative value) of the curve. |
| Slope Threshold | h | Input double | 0 | Specify a slope threshold. The slope of the baseline should be less than this. |
| Initial Points (0.01 = 1%, 5 = 5 points) | pts | Input double | 4 | If dir is not equal to 0, specify the number of points in the first segment on which the linear fit is performed to find the baseline. If dir = 0, this is the size of all linear fit segments. |
| Check Linear Tolerance | tol | Input double | 10 | Specify the tolerance factor by which the slope/intercept error value is multiplied in the continuous linear test. |
| Max Number of Points to Search (0.5 = half, 50 = 50 points) | max | Input double | 0.5 | Specify the maximum number of points used for searching the baseline. A value less than 1 is interpreted as a fraction of the size of the whole curve; otherwise it is interpreted as the maximum number of points used for searching the baseline. |
| Number of Points to Increment to Search | step | Input double | 0 | If dir = 0, this variable has no effect. Otherwise, it specifies the distance between two adjacent segments on which the linear fit is performed to find where the linear segment ends. |
| Begin Index of Base Region | i1 | Output int | <> | Specify the output for the beginning index of the baseline. |
| End Index of Base Region | i2 | Output int | <> | Specify the output for the ending index of the baseline. |
| Intercept | a | Output double | <> | Specify the output for the intercept of the baseline. |
| Slope | b | Output double | <> | Specify the output for the slope of the baseline. |
| Intercept Error | aerr | Output double | <> | Specify the output for the intercept error of the baseline. |
| Slope Error | berr | Output double | <> | Specify the output for the slope error of the baseline. |
| Pearson r (correlation coefficient) | r | Output double | <> | Specify the output for the Pearson correlation coefficient of the linear fit used to find the baseline. |
| Fitted Line | oy | Output XYRange | <optional> | Specify the output range of the fitted baseline data. |
| 1 = Show Internal Messages | cntrl | Input int | 0 | Specify whether to show the internal messages. |
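
The flat and max variables accept either a fraction of the curve (values below 1) or a point count, and the display name of pts suggests the same convention. The Python sketch below illustrates that convention only. The helper name resolve_point_count and the rounding of fractional results are assumptions made for illustration and are not part of the X-Function.

```python
def resolve_point_count(value, curve_size):
    """Turn a findBase-style size setting into a number of points.

    Values below 1 are treated as a fraction of the curve size,
    all other values as a point count. Rounding to the nearest
    integer is an assumption of this sketch.
    """
    if value < 1:
        return int(round(value * curve_size))
    return int(value)


# Example settings for a curve with 200 points
half_curve = resolve_point_count(0.5, 200)    # max := 0.5
fifty_points = resolve_point_count(50, 200)   # max := 50
print(half_curve, fifty_points)
```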

Description

This function finds the baseline region of data that contain peaks. It performs a piecewise linear fit on sections of the data and compares the change in slope with the error estimate to determine whether a linear region has started to turn.

Example

  • Code Sample
// single peak integration by auto finding base
// on both sides of the peak
newbook;
string ff$=system.path.program$+ "Samples\Curve Fitting\Gaussian.dat";
impASC ff$;
plotxy (1,2); 

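// search from the left (d > 0): the end index of this baseline
// marks the left edge of the peak
// (d, f, p and t are shortened forms of dir, flat, pts and tol)
findbase d:=2 f:=0.1 p:=6 t:=5;
i1 = findbase.i2;
// search from the right (d < 0): the begin index of this baseline
// marks the right edge of the peak
findbase d:=-2 f:=0.1 p:=6 t:=5;
i2 = findbase.i1;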

range aa = 1[$(i1):$(i2)];
integ1 aa oy:=<optional>;
type "Integration of "+integ1.iy$;
type "from "+integ1.x1$+" to "+integ1.x2$;
type "Area = $(integ1.area)";
type "Center=$(integ1.x0)";
// also draw vertical lines to show on graph the integration range
draw -l -v integ1.x1;
draw -l -v integ1.x2;

Algorithm

When dir > 0

This means that the baseline, which is a continuous linear segment, appears at the beginning of the input curve.

The X-Function performs a piecewise linear fit to find such a continuous linear segment. A linear fit is performed on successive sections of the data, and the change in slope between sections is used to determine whether the linear region has ended.

Let us set the following:

n = npts + step;

piece_size = 3 * step;

m = n + piece_size;

where npts is the number of initial points specified by the pts variable, and step is the step variable of the X-Function.
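
As a worked example of the three quantities above, the following Python sketch computes n, piece_size and m for a given pair of settings. The function name second_segment_bounds is hypothetical. The example uses a nonzero step, because this page does not state how the default step = 0 is handled.

```python
def second_segment_bounds(npts, step):
    """Return (n, m) for the second fit segment, following the
    formulas n = npts + step, piece_size = 3 * step, m = n + piece_size."""
    n = npts + step
    piece_size = 3 * step
    m = n + piece_size
    return n, m


# With 6 initial points and a step of 2, the second fit
# covers the data from point n = 8 to point m = 14.
n, m = second_segment_bounds(6, 2)
print(n, m)
```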

First, the X-Function performs a linear fit on the first npts points, and the resulting slope is stored as v0. A linear fit is then performed on the nth to mth data points; its slope is stored as vn and its slope error as vErr. The X-Function then checks whether the following is true:

abs(vn - v0) > vErr*dR where dR is the tolerance specified by the tol variable.

If this is true, it means that the linear region has ended. Only the section that contains the first npts points is the linear part. And this part will be viewed as the baseline.

If this is not true, it means that the linear region continues in the second segment. The X-Function continues to search where the linear part ends. The v0 is set as vn. Linear fit is performed on the next segment which is step points to the right of the current segment. The new vn is set to the slope value of this linear fit. Also, vErr is set to the new slope error. Again, the X-Function checks whether the linear part has ended with

abs(vn - v0) > vErr*dR

This repeats until the X-Function finds that the linear part has ended or the maximum number of data points for the piecewise linear fit (see the max variable) has been reached. The continuous linear region that starts from the beginning of the input curve is taken as the candidate baseline. Finally, this candidate baseline is checked to see whether it is good enough. See checking the candidate baseline below.
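
The search described above can be sketched in Python as follows. This is an illustration under stated assumptions, not Origin's implementation: the names linear_fit and find_linear_end are hypothetical, segments are taken as half-open index ranges, the slope error is the ordinary least-squares standard error, the linear region is assumed to end at the last accepted segment, and step must be positive.

```python
import math


def linear_fit(xs, ys):
    """Least-squares line y = a + b*x; returns (a, b, slope_error)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    slope_error = math.sqrt(rss / (n - 2) / sxx) if n > 2 else 0.0
    return a, b, slope_error


def find_linear_end(xs, ys, npts, step, tol, max_pts):
    """Return the number of leading points judged to be linear."""
    if step <= 0:
        raise ValueError("this sketch needs step > 0")
    _, v0, _ = linear_fit(xs[:npts], ys[:npts])
    limit = min(max_pts, len(xs))
    end = npts                      # at least the first npts points
    n = npts + step
    piece_size = 3 * step
    while n + piece_size <= limit:
        m = n + piece_size
        _, vn, v_err = linear_fit(xs[n:m], ys[n:m])
        if abs(vn - v0) > v_err * tol:
            break                   # the linear region has ended
        end = m                     # assumption: accept up to this segment
        v0 = vn
        n += step                   # move step points to the right
    return end


# Flat, slightly noisy baseline followed by a steep rise
xs = list(range(50))
ys = [0.01 * (-1) ** i for i in range(30)] + [float(i - 29) for i in range(30, 50)]
print(find_linear_end(xs, ys, npts=6, step=2, tol=1, max_pts=50))
```

A larger tol makes the test more lenient, so the detected region can extend past the true end of the baseline. This is one reason the candidate baseline is checked again afterwards.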

When dir < 0

This means that the baseline, which is a continuous linear segment, appears at the end of the input curve. The basic idea for finding this continuous linear segment is the same as in the "dir > 0" case, except that the piecewise linear fit begins from the rightmost point of the input curve.
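
One way to picture the "dir < 0" case is to run a left-to-right search on the reversed curve and map the result back to the original indices. The Python sketch below assumes that this reversal is equivalent to fitting from the rightmost point. The names search_from_end and count_leading_flat are hypothetical, and count_leading_flat is only a toy stand-in for the real piecewise linear search.

```python
def count_leading_flat(xs, ys):
    """Toy stand-in for a left-to-right search: count leading
    points whose absolute y value stays below 0.1."""
    count = 0
    for y in ys:
        if abs(y) >= 0.1:
            break
        count += 1
    return count


def search_from_end(xs, ys, search_from_start):
    """Apply a left-to-right search to the reversed curve and return
    the 0-based (begin, end) indices of the trailing baseline."""
    n = len(xs)
    length = search_from_start(xs[::-1], ys[::-1])
    return n - length, n - 1


xs = list(range(7))
ys = [5.0, 4.0, 3.0, 0.0, 0.0, 0.0, 0.0]
print(search_from_end(xs, ys, count_leading_flat))
```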

When dir = 0

This means that the baseline can appear at any location on the input curve.

The X-Function searches for a continuous linear segment from left to right, using the method described for the "dir > 0" case. As soon as it finds a continuous linear segment longer than the minimum baseline length determined by the flat variable, the search stops and that segment is taken as the candidate baseline. The candidate baseline is then checked to see whether it is good enough. See checking the candidate baseline below.
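
A simplified Python sketch of this "any position" search is given below. It rests on several assumptions: consecutive, non-overlapping segments of a fixed size (corresponding to pts) are fitted, a run is restarted whenever the slope test fails, and the search stops at the first run longer than min_len (corresponding to flat). The names fit_slope and find_flat_run are hypothetical, and the real X-Function may differ in these details.

```python
import math


def fit_slope(xs, ys):
    """Least-squares slope and its standard error for y = a + b*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    err = math.sqrt(rss / (n - 2) / sxx) if n > 2 else 0.0
    return b, err


def find_flat_run(xs, ys, seg_size, tol, min_len):
    """Return 0-based (begin, end) of the first linear run longer
    than min_len points, or None if no such run exists."""
    start = 0
    v0, _ = fit_slope(xs[:seg_size], ys[:seg_size])
    pos = seg_size
    while pos + seg_size <= len(xs):
        vn, v_err = fit_slope(xs[pos:pos + seg_size], ys[pos:pos + seg_size])
        if abs(vn - v0) > v_err * tol:
            start = pos                      # run broken, restart here
        elif pos + seg_size - start > min_len:
            return start, pos + seg_size - 1
        v0 = vn
        pos += seg_size
    return None


# Falling parabola followed by a flat, slightly noisy region
xs = list(range(40))
ys = [float((10 - i) ** 2) for i in range(10)] + [0.01 * (-1) ** i for i in range(10, 40)]
print(find_flat_run(xs, ys, seg_size=4, tol=1, min_len=12))
```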

Checking the candidate baseline

The candidate baseline is checked to see whether it is good enough. All linear parts within it are found, and only the part whose slope is less than the slope threshold specified by the h variable of the X-Function is used as the final baseline.
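
The slope check can be illustrated with the short Python sketch below. It assumes that the linear parts of the candidate are already available as (begin, end, slope) tuples, that the comparison uses the absolute value of the slope, and that the first qualifying part is kept. The name select_baseline_part is hypothetical.

```python
def select_baseline_part(parts, slope_threshold):
    """Return (begin, end) of the first linear part whose absolute
    slope is below the threshold, or None if no part qualifies."""
    for begin, end, slope in parts:
        if abs(slope) < slope_threshold:
            return begin, end
    return None


# Two linear parts inside a candidate baseline: a steep one and a flat one
parts = [(0, 9, 0.8), (10, 30, 0.02)]
print(select_baseline_part(parts, slope_threshold=0.1))
```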

Keywords:spectroscopy