2.75 FAQ-820 How to work with large datasets in Origin?
Last Update: 7/13/2018
Data Size Limits
Data in Origin can be contained in workbooks and matrices. Each workbook can contain up to 1024 worksheets. Within a given worksheet, the number of columns can be up to 65500, and the number of rows can be up to 90 million (64-bit OS). In practical terms, the maximum limits may be lower depending on your available system resources.
A matrix window can contain up to 1024 matrix sheets. Each sheet can contain up to 90 million matrix columns (1 row) or 90 million rows (1 column). Again, the maximum limits may be lower depending on your available system resources.
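As a rough illustration of why system resources tend to become the limit before the stated maximums do, the sketch below estimates the raw storage needed for numeric data. It assumes 8 bytes per value (one double-precision number per cell). Both that figure and the function name are assumptions made for illustration, and Origin's actual memory use will include additional overhead.

```python
BYTES_PER_VALUE = 8  # assumption: one double-precision value per cell


def estimate_bytes(rows, cols, bytes_per_value=BYTES_PER_VALUE):
    """Return the raw storage in bytes for a rows x cols block of numeric cells."""
    return rows * cols * bytes_per_value


# One column filled to the 90 million row limit quoted above.
full_column = estimate_bytes(90_000_000, 1)
print(f"{full_column / 1e6:.0f} MB for one full-length column")
```

Under this assumption, a single full-length column already needs several hundred megabytes, so a worksheet with many such columns can exceed available memory well before it reaches the column limit.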
Sampling Interval Support
Origin worksheet columns support a Sampling Interval property. If the X values associated with a Y dataset are evenly spaced, that information can be stored as the Sampling Interval of the Y column. Because the X column is then no longer needed as an explicit column in the worksheet, the size of the Origin Project can be reduced by up to about 50% for a single X-Y pair. This also improves the plotting and analysis speed of large datasets, as the X information does not have to be read point by point from a worksheet column.
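The idea behind a sampling interval can be sketched in a few lines of Python. This is not Origin code: the function name, the start value x0, and the interval dx are hypothetical names chosen for illustration. It only shows that evenly spaced X values can be regenerated on demand from two numbers instead of being stored point by point.

```python
def x_from_sampling(x0, dx, n):
    """Reconstruct n evenly spaced X values from a start value and an interval."""
    return [x0 + i * dx for i in range(n)]


y = [0.0, 0.8, 0.9, 0.1, -0.7]  # only the Y values are stored explicitly
x0, dx = 0.0, 0.5               # hypothetical start value and sampling interval

x = x_from_sampling(x0, dx, len(y))
print(x)

# Compare the number of stored values with and without an explicit X column.
stored_with_x = 2 * len(y)      # explicit X column plus Y column
stored_without_x = len(y) + 2   # Y column plus the two numbers x0 and dx
print(stored_with_x, stored_without_x)
```

For long columns the two extra numbers are negligible, which is where the near-halving of storage for a single X-Y pair comes from.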
Graphing Large Datasets
By default, Origin hides points when plotting large datasets. This is referred to as Speed Mode. For plotting matrix data in Speed Mode, fixed increments in both the X and Y dimensions are used to skip points. For worksheet data, a more sophisticated Speed Mode mechanism examines the nature of the data and selects a subset of points that represents the overall data shape. In the Speed Mode dialog (open with menu Graph: Speed Mode), you can select a Speed Mode option of Low, Medium, High, or a Custom setting. The setting can also be saved as part of the Graph template, or as a theme for use with other graphs.
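The two strategies described above can be sketched in Python as follows. The first function skips points at fixed increments in both dimensions, as described for matrix data. The second keeps the minimum and maximum of each bucket of points, which is one common way to preserve the shape of a curve. It is an assumption used here for illustration, not Origin's documented worksheet algorithm, and all function names are hypothetical.

```python
def skip_matrix(matrix, step_x, step_y):
    """Keep every step_y-th row and every step_x-th column of a 2D list."""
    return [row[::step_x] for row in matrix[::step_y]]


def minmax_decimate(y, bucket):
    """Keep the min and max of each bucket of points, in original order.

    Returns (index, value) pairs so the kept points can still be plotted
    at their original positions.
    """
    kept = []
    for start in range(0, len(y), bucket):
        chunk = y[start:start + bucket]
        lo = min(range(len(chunk)), key=chunk.__getitem__)
        hi = max(range(len(chunk)), key=chunk.__getitem__)
        for i in sorted({lo, hi}):
            kept.append((start + i, chunk[i]))
    return kept


# Matrix case: a 4 x 6 matrix thinned by a fixed increment of 2 in X and Y.
matrix = [[r * 10 + c for c in range(6)] for r in range(4)]
small = skip_matrix(matrix, step_x=2, step_y=2)
print(small)

# Worksheet case: the peak (5) and the dip (-4) survive the reduction.
y = [0, 1, 5, 2, 1, 0, -4, 1, 2]
subset = minmax_decimate(y, bucket=3)
print(subset)
```

A fixed increment is cheap but can step over narrow peaks, whereas a shape-aware selection keeps extremes at the cost of examining every point once.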
Importing Large Datasets
Many of Origin's import routines support partial importing, allowing you, for example, to import 5 rows, skip the next 20, import the next 5 and so on. A partial import of a large dataset lets you quickly examine the nature of the data, and also try various plotting and analysis routines on the data subset, rather than on the entire dataset. Once your graphs and analyses have been optimized you can use Origin's Re-Import feature to re-import the file in its entirety. Analysis results set to automatically recalculate, as well as any graphs you have created, will automatically update using the full dataset.
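The "import 5 rows, skip the next 20" pattern can be sketched outside Origin as a simple row filter. The function below is a hypothetical illustration in Python, not one of Origin's import routines. It works on any iterable of rows, such as an open text file.

```python
def partial_rows(rows, take, skip):
    """Yield `take` rows, skip the next `skip` rows, and repeat to the end."""
    period = take + skip
    for index, row in enumerate(rows):
        if index % period < take:
            yield row


# Stand-in for the lines of a large data file.
lines = [f"row {i}" for i in range(60)]

preview = list(partial_rows(lines, take=5, skip=20))
print(len(preview), preview[0], preview[5], preview[-1])
```

Because the filter is a generator, it reads the source one row at a time and never holds the full dataset in memory, which is the point of previewing a large file this way.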
Analyzing Large Datasets
In addition to the ability to perform a partial import, Origin has flexible tools for graphically selecting a range of your data. These Region of Interest (ROI) tools make it possible to perform calculations, data processing, and data analysis on a subset of your already imported data. There are several tools for selecting data:
Data Selector, data markers
Graphically define a range for analysis using a beginning and ending data marker (Data Selector button on the Tools toolbar).
Regional Data Selector
Graphically define one or more ranges on one or more curves for analysis. The Regional Data Selector tools on the Tools toolbar include the option of selecting only the active curve, or all curves inside the region. When making your selection, you can choose to use a rectangular window or a freeform shape. Analysis markers define the selected ranges.
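Conceptually, a rectangular region of interest is a filter on the X and Y coordinates of the plotted points. The sketch below shows that idea in plain Python. The function name and sample data are hypothetical, and it does not reproduce how Origin stores or applies its analysis markers.

```python
def select_in_rectangle(points, x_min, x_max, y_min, y_max):
    """Return the indices of (x, y) points that fall inside the rectangle."""
    return [
        i
        for i, (x, y) in enumerate(points)
        if x_min <= x <= x_max and y_min <= y <= y_max
    ]


points = [(0, 1.0), (1, 2.5), (2, 3.1), (3, 0.4), (4, 2.9)]

# Select the region 1 <= x <= 4, 2.0 <= y <= 3.5, then analyze only that subset.
roi = select_in_rectangle(points, 1, 4, 2.0, 3.5)
subset_mean = sum(points[i][1] for i in roi) / len(roi)
print(roi, round(subset_mean, 3))
```

Returning indices rather than copied values keeps the selection lightweight, since later calculations can refer back to the original data.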
Reduce Rows or Columns
You can reduce data before performing analysis or graphing. Please refer to this blog post for details.
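One simple way to reduce rows is to replace each group of N consecutive values with their mean. The Python sketch below illustrates that approach under stated assumptions: the function name is hypothetical, and group averaging is only one of several possible reduction methods, chosen here for illustration.

```python
def reduce_by_mean(values, n):
    """Replace each group of n consecutive values with the group's mean.

    A final group shorter than n is averaged over the values it contains.
    """
    reduced = []
    for start in range(0, len(values), n):
        group = values[start:start + n]
        reduced.append(sum(group) / len(group))
    return reduced


data = [1, 2, 3, 4, 5, 6, 7]
print(reduce_by_mean(data, 3))
```

Averaging smooths noise as it reduces the row count. If individual peaks matter, simply keeping every N-th row or the extremes of each group may be a better fit.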
Batch Process Multiple Data Files
Origin provides batch processing tools for analyzing or plotting multiple files. Please refer to this blog post and these tutorials for details.
Keywords: large dataset, limitation, sampling interval, speed mode, skip rows, reduce, batch processing, regional selector, region