Key Features

The key features of DataModeler are:

  • Symbolic Regression via Genetic Programming — developed models are human-interpretable expressions. This lets the data define models that can easily be examined to yield insight and understanding.
  • Pareto-aware Genetic Programming — the model development explores the trade-off between model complexity and accuracy and lets the user decide in post-processing where the best balance lies.
  • Driving variable selection and variable transform identification — the most important variables and variable combinations are naturally selected during model development. Beyond the insight it gives a human analyst, this information can be exploited by other modeling techniques such as neural networks or linear model building.
  • Ability to handle ill-conditioned data — data arrays with more variables than data points, as well as correlated variables, are handled naturally, and valid models can still be developed.
  • Models with trust metrics — ensembles of independent models can be identified, with their consensus used as an indicator of model trust. This makes it possible to detect when new data falls outside the behavior seen in training or when the underlying system being modeled has changed.
  • Intuitive model exploration and performance assessment — tools are provided to explore and select models for exploitation.
  • Automatic sensitivity analysis — predictions from the selected models and model ensembles can be interactively explored to facilitate design space exploitation.
  • Model lifecycle management — efficient model development from a human perspective is critical. Tools are provided for data exploration and conditioning, model archival and model maintenance.
  • Real-world case studies — the package was developed in an industrial setting to address real-world modeling problems. Key features are demonstrated via case studies using real-world data and real-world problems of research data exploration, emulator development, inferential sensors and financial data analysis.
  • Rapid model development — the state-of-the-art algorithms are orders of magnitude faster than conventional symbolic regression algorithms. This enables timely and effective model development against real-world data sets.
  • General-purpose utilities — a variety of capabilities for applying Mathematica to real-world data and real-world demands are also included.
  • Automatic reporting and presentation capabilities — every data exploration process in DataModeler results in an interactive report.
  • Multi-core support — data analysis in DataModeler lets you fully exploit multi-core architectures to reduce computation time and increase the robustness of solutions.
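
The complexity-accuracy trade-off mentioned in the feature list can be illustrated with a small, generic sketch. This is not DataModeler code and does not use its API. It is written in Python purely for illustration, and the names Model, pareto_front and simplest_within, the complexity and error measures, and the candidate values are all hypothetical. It assumes each candidate model is summarized by two numbers that should both be small, keeps only the models that no other model beats on both counts, and then lets a user-chosen error limit pick a model in post-processing.

```python
from typing import List, NamedTuple, Optional


class Model(NamedTuple):
    """A candidate model summarized by two objectives, both minimized."""
    name: str
    complexity: float  # assumed measure, e.g. size of the expression
    error: float       # assumed measure, e.g. 1 - R^2 on the data


def pareto_front(models: List[Model]) -> List[Model]:
    """Return the non-dominated models, ordered by increasing complexity.

    Model a dominates model b when a is no worse than b in both
    objectives and strictly better in at least one.
    """
    # Sort by complexity (ties broken by error), then sweep once: a model
    # survives only if it improves on the best error seen so far.
    ordered = sorted(models, key=lambda m: (m.complexity, m.error))
    front: List[Model] = []
    best_error = float("inf")
    for m in ordered:
        if m.error < best_error:
            front.append(m)
            best_error = m.error
    return front


def simplest_within(front: List[Model], max_error: float) -> Optional[Model]:
    """Pick the least complex front model whose error is acceptable."""
    for m in front:  # front is already ordered by complexity
        if m.error <= max_error:
            return m
    return None


# Made-up candidates for illustration only.
candidates = [
    Model("m1", 3, 0.40),
    Model("m2", 5, 0.25),
    Model("m3", 6, 0.30),   # dominated by m2: more complex and less accurate
    Model("m4", 9, 0.10),
    Model("m5", 12, 0.10),  # dominated by m4: more complex, no more accurate
    Model("m6", 20, 0.08),
]

front = pareto_front(candidates)
print([m.name for m in front])            # ['m1', 'm2', 'm4', 'm6']
print(simplest_within(front, 0.15).name)  # m4
```

Deferring the choice to a function such as simplest_within mirrors the idea of deciding the balance after model development: the front is computed once, and different error limits can be tried without re-running the search.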

The illustrations page is a good place to start and to see the unique advantages of the DataModeler algorithms. The selected help extracts may also be of interest, and the publications and talks are likely worth exploring as well.