Given that it is possible to efficiently develop quality models, the auxiliary question is, "Why?" As illustrated below, there are many applications of models. It is important to understand both the application type and the associated requirements for model fidelity, durability, and timeliness in order to know when "good enough" has been achieved.
Thus, we need to define success: are we looking for insight, variable selection, predictive accuracy or model robustness and self-assessment?
The two most common uses of a model are as an emulator and as an inferential sensor. An emulator is a surrogate for another system and, as such, can be used to explore response behaviors, to identify optima or, in general, to gain insight into the underlying system.
Note that the targeted system need not be real; for example, in designing a (complex reaction) ceramic furnace, a finite-element model (FEM) might require 24 hours of compute time to produce a single data point. Thus, we might want to build an emulator of the FEM, which is itself an emulator of the furnace, so that a coarse optimization of the design parameters can be done quickly and the total design time shortened.
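The emulator-of-an-emulator idea above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `expensive_simulation` is a hypothetical toy function standing in for a slow FEM run, the single design parameter and its range are invented, and a quadratic polynomial is only one of many possible emulator forms.

```python
import numpy as np

def expensive_simulation(x):
    """Hypothetical stand-in for a slow FEM run (one call = one costly data point)."""
    return (x - 1.3) ** 2 + 0.5

# A handful of costly runs spread over the (assumed) design range [0, 3].
x_train = np.linspace(0.0, 3.0, 7)
y_train = np.array([expensive_simulation(x) for x in x_train])

# Cheap emulator: a quadratic polynomial fitted to the simulation outputs.
emulator = np.poly1d(np.polyfit(x_train, y_train, deg=2))

# Coarse optimization: evaluate the emulator (not the simulation) on a dense grid.
x_grid = np.linspace(0.0, 3.0, 3001)
x_best = x_grid[np.argmin(emulator(x_grid))]

print(f"Suggested design parameter: {x_best:.3f}")
```

Only seven calls to the expensive function are made; the thousands of evaluations needed for the grid search go to the emulator. In practice, the suggested optimum would be confirmed with a further run of the original simulation.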
An inferential sensor has the goal of inferring the current (or future) state of the system based upon available data. The distinction from an emulator is that an emulator has full awareness of the target system whereas the inferential sensor is attempting to predict a hard-to-measure quantity. For example, in a chemical plant, direct awareness of the concentration of a reactant might involve grabbing a sample and running it through a chromatograph, which would produce a direct measurement of the reactant at a point in the past. Alternatively, we can use such lab measurements to build an inferential sensor that estimates the concentration in real time from easily available measurements, thereby achieving online monitoring.
Similarly, we might want to build an inferential sensor to predict the market price and volatility of electrical energy if we were a power generator. Thus, we can also distinguish between emulators and inferential sensors by their application usage: an emulator would typically be used for optimization and exploration whereas the inferential sensor is used for prediction and monitoring.
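The chemical-plant inferential sensor described above can be sketched as follows. This is a minimal illustration under stated assumptions: the historical records are synthetic, the input names (`temperature`, `pressure`), their ranges, and the linear relationship to concentration are all invented, and ordinary least squares is used only as the simplest possible model form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical historical records: easy online measurements paired with
# delayed lab (chromatograph) results. All values here are synthetic.
n = 50
temperature = rng.uniform(300.0, 350.0, n)
pressure = rng.uniform(1.0, 5.0, n)
lab_concentration = (2.0 + 0.01 * temperature - 0.3 * pressure
                     + rng.normal(0.0, 0.01, n))

# Fit a linear model (intercept + two inputs) by least squares.
X = np.column_stack([np.ones(n), temperature, pressure])
weights, *_ = np.linalg.lstsq(X, lab_concentration, rcond=None)

def infer_concentration(temp, pres):
    """Estimate the hard-to-measure concentration from online measurements."""
    return weights[0] + weights[1] * temp + weights[2] * pres

# Online use: a new reading arrives and the estimate is available immediately.
estimate = infer_concentration(325.0, 3.0)
print(f"Estimated concentration: {estimate:.3f}")
```

The lab results are needed only to build the model; once fitted, the sensor produces an estimate for every new online reading without waiting for a sample to be analyzed.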
Variable selection is a very important product of the modeling process. Identifying and ranking the driving variables is a key ingredient in developing robust and accurate models. Related to this is the identification of variable transforms (metavariables). If we can identify meaningful transforms or combinations of variables, we can gain insight as well as, potentially, transform a nonlinear problem into a linear one. Variable selection and transforms may be used in conjunction with the developed models for human understanding and insight. This insight can accelerate research, since the results can be interpreted as cues to the underlying mechanism or can provide inspiration for new experiments or avenues of research. Finally, a significant opportunity is in the realm of nonlinear design of experiments (DoE). For example, we could start with a data set, build models and then use those models to guide us to where additional data should be gathered: either at a potential optimum or at points in parameter space where the developed models disagree and, therefore, additional information is needed.
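The model-disagreement route to choosing the next experiment can be sketched as follows. This is an illustrative assumption rather than a prescribed method: the data are a toy response, the "developed models" are simply polynomials of different degree, and disagreement is measured as the standard deviation of their predictions.

```python
import numpy as np

# Sparse data gathered so far (hypothetical): all samples lie in [0, 0.4].
x_data = np.array([0.0, 0.1, 0.2, 0.3, 0.4])
y_data = np.sin(3.0 * x_data)  # toy response standing in for real measurements

# A small ensemble of competing models: polynomials of degree 1, 2 and 3.
models = [np.poly1d(np.polyfit(x_data, y_data, deg)) for deg in (1, 2, 3)]

# Candidate locations for the next experiment.
candidates = np.linspace(0.0, 2.0, 201)

# Disagreement = standard deviation of the model predictions at each candidate.
predictions = np.array([m(candidates) for m in models])
disagreement = predictions.std(axis=0)

# Gather the next data point where the models disagree the most.
next_x = candidates[np.argmax(disagreement)]
print(f"Next experiment at x = {next_x:.2f}")
```

The models agree closely where data already exist and diverge away from them, so the selected point falls in the unexplored part of the parameter range. After the new measurement is added, the models are refitted and the process repeats.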