News archive for March 2011

DataModeler release 23.0 (7 March)

Monday, March 7, 2011

The current theme is tuning the user experience and the support for exploratory modeling, along with the ever-popular documentation development.

  • Introduced a new function, ResponsePlotExplorer, which wraps a Manipulate around the ResponsePlot of a supplied model or ensemble. By default, sliders are shown for the ModelVariables used by the model.
  • Modified BivariatePlot to use a GridTable rather than a GraphicsGrid. This allows labels on the frame of the grid of plots, which is useful when long DataVariableLabels are used or more than a handful of variables are plotted. The DataVariableLabels are no longer shown on the diagonal histograms; however, each histogram now has a tooltip showing the data variable label of the column being plotted.
  • Modified CorrelationMatrixPlot to handle missing elements in the nominally numeric data columns. Previously, a non-numeric entry would cause any correlations dependent upon that column to appear as blood red. Now a warning message is presented and any pairs of values containing a non-numeric entry are deleted from the Correlation calculation.
  • Renamed the ModelingVariables option for VariablePresenceMap to VariablesToPlot, which makes the purpose of the option clearer. ModelingVariables is now used strictly by SymbolicRegression to define the (possible) subset of supplied DataVariables from which models should be developed.
  • Modified CreateModelEnsemble, AlignModel, AlignModelExpression and EvaluateModelQuality to handle input-response data with a mixture of numerics and Indeterminate values. Now an Indeterminate in the response or one of the ModelVariables used in the model being evaluated will cause that data record to be eliminated from the assessment. This seems reasonable in that a model should not be punished because it was provided with incomplete data.
  • Introduced a new function, ClipUnitStep, which clips its input to the range from zero to one. Although this could be used as a TemplateTopLevel during SymbolicRegression when the targeted response is naturally a fraction, it might be better applied as a post-modeling constraint. The direct approach inflicts a substantial efficiency penalty because a scale-invariant ModelingObjective isn't applicable and, as a result, the scaling and translation factors must be evolved during the modeling rather than being identified post facto. However, devoting some extra time can likely resolve that concern.
  • Modified EnsemblePredictionPlot and EnsembleResidualPlot to eliminate the error bars associated with missing data points, since they cluttered the view and didn't add any insight. A warning message reporting the extent of the missing-data impact is generated.
  • Modified VariablePresenceTable to show the "Meaning" column only if the supplied DataVariableLabels differ from the DataVariables.
  • Modified VariablePresenceTable and VariablePresenceMap to use an AllModels SelectionStrategy if presented with a ModelEnsemble. Also modified the default option settings to SelectionStrategy -> ParetoFrontSelect and SelectionSize -> 0.5 to focus on the 50% of models closest to the ParetoFront. This differs from the behavior of the VariablePresence function (which uses AllModels); however, for model set exploration, this seems to be a more natural default.
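
The pairwise deletion described for CorrelationMatrixPlot can be illustrated with a small language-neutral sketch. This is not DataModeler code: it is plain Python, the function names are hypothetical, and it assumes that "non-numeric" covers anything that is not a finite number (missing values, strings, NaN).

```python
import math

def is_numeric(value):
    """True for finite ints/floats; False for None, strings, NaN, infinities."""
    return (isinstance(value, (int, float))
            and not isinstance(value, bool)
            and math.isfinite(value))

def pairwise_correlation(xs, ys):
    """Pearson correlation over the pairs in which both entries are numeric.

    Returns (correlation, number_of_dropped_pairs) so that the caller can
    issue a warning about how much data was discarded.
    """
    if len(xs) != len(ys):
        raise ValueError("columns must have the same length")
    pairs = [(x, y) for x, y in zip(xs, ys) if is_numeric(x) and is_numeric(y)]
    dropped = len(xs) - len(pairs)
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y), dropped

# Two of the five pairs contain a non-numeric entry and are ignored.
r, dropped = pairwise_correlation([1, 2, None, 4, 5], [2, 4, 6, "n/a", 10])
```

Returning the dropped count alongside the correlation mirrors the idea of warning the user rather than silently discarding data.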
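
The record elimination described for EvaluateModelQuality and its siblings can be sketched as follows. This is an illustration only, written in plain Python rather than with DataModeler functions: Indeterminate is represented by NaN, mean squared error stands in for whatever error measure is actually used, and all names are hypothetical. The point is that only the response and the variables the model actually uses decide whether a record is kept.

```python
import math

def complete_records(records, response_key, used_variables):
    """Keep records whose response and used inputs are all finite numbers."""
    needed = [response_key, *used_variables]
    return [
        r for r in records
        if all(isinstance(r[k], (int, float)) and math.isfinite(r[k])
               for k in needed)
    ]

def mean_squared_error(model, records, response_key, used_variables):
    """Score a model on complete records only.

    Returns (error, number_of_eliminated_records).
    """
    kept = complete_records(records, response_key, used_variables)
    if not kept:
        raise ValueError("no complete records to evaluate")
    total = sum((model(r) - r[response_key]) ** 2 for r in kept)
    return total / len(kept), len(records) - len(kept)

nan = float("nan")
data = [
    {"x1": 1.0, "x2": nan, "y": 2.0},  # kept: x2 is not used by the model
    {"x1": nan, "x2": 1.0, "y": 3.0},  # eliminated: used input is missing
    {"x1": 2.0, "x2": 0.0, "y": 5.0},  # kept
    {"x1": 3.0, "x2": 1.0, "y": nan},  # eliminated: response is missing
]

def model(r):
    """Hypothetical model that uses only x1."""
    return 2.0 * r["x1"]

error, eliminated = mean_squared_error(model, data, "y", ["x1"])
```

A model that ignores x2 is therefore not penalized for gaps in x2, which matches the reasoning given in the note above.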
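
The post-modeling use of clipping suggested for ClipUnitStep can be sketched in a few lines. This is plain Python, not the DataModeler function: it assumes the clipping is a simple clamp to the interval [0, 1], and the model and its coefficients are made up for illustration.

```python
def clip_unit(x):
    """Clamp x to the interval [0, 1] (assumed behavior of the clipping)."""
    return min(1.0, max(0.0, x))

def raw_model(x):
    """Hypothetical model developed without any range constraint."""
    return 0.4 * x + 0.3

def constrained_model(x):
    """The same model with the unit-interval constraint applied afterwards."""
    return clip_unit(raw_model(x))

predictions = [constrained_model(x) for x in (-1.0, 0.5, 2.0)]
```

Applying the clamp after modeling leaves the model search itself unchanged, so a scale-invariant objective can still be used while the model is being developed.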