New Criteria for the Choice of Training Sample Size for Model Selection and Prediction: The Cubic Root Rule
Pericchi Guerra, Luis Raúl
The size of the training sample in Objective Bayesian Testing and Model Selection is a central problem in both theory and practice. We concentrate here on simulated training samples and on simple hypotheses. The striking result is that, even in the simplest of situations, the optimal training sample size M can be minimal (for the identification of the sampling model) or maximal (for optimal prediction of future data). We suggest a compromise that seems to work well whatever the purpose of the analysis, the 5% cubic root rule: M = min[0.05 * n, ³√n], where n is the sample size. We then define a comprehensive loss function that combines identification errors and prediction errors, appropriately standardized. We find that the very simple cubic root rule is extremely close to an overall optimum for a wide selection of sample sizes and of cutting points that define the decision rules. The cubic root rule was first proposed in Pericchi (2010). This article proposes to generalize the rule and to take full statistical advantage of it in realistic situations. Another way to look at the rule is as a synthesis of the rationales that justify both AIC and BIC.
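As a minimal illustration of the rule stated above, the following Python sketch evaluates M = min[0.05 * n, ³√n] for a given sample size n. The function name `cubic_root_rule` is hypothetical and is not taken from the article. The abstract does not say how a non-integer M should be rounded, so the sketch returns the real-valued M and leaves any rounding to the user.

```python
def cubic_root_rule(n: float) -> float:
    """Training sample size M = min(0.05 * n, n ** (1/3)).

    Hypothetical helper illustrating the formula in the abstract.
    The value is returned unrounded; how to round M to an integer
    is not specified in the abstract and is left to the caller.
    """
    if n <= 0:
        raise ValueError("sample size n must be positive")
    five_percent = 0.05 * n        # the 5% term
    cube_root = n ** (1.0 / 3.0)   # the cubic root term
    return min(five_percent, cube_root)


if __name__ == "__main__":
    # Print M for a few sample sizes to show which term is the smaller one.
    for n in (20, 50, 100, 1000, 10000):
        print(n, cubic_root_rule(n))
```

The two terms are equal when 0.05 * n = ³√n, that is, at n = 20^(3/2) ≈ 89.4. Below that sample size the 5% term is the smaller one, and above it the cubic root term is, so for large n the training sample grows only like ³√n.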