If a large system can be decomposed into meaningful subunits, understanding may be gained by examining the relationships between the subunits and the high-level system outputs, as well as the relationships among the subunits themselves. If users conceive of the system as an agglomeration of subsystems, a modeling methodology that explains the interdependencies and effects of the subcomponents may help build understanding of the system.
For instance, if the intermediate monitoring variables are used as the sole predictors of the higher-level system output, the resulting predictions can only be as good as the best prediction based on those intermediate variables. Several studies have identified an increase in US citizens' heights with improved nutrition; using height as the sole surrogate variable to summarize the nutritional variables when predicting health limits the results to the information present in the height variable. The intermediate variables can be used, but they should be augmented with other information to produce an improved estimate.
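The limitation of a sole surrogate can be illustrated with a small simulation. This is only a sketch on synthetic data: the variable names, the noise levels, and the assumption that height reflects just one of two independent nutrition variables are all hypothetical choices made for illustration, not values taken from any study. The two-predictor R² is computed with the standard formula based on pairwise correlations.

```python
import random

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

rng = random.Random(1)
n_obs = 20000

# Synthetic, purely illustrative data: two independent "nutrition" variables.
nutrition_a = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
nutrition_b = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]

# Assumption: the surrogate (height) reflects only nutrition_a, plus noise.
height = [a + rng.gauss(0.0, 0.5) for a in nutrition_a]

# Assumption: the outcome (health) depends on both nutrition variables.
health = [a + b + rng.gauss(0.0, 0.5 ** 0.5)
          for a, b in zip(nutrition_a, nutrition_b)]

r_hy = pearson(height, health)        # surrogate vs. outcome
r_by = pearson(nutrition_b, health)   # extra variable vs. outcome
r_hb = pearson(height, nutrition_b)   # surrogate vs. extra variable

# R^2 of a linear fit on height alone.
rsq_surrogate_only = r_hy ** 2

# R^2 of a linear fit on height and nutrition_b together
# (two-predictor formula in terms of pairwise correlations).
rsq_augmented = (r_hy ** 2 + r_by ** 2 - 2 * r_hy * r_by * r_hb) / (1 - r_hb ** 2)

print(f"height only:          R^2 = {rsq_surrogate_only:.3f}")
print(f"height + nutrition_b: R^2 = {rsq_augmented:.3f}")
```

Under these assumptions the surrogate-only fit cannot recover the part of the outcome driven by the variable that height does not reflect, while the augmented fit can.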
Risk versus Odds: Two ways of characterizing process losses are as odds and as risk. The odds of failure, $\frac{\text{failures}}{\text{successes}}$, are the simple ratio of failures to successes and take values in $[0, +\infty)$ (the logarithm of the odds ranges over $(-\infty, +\infty)$), while risk is the ratio $\frac{\text{failures}}{\text{successes} + \text{failures}}$ and takes values in $[0, 1]$. In a manufacturing process, risk (the failure rate) might be more commonly used, but for modeling purposes odds have better numerical properties.
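The relationship between the two measures can be sketched in a few lines. The function names and the example counts (5 failures, 95 successes) are hypothetical and chosen only for illustration. The log of the odds is included because it is unbounded in both directions, which is one reason odds-based quantities tend to be convenient in modeling.

```python
import math

def counts_to_risk_and_odds(failures, successes):
    """Return (risk, odds) of failure from raw counts."""
    risk = failures / (failures + successes)   # bounded in [0, 1]
    odds = failures / successes                # non-negative, unbounded above
    return risk, odds

def risk_to_odds(risk):
    """Convert a failure risk in [0, 1) to odds of failure."""
    if not 0.0 <= risk < 1.0:
        raise ValueError("risk must be in [0, 1)")
    return risk / (1.0 - risk)

def odds_to_risk(odds):
    """Convert non-negative odds of failure back to a risk."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)

# Hypothetical process: 5 failures out of 100 units.
risk, odds = counts_to_risk_and_odds(failures=5, successes=95)
log_odds = math.log(odds)  # can be any real number

print(f"risk = {risk:.4f}, odds = {odds:.4f}, log-odds = {log_odds:.4f}")
```

For small failure rates the two measures are numerically close, since odds = risk / (1 - risk); they diverge as the risk approaches 1.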
Improvements due to combining submodels: Suppose that the estimate of the defect rate due to some submodel $M_i$ is $\hat{y}_i = f_i(\vec{b}_i, \vec{X}_i)$, where $\vec{X}_i$ is the vector of variables in submodel $i$, $\vec{b}_i$ is a vector of parameters, and $f_i()$ is an estimation model. The estimate $\hat{y}_i$ will be able to predict the system output with some level of accuracy. Suppose further that there are 10 independent subsystems ($i = 1, \dots, 10$), each responsible for 1/20 of the variance observed in the output, with the remaining 1/2 of the variance due to other sources. Each of the 10 models can explain at most 5% of the variation and may seem insignificant on its own, but a combination of the subsystems has the potential to explain 1/2 of the variance in the entire system.
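The variance arithmetic in this example can be checked with a small simulation. This is a sketch under simplifying assumptions that go beyond the text: each subsystem contribution is taken to be an independent Gaussian term that its submodel predicts perfectly, the contributions add linearly, and the sample size and seed are arbitrary.

```python
import random

def r_squared(pred, y):
    """Squared Pearson correlation between a predictor and the output."""
    n = len(y)
    mp, my = sum(pred) / n, sum(y) / n
    cov = sum((p - mp) * (v - my) for p, v in zip(pred, y))
    vp = sum((p - mp) ** 2 for p in pred)
    vy = sum((v - my) ** 2 for v in y)
    return cov * cov / (vp * vy)

rng = random.Random(0)
n_obs, n_sub = 20000, 10

# Each subsystem contributes an independent unit-variance term,
# i.e. 1/20 of the total output variance of 20.
sub = [[rng.gauss(0.0, 1.0) for _ in range(n_obs)] for _ in range(n_sub)]

# All other sources together contribute variance 10, i.e. 1/2 of the total.
other = [rng.gauss(0.0, 10 ** 0.5) for _ in range(n_obs)]

# Combined submodel prediction and the observed system output.
combined = [sum(s[k] for s in sub) for k in range(n_obs)]
y = [combined[k] + other[k] for k in range(n_obs)]

single_r2 = [r_squared(s, y) for s in sub]
combined_r2 = r_squared(combined, y)

print("single submodels:", ", ".join(f"{r:.3f}" for r in single_r2))
print(f"combined:         {combined_r2:.3f}")
```

Each single-submodel value should fall near 0.05 and the combined value near 0.5, up to sampling noise. In practice the submodels would not predict their contributions perfectly, so these figures are upper bounds.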
If the entire system is small enough to be modeled as a single model, that would assuredly produce better prediction results; however, segmentation into subcomponents may aid in the interpretation of the resulting model.
Interdependencies between the subsystems can be reflected in the correlations between the subsystem outputs.