The goal of scenario 3 was to assess the portability of ACE+TAO
across its run-time options by executing the ACE+TAO regression
tests over all settings of those options, such as when to flush
cached connections or what concurrency strategies the ORB should
support (see the table of run-time options for a summary of
option settings).
We modified the configuration model to reflect six
run-time configuration options. Overall, there were 648 different
combinations of CORBA run-time policies.
Table: Six ACE+TAO Run-Time Options and their Settings.
After making these changes, the compile-time option space had 14 options and 12 constraints, there were 120 test-specific constraints, and there were 6 run-time options with no new constraints. Thus, the configuration space for this scenario grew to 18,792 valid configurations (648 run-time combinations × 29 compile-time configurations). At roughly 30 minutes per test suite, the entire test process involved around 9,400 hours of computer time on the Skoll grid.
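The arithmetic behind these figures can be checked with a short sketch. The numbers come from the text above; the variable names are illustrative only, and the 30-minute figure is the approximate per-suite run time quoted in the paragraph.

```python
# Figures quoted in the text; the names are illustrative, not from Skoll.
RUNTIME_COMBINATIONS = 648     # combinations of the six run-time options
COMPILE_TIME_CONFIGS = 29      # valid compile-time configurations
MINUTES_PER_TEST_SUITE = 30    # "roughly 30 minutes" per test suite run

# Every run-time combination is paired with every compile-time configuration.
total_configs = RUNTIME_COMBINATIONS * COMPILE_TIME_CONFIGS

# One full test suite run per configuration, converted to hours.
total_hours = total_configs * MINUTES_PER_TEST_SUITE / 60

print(total_configs)  # 18792
print(total_hours)    # 9396.0, i.e. around 9,400 hours
```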
Several tests failed in this scenario even though they had not failed in scenario 2, when they were run with default run-time options. These problems were often located in feature-specific code. Interestingly, some tests failed on every single configuration (including the default configuration tested earlier), despite succeeding in scenario 2. These problems were often caused by bugs in option setting and processing code. ACE+TAO developers were intrigued by these findings because in practice they rely heavily on testing by users at installation time (scenario 2) to verify proper installation and provide feedback on system correctness. Our feasibility study raises some questions about the adequacy of that approach.
Another group of tests had particularly interesting failure patterns. Three of these tests failed between 2,500 and 4,400 times (out of 18,792 executions). We discovered that the failures occurred only when ORBCollocation = NO was selected (i.e., no other option influenced these failures). When collocation is enabled, this option allows objects within the same address space to communicate directly, saving (de)marshaling and protocol processing overhead; setting it to NO forces them to communicate over the network. The fact that these tests worked when objects communicated directly, but failed when they talked over the network, suggested a problem related to message passing. In fact, the source of the problem was a bug in the ACE+TAO routines for (de)marshaling object references. Our DCQA process thus not only helped us systematically evaluate the functional correctness PSA across many different run-time configurations, but also let us leverage that information to help pinpoint the causes of specific failures.
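One simple way to look for this kind of pattern is to intersect the option settings of all failing configurations: any setting that survives is present in every failure. The sketch below is only an illustration of that idea, not the analysis actually used in the study. The function name, the options `OptionA` and `OptionB`, and the value names other than `NO` are hypothetical, and the toy test outcome is simulated.

```python
from itertools import product


def settings_common_to_all_failures(results):
    """Return the (option, value) pairs present in every failing configuration.

    `results` is a list of (config, passed) pairs, where `config` maps
    option names to the setting used for that test execution.
    """
    failing = [config for config, passed in results if not passed]
    if not failing:
        return set()
    common = set(failing[0].items())
    for config in failing[1:]:
        common &= set(config.items())
    return common


# Hypothetical toy option space; only ORBCollocation = NO comes from the text.
options = {
    "ORBCollocation": ["global", "per-orb", "NO"],
    "OptionA": ["on", "off"],
    "OptionB": ["x", "y"],
}
configs = [dict(zip(options, values)) for values in product(*options.values())]

# Simulated outcome: the test fails exactly when collocation is disabled.
results = [(config, config["ORBCollocation"] != "NO") for config in configs]

print(settings_common_to_all_failures(results))  # {('ORBCollocation', 'NO')}
```

A setting that appears in every failure is a candidate cause rather than a proof of one. With real data, where failures may be intermittent, one would also compare failure rates with and without that setting.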