next up previous contents
Next: Experiments Up: ProGenitor Previous: Probability Distributions

 

5.3 Program Generation

Explaining how ProGenitor works is probably best done by example. We will describe a sample execution of ProGenitor using the input parameters listed in Table 5.1.

In addition to fixed values, some of the parameters have been specified as following either a uniform or normal distribution, as described earlier in this chapter. Uniform distributions are specified with a ``u'' followed by the minimum and maximum allowed values. Normal distributions are specified with an ``n'' followed by the mean, the standard deviation, and the minimum and maximum allowed values, in that order.

The first thing ProGenitor does is generate a set of user source ``files'' and a set of user header ``files.'' It is important to note that the ``files'' generated are not actual disk files. Rather, they are objects internal to ProGenitor that correspond to source and header files of a program.

  table300
Table 5.1: Example ProGenitor Input 

The number of source and header files generated is dictated by the input parameters. For example, the value given for the parameter Number of Source Files in the User File Parameters section of Table 5.1 is ``u 2 5.'' This means that the number of user source files should be between 2 and 5 (inclusive), and furthermore, if the same inputs were supplied to ProGenitor multiple times, then the distribution of the number of user source files across the multiple runs would be uniform.

So, the first thing ProGenitor will do is pick the number of user source files to generate using standard pseudo-random number generation techniques. Similarly, ProGenitor will pick a value (based on a uniform distribution) between 4 and 6 for the number of user header files.

ProGenitor will then pick values for the number of library source files and header files in the same way, using the parameters in the Library File Parameters section of Table 5.1.

Once the source and header file objects have been generated, ProGenitor will decide how many source files should include each header file and will link the files appropriately. In Table 5.1, the ``Popularity'' of Header Files is given as ``n 2 5 1 5.'' This indicates that the distribution of values chosen for this parameter should follow a truncated normal distribution with a mean of 2, a standard deviation of 5, and bounded by the values 1 and 5. ProGenitor will generate a separate (but not necessarily unique) value following this distribution for each header file. It will then randomly pick a corresponding number of source files to include each header file. However, like in most real programs, User source files are allowed to include library header files, but library source files are not allowed to include user header files.

We now have our source and header files linked with appropriate dependencies. This is enough to generate the first section of the setup file.

Next, ProGenitor creates the ``functions'' contained in the source files. (As with the files, these are not real functions, but rather are internal objects representing functions.) For each source file, ProGenitor picks a value according to the parameter Functions per File and creates a correspondingly sized set of function objects for the file. We now have enough information to generate the second section of the setup file.

ProGenitor then uses the Lines per File parameter to generate the length of each file. The compilation time for each file is then calculated using the specified coefficients for the quadratic equation. This gives us enough information to generate the third, and last, section of the setup file.

Now that the setup file is complete, ProGenitor prepares to create the behavior trace. First, however, the remaining static properties of the functions are generated. ProGenitor assigns each function a number of call sites and a maximum recursion depth based on the corresponding input parameters. The number of call sites is the number of other functions that the given function will call whenever it is executed. The call sites are modeled as an ordered set of objects so that they will be visited in the same order every time the function is executed.

We are now ready to simulate execution. A user function is picked to be the starting point, the clock counter is set to 0, and a function-entry event is output to the behavior file.

ProGenitor then proceeds to simulate execution of the function. First, it uses the Execution Time between Calls parameter to increment the clock counter. Then it visits the first call site of the function.

Since this is the first time this call site has been visited, ProGenitor has a little extra work to do. First of all, it needs to decide if this call site will remain fixed (i.e., call the same function) for the entire execution of the program or if it will vary (modeling a function pointer). ProGenitor decides this based on the Chance of Call Site Being Fixed parameter. A value, d, is generated according to specified distribution. Another value, p, following a uniform distribution and ranging from 0 to 1 is generated. If p < d then the call site will be fixed, else it will be variable.

ProGenitor then needs to decide if this call will be to a library or another user function. This is determined using the Chance of Calling a User Function parameter in a manner identical to that of deciding if the call site was fixed. Once this is decided, ProGenitor will randomly pick a function from the proper set. If the call site is to be fixed, then this function is permanently assigned to the site; otherwise, it is only used for this particular call.

ProGenitor now enters the chosen function and another function-entry event is output to the behavior file. From now on, the first thing ProGenitor does upon entering a function is check to make sure that it has not exceeded the maximum recursion depth assigned to that function. If it has, then it simply returns to the calling function (outputting the appropriate function-exit event to the behavior file) and continues execution from there.

The next thing ProGenitor does upon every function entry is generate a probability for the function just returning in a manner similar to that of deciding if a call site was fixed. If this value is affirmative, then ProGenitor immediately returns to the calling function (again outputting a function-exit event).

If it is decided that this function is to be executed, then ProGenitor proceeds just as with the first function: the clock counter is advanced and the first call site is visited.

ProGenitor continues simulating execution of the program in this manner until one of two things happens. First, if run long enough, ProGenitor will eventually return all the way up the call stack; that is, the starting function will return. If this happens, then execution of the simulated program is said to have terminated normally.

The second possibility is related to a small detail omitted in the above discussion. It is possible to specify a ``maximum execution time'' to ProGenitor. This is not the maximum amount of time that ProGenitor itself will run but rather the maximum amount of time for the simulated program to run. If a maximum execution time is given, then at every step ProGenitor checks the current value of the clock counter against this maximum time. If the simulated program has not terminated normally by the time this value is reached, then the behavior trace is cut off abruptly, emulating the abnormal termination of the program.

In either case, we now have a viable behavior file to use with our continuous compilation simulator.


next up previous contents
Next: Experiments Up: ProGenitor Previous: Probability Distributions