
Executable image optimization

Note: The information in this chapter applies to Excelsior JET Professional Edition only.



Introduction

The memory paging technique used by many server and desktop operating systems, including Windows, creates a considerable time and memory overhead on application startup, because:

Therefore it is desirable to optimize the executable image of the application so as to minimize the number of pages to be loaded on startup and to ensure that the pages are accessed in an order that is close to sequential. This is done by tracing the application’s execution and reordering the contents of code and data segments in its executable file. Microsoft Office executables are known to be optimized this way, so they start much faster than typical Visual C++ executables of about the same size. Microsoft Platform SDK used to include a tool called Working Set Tuner for optimizing code layout (but not data layout) in C/C++ executables.

Owing to the Java platform architecture and the numerous dependencies between classes, executables created by JET may be quite large, which makes this optimization even more important. Moreover, Java applications carry much more static data than C++ applications (such as reflection and security information), so the data segment needs to be optimized as well.

The bigger the executable file, the more significant the startup overhead described above. For large applications, the executable image optimization facility provided by Excelsior JET results in:

This optimization is adaptive, i.e. it requires you to run your application at least once to gather execution profile data. Any change to the application code may invalidate all or some part of that data and make the optimization less effective, so it is recommended to perform it when your compiled application is ready for deployment.

Executable image optimization may require much more system resources than compilation of your application and may take a considerable amount of time. Read the section System requirements carefully to check whether your development machine meets the requirements.

Note: The images of JET run-time DLLs are pre-optimized for the “largest common denominator”, so you only need to optimize your application executables.



System requirements

Operating system

Executable image optimization may only be performed on an NT-based operating system, i.e. Windows NT 4, Windows 2000, or Windows XP. It cannot be done on Windows 95/98/ME, because those systems lack certain APIs. Some Windows NT Server systems also do not provide those APIs. An optimized application, in turn, may be run on all Windows operating systems, including Windows 95/98/ME, and benefits from the optimization there as well. So this limitation applies to the development machine only.

Disk space

During optimization, the executable is re-linked in a special “sparse” mode that requires a lot of disk space. The amount of disk space required depends on the size of the application, but it is generally 10 to 50 times the size of the executable originally produced by JET. For example, if the size of your application’s executable is 10 megabytes, you may expect its optimization to require at most 500 megabytes of disk space.
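The rule of thumb above can be turned into a quick pre-flight check. The following is only a sketch: the function names are illustrative and not part of Excelsior JET, and the 10x and 50x factors are the approximate bounds quoted above, not exact figures.

```python
def sparse_disk_space_mb(exe_size_mb):
    """Return (low, high) disk-space estimates in MB for the sparse relink.

    Uses the approximate 10x-50x range quoted in the text; the real
    figure depends on the application.
    """
    return exe_size_mb * 10, exe_size_mb * 50


def has_room_for_sparse_link(exe_size_mb, free_mb):
    """True if free_mb covers the pessimistic (50x) estimate."""
    _, high = sparse_disk_space_mb(exe_size_mb)
    return free_mb >= high


low, high = sparse_disk_space_mb(10)
print(f"10 MB executable: plan for {low}-{high} MB of free disk space")
```

Checking against the upper bound is the cautious choice, since running out of disk space in the middle of the relink wastes the time already spent.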

System memory (RAM)

Running the “sparse” executable requires 256 MB of physical memory; 512 MB or more is recommended. More precisely, if you do not have enough physical memory, you can still perform image optimization provided that the amount of virtual memory is sufficient, but it will take much longer. The size of available virtual memory is limited by the size of the page file.



Optimization step by step

First of all, build the executable for your application in exactly the same way as when preparing it for deployment to end user systems (i.e. without any debug routines and the like). The only difference for GUI applications is that it is recommended not to suppress the console window, so that you can see warnings and errors emitted during profiling. To do that, uncheck the Suppress console window checkbox on the Target page, or turn OFF the option GUI in the project file.

During the build process, the compiler creates object files, usually placed in the obj subdirectory of your build directory, and a linker response file with the name of your JET project file and extension .rsp (when compiling without a project file, the name of the linker response file is tmp.rsp). Those files are required to perform executable image optimization.

Executable image optimization is not yet supported in the JET Control Panel, so all optimization steps must be performed from the command line. Below, it is assumed that the current directory is the directory where you have built the executable for your application.



Step 1. Relink the executable in “sparse” mode

The purpose of this step is to re-link the executable in a special “sparse” mode to enable precise profiling of image loading. That “sparse” executable will be very large, so be sure to have enough free disk space (as described in section System requirements). Do not be alarmed by its size: the final, optimized executable will be the same size as the original one.

Issue the following command:

    xlink /Sparse /WriteLinkInfo @<rsp-file>

If there are no errors, two files are generated: a profiling-ready executable and a link information file that describes the structure of that executable. The link information file has the name of the executable and extension .li.

For instance, if the name of the JET project file for your application is MyApp.prj, issue the command:

    xlink /Sparse /WriteLinkInfo @MyApp.rsp

This will create files MyApp.exe (sparse) and MyApp.li.
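If you script this step, the response file name can be derived from the project file name using the naming rule described earlier (project name plus .rsp, or tmp.rsp when there is no project file). The sketch below only builds the argument list and does not run xlink. The helper names are illustrative and not part of Excelsior JET.

```python
from pathlib import PureWindowsPath


def response_file_name(project_file=None):
    """Name of the linker response file for a JET project file.

    When compiling without a project file, the response file is tmp.rsp.
    """
    if project_file is None:
        return "tmp.rsp"
    return PureWindowsPath(project_file).with_suffix(".rsp").name


def sparse_link_command(project_file=None):
    """Argument list for Step 1; pass it to subprocess.run() to execute."""
    return ["xlink", "/Sparse", "/WriteLinkInfo",
            "@" + response_file_name(project_file)]


print(" ".join(sparse_link_command("MyApp.prj")))
```

Keeping the command as a list rather than a single string avoids quoting problems if it is later passed to a process launcher.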



Step 2. Profile your application

In this step, you will create a precise execution trace for your application. Define the property jet.profile.pagefaults and launch the sparse executable created in the previous step. The property may be defined using the JETVMPROP environment variable:
    SET JETVMPROP=-Djet.profile.pagefaults
    MyApp.exe

The execution trace of your application will be logged and later used for optimization, so it is a good idea to start your application in the most “usual” way. For instance, a word processor is typically started with the name of the document to open passed as a command-line argument.

If you cannot guess the most typical way of usage of your application, run it just until the end of the default startup sequence, then exit.
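The launch described in this step can also be prepared from a script. The sketch below builds an environment with JETVMPROP set and leaves the actual launch as a comment. The function name and the sample PATH value are hypothetical. Note that this simple version overwrites any JETVMPROP value that is already set.

```python
import os


def profiling_environment(base_env=None):
    """Copy of the environment with JETVMPROP set for page-fault profiling."""
    env = dict(os.environ if base_env is None else base_env)
    # Replaces any existing JETVMPROP value (a simplification in this sketch).
    env["JETVMPROP"] = "-Djet.profile.pagefaults"
    return env


# The PATH value below is a made-up placeholder.
env = profiling_environment({"PATH": r"C:\JET\bin"})
print(env["JETVMPROP"])
# To launch the sparse executable with it (not run here):
#   subprocess.run(["MyApp.exe"], env=env, check=False)
```

Working on a copy keeps the property out of the parent process, so later non-profiling runs started from the same script are not affected.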

Our tests showed that, regardless of the number of possible execution paths in your application, its startup always proceeds in the same order, perhaps with minor variations. More importantly, the major part of the executable’s image used in further execution is loaded at startup. So you are free to choose how you run the application for profiling. Just keep in mind that the optimized application will run slightly more efficiently in the profiled scenario, without detriment to other execution paths.

Watch for warning and error messages during the profiling of your application. The most important messages are:

    Page faults profile not available:
    Process Status API is not available

This message means that you cannot profile your application on this operating system (see section System requirements for details).

    WARNING: Page faults buffer overflow - not all page faults available

This message warns you that the JET runtime did not manage to retrieve the contents of the system page faults buffer in time. As a result, the gathered execution trace will be incomplete. This is a timing effect of the Windows thread switching mechanism. If you see this message, stop and restart your application. If the problem persists, reboot the system to invalidate the disk cache, so that your application loads more slowly and leaves time for buffer retrieval.

Upon application exit, a file with profiling information will be created in the current directory, with the name of executable and extension .pf, e.g. MyApp.pf.
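When profiling is run unattended, the console output can be captured and scanned for the two messages described in this step. This is a minimal sketch: the function name and the advice strings are illustrative, and only the message fragments are taken from the text above.

```python
# Message fragment -> what the text above says to do about it.
PROFILING_MESSAGES = {
    "Process Status API is not available":
        "profiling is not possible on this operating system",
    "Page faults buffer overflow":
        "trace is incomplete: restart the application, reboot if it persists",
}


def diagnose_profiling_output(console_text):
    """Return advice for each known profiling message found in console_text."""
    return [advice for marker, advice in PROFILING_MESSAGES.items()
            if marker in console_text]


sample = "WARNING: Page faults buffer overflow - not all page faults available"
for advice in diagnose_profiling_output(sample):
    print(advice)
```

Matching on short fragments rather than whole messages keeps the check working even if the console wraps a message across lines.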



Step 3. Generate optimal link information

In this step, you will use the xreorder utility and the files created in the previous steps to produce a new link information file that defines the image structure for optimal execution.

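xreorder takes four arguments: the link information file created in Step 1, the profiling information file created in Step 2, the path to the executable that was profiled, and the name of the link information file to write.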

For example, if the executable’s name is MyApp.exe and it resides in the directory C:\My App, the xreorder command line would be:

    xreorder MyApp.li MyApp.pf "C:\My App\MyApp.exe" MyApp.li

Upon successful completion, xreorder produces an optimized link information file.
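The example path above contains a space, which is easy to get wrong when the command is assembled by a script. The sketch below builds the argument list in the same order as the example and shows how it would be quoted on a Windows command line. The parameter names are an interpretation of that example rather than official xreorder terminology.

```python
import subprocess


def xreorder_command(link_info, profile, executable, output_link_info):
    """Argument list for Step 3, in the order used by the example above."""
    return ["xreorder", link_info, profile, executable, output_link_info]


cmd = xreorder_command("MyApp.li", "MyApp.pf", r"C:\My App\MyApp.exe", "MyApp.li")
# list2cmdline shows how the list is quoted on a Windows command line.
print(subprocess.list2cmdline(cmd))
```

Passing the list itself to subprocess.run() would leave the quoting to Python, so the quoted string is needed only for display or logging.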



Step 4. Create the optimized executable

To create the optimized executable, run xlink again with the following command line:

    xlink /ReadLinkInfo=<optimized-li-file> @<rsp-file>

Note: Do not recompile your application! The object files and the linker response file must be the same as on Step 1.
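One way to guard against an accidental rebuild between Step 1 and Step 4 is to fingerprint the linker inputs before the sparse link and compare the result before the final link. This is not an Excelsior JET feature. It is a sketch that assumes the object files live in a single directory, such as the obj subdirectory mentioned earlier, and the function names are illustrative.

```python
import hashlib
from pathlib import Path


def build_fingerprint(rsp_file, obj_dir):
    """Hash the response file and every file under obj_dir, in sorted order."""
    digest = hashlib.sha256()
    files = [Path(rsp_file)] + sorted(
        p for p in Path(obj_dir).rglob("*") if p.is_file())
    for path in files:
        digest.update(path.name.encode("utf-8"))
        digest.update(path.read_bytes())
    return digest.hexdigest()


def inputs_unchanged(saved_fingerprint, rsp_file, obj_dir):
    """True if the linker inputs still match the fingerprint taken at Step 1."""
    return build_fingerprint(rsp_file, obj_dir) == saved_fingerprint
```

A typical use would be to call build_fingerprint just before Step 1, store the result, and refuse to run Step 4 if inputs_unchanged returns False.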



Advanced techniques



Optimization of multi-component applications

For applications consisting of multiple components (EXE and DLLs), optimization of each component is required. Only one component may be optimized at a time.

EXE optimization was examined in detail in section Optimization step by step. Optimization of a DLL (dynamic link library) is very similar. First, create the DLL in “sparse” mode as described in Step 1. Relink the executable in “sparse” mode. Then run your application in profiling mode as described in Step 2. Profile your application. It is not necessary to link the EXE in “sparse” mode, only the DLL you are optimizing. To create the optimized link information file, specify the name of the DLL as the third argument of the xreorder utility (Step 3. Generate optimal link information). Finally, re-link the DLL with the optimized link information (Step 4. Create the optimized executable).

DLLs are often shared among several applications. If that is the case, consult the section Mixing results of multiple profiling runs.



Mixing results of multiple profiling runs

In certain situations you may want to apply results of several profiling runs to the resulting executable. For instance, if you have a component (DLL) which is shared among several applications, you may wish to optimize it for all its usages. Another case would be an application with several considerably different typical execution scenarios.

Suppose you have a DLL shared between two applications, so you have performed Step 2 (Step 2. Profile your application) for it twice and now have two different profiling information files for that DLL. Next, decide which of the applications is less important, i.e. is used less often, has less strict requirements for startup time and memory consumption, etc. The idea is to make the use of the DLL efficient in the less important application to the extent that this does not compromise its efficiency in the more important application.

Once you have chosen the less important profile, reorder the link information file using the xreorder utility with that profile. Then reorder the resulting intermediate link information file using the more important application’s profiling information. The final link information file for the DLL will be fully optimized for the more important application and somewhat optimized for the less important application. Finally, re-link the DLL using that file as described in Step 4. Create the optimized executable.

The case of more than two profiles is similar (gather all profiles and then apply them one by one from the least important to the most important).
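The ordering rule can be expressed as a short script that sorts the profiles and emits one xreorder command per profile. This is only a sketch: the numeric importance ranks, the file names and the function name are hypothetical, and each command rewrites the same link information file in place, as the Step 3 example does.

```python
def chained_xreorder_commands(link_info, executable, profiles):
    """Build one xreorder command per profile, least important first.

    profiles is a list of (importance, profile_file) pairs; a higher
    importance value means the profile is applied later. Each command
    rewrites link_info in place, so the next one starts from its result.
    """
    ordered = sorted(profiles, key=lambda item: item[0])
    return [["xreorder", link_info, profile, executable, link_info]
            for _, profile in ordered]


# File names below are made-up placeholders.
commands = chained_xreorder_commands(
    "Shared.li",
    r"C:\My App\Shared.dll",
    [(2, "EditorRun.pf"), (1, "ViewerRun.pf")],
)
for command in commands:
    print(command[2])
```

Because each profiling run writes a file named after the executable, the profiling information files would need to be renamed or moved between runs so that a later run does not overwrite an earlier one.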



Reuse of optimized link information file

Situations in which you discover that you have to make a small fix right after you have built and optimized the final version of your application are rare, but still possible. You may decide not to repeat the full cycle of image optimization, and instead just link using the previously optimized link information file. If the changes you have introduced are local and do not considerably affect the way your application executes, there is a good chance that this will succeed. On the one hand, linking with an outdated link information file may fail to improve startup performance, or a link error may occur. On the other hand, your application may execute optimally even after such re-linking.

In general, it is recommended to link using an outdated link information file only in extraordinary cases, such as a last-minute fix.