Enabling Automatic Optimizations

This topic lists the most common code optimization options, describes the characteristics shared by IA-32, Intel® 64, and IA-64 architectures, and describes the general behavior for each architecture.

The associated Compiler Options topics describe the architectural differences, and the compiler options that each of these options enables or disables, in more detail; each option discussion below therefore includes a link to the appropriate reference topic.

Each option is listed with its Linux* and Mac OS* X form, its Windows* form, and a description.

-O1 (Linux* and Mac OS* X) or /O1 (Windows*)

Optimizes to favor smaller code size and code locality. In most cases, -O2 (Linux* and Mac OS* X) or /O2 (Windows*) is recommended over this option.

This option disables some optimizations that normally increase code size. It might improve performance for applications with very large code size, many branches, and execution time not dominated by code within loops. In general, this optimization level does the following:

  • Enables global optimization.

  • Disables intrinsic recognition and intrinsics inlining.

IA-64 architecture:

  • Disables software pipelining, loop unrolling, and global code scheduling.

To see which options this option sets or to get detailed information about the architecture- and operating system-specific behaviors, see the following topic:

  • -O1 compiler option
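Loop unrolling, which this level disables on IA-64 architecture, is one example of an optimization that trades code size for speed. The following C sketch shows the idea by hand; it is written in C only to keep it self-contained, the function names are hypothetical, and the code a compiler actually generates may look quite different.

```c
#include <stddef.h>

/* Rolled loop: compact code, one loop test and branch per element. */
double sum_rolled(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Unrolled by four by hand (hypothetical factor): fewer loop tests and
   branches per element, but a larger loop body plus a remainder loop. */
double sum_unrolled4(const double *a, size_t n)
{
    double s = 0.0;
    size_t i = 0;

    /* Main loop handles four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }

    /* Remainder loop handles the last n % 4 elements. */
    for (; i < n; i++)
        s += a[i];

    return s;
}
```

Both functions add the elements in the same order and return the same value; the unrolled version simply occupies more code space, which is the kind of growth a size-oriented level tries to avoid.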

-O2 or -O (Linux* and Mac OS* X) or /O2 (Windows*)

Optimizes for code speed. This is the default: if you do not specify an optimization level, the compiler uses this one. It is the generally recommended optimization level; however, specifying other compiler options can affect the optimization normally gained at this level.

In general, the resulting code size will be larger than the code size generated using -O1 (Linux and Mac OS X) or /O1 (Windows).

This option enables intrinsics inlining and the following capabilities for performance gain: constant propagation, copy propagation, dead-code elimination, global register allocation, global instruction scheduling and control speculation, loop unrolling, optimized code selection, partial redundancy elimination, strength reduction/induction variable simplification, variable renaming, exception handling optimizations, tail recursions, peephole optimizations, structure assignment lowering optimizations, and dead store elimination.

IA-64 architecture:

  • Enables optimizations for speed, including global code scheduling, software pipelining, predication, speculation, and data prefetch.

To see which options this option sets, or to get detailed information about the architecture- and operating system-specific behaviors, see the following topic:

  • -O2 compiler option
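Strength reduction with induction variable simplification, one of the capabilities listed for this level, can be pictured with a hand-transformed loop. The C sketch below is only an illustration under that assumption: the function names are hypothetical, and the compiler's real output is not shown here.

```c
#include <stddef.h>

/* As written: every iteration multiplies the loop index by the stride. */
long sum_strided(const long *a, size_t n, size_t stride)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i * stride];
    return sum;
}

/* Hand-transformed: the multiplication is replaced by a running offset
   that advances by one addition per iteration. */
long sum_strided_reduced(const long *a, size_t n, size_t stride)
{
    long sum = 0;
    size_t offset = 0;   /* always equals i * stride */
    for (size_t i = 0; i < n; i++) {
        sum += a[offset];
        offset += stride;
    }
    return sum;
}
```

The two functions read the same elements in the same order, so they return the same result; only the per-iteration arithmetic changes.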

-O3 (Linux* and Mac OS* X) or /O3 (Windows*)

Enables -O2 (Linux and Mac OS X) or /O2 (Windows) optimizations, and enables more aggressive optimizations such as prefetching, scalar replacement, cache blocking, and loop and memory access transformations.

Enables optimizations that might result in maximum speed, but does not guarantee higher performance unless loop and memory access transformations take place. In some cases, the optimizations enabled by this option can slow down code compared to -O2 (Linux and Mac OS X) or /O2 (Windows) optimizations.

Recommended for applications that have loops that heavily use floating-point calculations and process large data sets.

Like the other code optimization options, this option behaves differently depending on architecture and operating system.

IA-32 architecture:

  • When used with the -ax or -x (Linux) or /Qax or /Qx (Windows) options, this option causes the compiler to perform more aggressive data dependency analysis than for -O2 (Linux) or /O2 (Windows); however, this might result in longer compilation times.

  • On Mac OS* X systems, -xP and -axP are the only valid forms of the -x and -ax options.

IA-64 architecture:

  • Enables optimizations for technical computing applications (loop-intensive code): loop optimizations and data prefetch.

For more information, see the following topic:

  • -O3 compiler option
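Cache blocking, one of the loop and memory access transformations named for this level, restructures a loop nest so that it works on small tiles of data at a time. The following C sketch blocks a matrix transpose by hand; the function names and the tile size are hypothetical, and a compiler would pick its own strategy.

```c
#include <stddef.h>

#define TILE 4  /* hypothetical tile edge; a useful value depends on the cache */

/* As written: dst is written with stride n, so for large n consecutive
   stores land far apart in memory. */
void transpose_naive(const double *src, double *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            dst[j * n + i] = src[i * n + j];
}

/* Hand-blocked: the same assignments, visited one TILE x TILE block at a
   time so that each block of src and dst is reused while it is resident. */
void transpose_blocked(const double *src, double *dst, size_t n)
{
    for (size_t ii = 0; ii < n; ii += TILE) {
        for (size_t jj = 0; jj < n; jj += TILE) {
            /* Clamp the block edges when n is not a multiple of TILE. */
            size_t i_end = (ii + TILE < n) ? ii + TILE : n;
            size_t j_end = (jj + TILE < n) ? jj + TILE : n;

            for (size_t i = ii; i < i_end; i++)
                for (size_t j = jj; j < j_end; j++)
                    dst[j * n + i] = src[i * n + j];
        }
    }
}
```

Both versions perform exactly the same assignments; only the order in which memory is visited differs, which is why any benefit depends on the data set being large relative to the cache.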

-fast (Linux* and Mac OS* X) or /fast (Windows*)

Provides a single, simple option that enables a collection of optimizations that favor run-time performance.

This is a good, general option for increasing performance in many programs.

For IA-32 and Intel® 64 architectures, the -xP (Linux and Mac OS X) or /QxP (Windows) option that is set by this option cannot be overridden by other command line options. If you specify this option along with a different processor-specific option, such as -xN (Linux) or /QxN (Windows), the compiler issues a warning stating that the -xP or /QxP option cannot be overridden. The best strategy for dealing with this restriction is to explicitly specify the options you want to set from the command line.

Caution

Programs compiled with the -xP (Linux and Mac OS X) or /QxP (Windows) option detect incompatible processors and generate an error message during execution.

Although this option is a quick way to enable other options, the specific options it enables might change from one compiler release to the next. Keep this possible change in behavior in mind if you use the option in makefiles.

For more information on restrictions and usage, see the following topic:

  • -fast compiler option

The following syntax examples demonstrate using the default option to compile an application:

Platform              Example
Linux and Mac OS X    ifort -O2 prog.f90
Windows               ifort /O2 prog.f90

Refer to Quick Reference Lists for a complete listing of the quick reference topics.