Overview of AMD CodeAnalyst
Program Performance Tuning
The program performance tuning cycle is an iterative process:
- Measure program performance.
- Analyze the results and identify program hot-spots.
- Identify the cause for any performance issues in
the hot-spots.
- Change the program to remove performance issues.
AMD CodeAnalyst assists all four steps by collecting performance data,
by analyzing and summarizing the performance data, and by presenting it
graphically in many useful forms (tables, charts, etc.). CodeAnalyst directly
associates performance information with software components such as processes,
modules, functions and source lines. CodeAnalyst helps to identify the
cause for a performance issue and where changes need to be made in the
program.
The performance tuning cycle resembles the classic "scientific
method" where a hypothesis (about performance) is made and then the
hypothesis is tested through measurement. Measurement and analysis provide
an objective basis for tuning decisions.
Performance analysis and tuning with CodeAnalyst consists of six steps:
- Prepare the application for analysis by compiling
with generation of debug information turned on (an optional step).
- Select the kind of data to be gathered by choosing
one of several predefined profile configurations.
- Configure run options such as the application program
to be launched, the duration of data collection, etc.
- Start and perform data collection.
- Review and interpret the summarized results produced
by CodeAnalyst.
- Make changes to the program's algorithm and source
code, recompile/link, and analyze again.
Kinds of Analysis
AMD CodeAnalyst is a suite of tools that help improve the performance
of an application program or system. CodeAnalyst provides several different
ways of collecting and analyzing performance data.
- Time-based profiling
(TBP) shows where the application program or system is spending
most of its time. This kind of analysis identifies hot-spots that are
good candidates for tuning and optimization. After the code is changed,
a new time-based profile can verify that the modifications improved
execution speed and measure by how much.
- Event-based profiling
(EBP) uses the performance monitoring hardware in AMD processors
to investigate hot-spots. This kind of analysis identifies potential performance
issues such as poor data access patterns that cause cache misses. An event-based
profile can identify the reason for a performance issue as well as the
code regions that may be performance culprits. Event-based profiling can
test hypotheses about a performance issue to identify and resolve it.
When multiple events are sampled, an event profile shows the proportion
of one event to another. See Performance Monitoring
Events for descriptions of the events supported by AMD processors.
- Instruction-based
sampling (IBS) also uses the performance monitoring hardware.
This kind of analysis identifies the likely cause of certain performance
issues and associates those issues precisely with specific source lines
and instructions.
- Basic Block
Analysis statically analyzes the assembly instructions to identify basic
blocks and aggregates data accordingly.
- In-line Analysis
allows users to aggregate samples into either in-line functions or
in-line instances.
- Session Diff
allows comparison of profiling sessions based on symbols.
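The idea of comparing two sessions by symbol, and of checking how much a change improved execution speed, can be sketched in a few lines of Python. This is a minimal illustration and not CodeAnalyst's own Session Diff implementation. The function names, the symbol names, and the sample counts are all hypothetical, and the sketch assumes both time-based sessions used the same sampling interval and the same workload, so that sample counts are proportional to time.

```python
from typing import Dict, List, Tuple


def diff_sessions(before: Dict[str, int],
                  after: Dict[str, int]) -> List[Tuple[str, int, int, int]]:
    """Return (symbol, before, after, delta) rows, largest absolute change first."""
    symbols = set(before) | set(after)
    rows = [
        (sym, before.get(sym, 0), after.get(sym, 0),
         after.get(sym, 0) - before.get(sym, 0))
        for sym in symbols
    ]
    # Sort by size of change, then by name so the order is deterministic.
    rows.sort(key=lambda row: (-abs(row[3]), row[0]))
    return rows


def speedup(before: Dict[str, int], after: Dict[str, int]) -> float:
    """Ratio of total samples before to total samples after (> 1.0 means faster)."""
    total_after = sum(after.values())
    if total_after == 0:
        raise ValueError("the 'after' session contains no samples")
    return sum(before.values()) / total_after


# Hypothetical per-symbol sample counts from two time-based sessions.
baseline = {"parse_input": 600, "compute": 300, "write_output": 100}
tuned = {"parse_input": 200, "compute": 300, "write_output": 100,
         "build_index": 25}

for symbol, old, new, delta in diff_sessions(baseline, tuned):
    print(f"{symbol:14s} {old:5d} {new:5d} {delta:+6d}")
print(f"estimated speedup: {speedup(baseline, tuned):.2f}x")
```

Symbols that appear in only one session are treated as having zero samples in the other, so newly introduced or removed functions still show up in the comparison.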
Analysis usually begins with time-based profiling in order to find time-critical
and time-consuming software components. Event-based profiling or instruction-based
sampling is usually employed next in order to determine why a section
of code is running more slowly than it should.
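As a rough illustration of the "proportion of one event to another" used when investigating why code is slow, the following Python sketch ranks functions by the ratio of two sampled events. The event names, function names, and counts are hypothetical and are not taken from the list of events supported by AMD processors. The sketch also assumes both events were sampled with the same sampling period, so raw sample counts can be compared directly.

```python
from typing import Dict, List, Optional, Tuple


def event_ratio(counts: Dict[str, int],
                numerator: str, denominator: str) -> Optional[float]:
    """Return numerator/denominator sample counts, or None if no denominator samples."""
    denom = counts.get(denominator, 0)
    if denom == 0:
        return None
    return counts.get(numerator, 0) / denom


def rank_by_ratio(profile: Dict[str, Dict[str, int]],
                  numerator: str, denominator: str) -> List[Tuple[str, float]]:
    """Rank functions by event ratio, highest first, skipping undefined ratios."""
    ranked = []
    for function, counts in profile.items():
        ratio = event_ratio(counts, numerator, denominator)
        if ratio is not None:
            ranked.append((function, ratio))
    ranked.sort(key=lambda item: -item[1])
    return ranked


# Hypothetical event samples per function (names are illustrative only).
samples_by_function = {
    "sum_rows": {"retired_instructions": 4200, "data_cache_misses": 21},
    "sum_columns": {"retired_instructions": 4000, "data_cache_misses": 480},
}

ranked = rank_by_ratio(samples_by_function,
                       "data_cache_misses", "retired_instructions")
for function, ratio in ranked:
    print(f"{function:12s} misses per instruction sample: {ratio:.3f}")
```

A function with a much higher miss ratio than its neighbors is a candidate for the hypothesis that a poor data access pattern is the cause, which can then be tested with a further measurement.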
Flexible, System-Wide Data Collection
CodeAnalyst's data collection is system-wide, so performance data is
collected about all software components that are executing on the system,
not just the application program itself. CodeAnalyst collects data on
application programs, dynamically loaded libraries, device drivers, and
the operating system kernel. CodeAnalyst can be configured to monitor
the system as a whole by not specifying an application
program to be launched when data collection is started. Time-based profiling,
event-based profiling, and instruction-based sampling collect data from
multiple processors in a multiprocessor system. CodeAnalyst can also be
used to analyze Java just-in-time (JIT) code.
Summarized Results with Drill-down
CodeAnalyst summarizes and displays performance information at several
levels of granularity or "aggregation":
- Process
- Module
- Function
- Source line
- Instruction
The CodeAnalyst graphical user interface organizes and displays information
at each of these levels and provides drill-down. Thus, CodeAnalyst provides
an overview of available performance data (by process or by module) followed
by drill-down to functions within a module, to source lines within a function,
or even to the instructions that are associated with a line of source code.
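The aggregation levels and drill-down described above can be illustrated with a small Python sketch. It is only a model of the idea, not a description of how CodeAnalyst stores or processes samples. The `Sample` record, the `aggregate` and `drill_down` helpers, and the sample data are hypothetical, and the instruction level is omitted to keep the example short.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class Sample(NamedTuple):
    """One hypothetical profile sample attributed to a code location."""
    process: str
    module: str
    function: str
    line: int


LEVELS = ("process", "module", "function", "line")


def aggregate(samples: Iterable[Sample], level: str) -> Counter:
    """Count samples grouped by every field down to the given level."""
    depth = LEVELS.index(level) + 1
    return Counter(tuple(sample[:depth]) for sample in samples)


def drill_down(samples: Iterable[Sample], **selected) -> List[Sample]:
    """Keep only the samples whose fields match the selection."""
    return [
        sample for sample in samples
        if all(getattr(sample, key) == value for key, value in selected.items())
    ]


samples = [
    Sample("app", "app.bin", "parse", 10),
    Sample("app", "app.bin", "parse", 10),
    Sample("app", "app.bin", "parse", 12),
    Sample("app", "libm.so", "sqrt", 0),
    Sample("other", "kernel", "schedule", 0),
]

# Overview by module, then drill down into one module at source-line level.
by_module = aggregate(samples, "module")
by_line = aggregate(drill_down(samples, module="app.bin"), "line")
print(by_module.most_common())
print(by_line.most_common())
```

Each level is a coarser or finer grouping of the same samples, which is why the totals at every level add up to the same number.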
Graphical User Interface
The CodeAnalyst graphical user interface (GUI) provides an interactive
workspace for the collection and analysis of program and system performance
data.
Projects and Sessions
The CodeAnalyst GUI uses a project- and session-oriented user interface.
A project retains important settings to control a performance experiment
such as the application program to launch and analyze, settings that control
data collection, etc. A project also organizes performance data into sessions.
A CodeAnalyst session is created when performance data is collected through
the GUI or when profile data is imported into the project. (The OProfile
command line utility is an alternative method for collecting data.) Session
data is persistent and can be recalled at a later time. Sessions can be
renamed and deleted.
Summary of GUI features
The CodeAnalyst GUI offers many features that make it easy to collect,
view and analyze performance data. This subsection summarizes the main
features offered by the CodeAnalyst GUI.
Organize and manage performance data in a CodeAnalyst project where
a project consists of one or more sessions.
Configure and control program execution and data collection, such as:
- Specify data collection parameters (e.g., sampling
interval, trigger events, trigger event count, inclusion of system/user
mode samples, etc.)
- Manually start and stop data collection
- Delay data collection for a specified period of time
to avoid taking measurements during a program's start-up phase
- Stop data collection when the monitored program terminates
or after a specified time (duration) has expired
Define important program properties and project options such as:
- Path to the executable binary image
- Path to program source code
Collect performance data using time-based profiling, event-based profiling,
and instruction-based sampling where available.
Display performance information in different formats such as:
- Table
- Bar graph
- Annotated source and assembler code
Display performance information in different views such as:
- System Data and System Graph views to identify hottest
code modules and CPUs
- Module Data and Module Graph views to drill down
into a single module to identify hot procedures and code within a module
- Source and Assembly Views to see the time and event
data associated with individual statements and instructions
Capture and save code produced by Java just-in-time (JIT) compilation.
Import performance data that was collected using the OProfile command
line utility.
Export data to allow post-processing using a spreadsheet program (or
a user-written custom application).
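As an example of a user-written post-processing application, the Python sketch below reads exported data and computes each function's share of the total samples. The export format shown here is an assumption: a CSV file with `Function` and `Samples` columns. The real column names and layout of a CodeAnalyst export may differ and should be checked against an actual exported file. The data is embedded as a string so that the sketch is self-contained.

```python
import csv
import io
from typing import List, Tuple

# Hypothetical exported data; the column names are assumptions.
EXPORTED = """Function,Samples
parse_input,600
compute,300
write_output,100
"""


def load_rows(text: str) -> List[Tuple[str, int]]:
    """Parse (function, samples) pairs from CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["Function"], int(row["Samples"])) for row in reader]


def percent_share(rows: List[Tuple[str, int]]) -> List[Tuple[str, float]]:
    """Return each function's percentage of all samples, largest first."""
    total = sum(count for _, count in rows)
    if total == 0:
        return []
    ordered = sorted(rows, key=lambda row: -row[1])
    return [(name, 100.0 * count / total) for name, count in ordered]


shares = percent_share(load_rows(EXPORTED))
for name, share in shares:
    print(f"{name:14s} {share:5.1f}%")
```

To process a real export, the embedded string would be replaced by the contents of the exported file, with the column names adjusted to match.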
Basic Steps for Analysis
The CodeAnalyst graphical user interface provides features to set up
a performance experiment, run the experiment while collecting data, and
display the results. The basic steps are:
- Open an existing project or create a new project.
- Set up basic run parameters like the program to launch,
the working directory, etc.
- Select a predefined profile (data collection) configuration.
- Collect a time-based profile, event-based profile,
or IBS-based profile as selected by the profile configuration.
- View and explore the results.
- Save the project and session data to review it later
or to share it.
CodeAnalyst and OProfile
CodeAnalyst leverages a third-party profiling tool called OProfile.
CodeAnalyst uses the OProfile driver, which is part of the Linux kernel, and
a modified OProfile daemon, which includes additional functionality such
as Java profiling. CodeAnalyst provides a graphical user interface that
communicates with the OProfile daemon and the OProfile driver. AMD CodeAnalyst
2.7 requires a Linux kernel that supports OProfile version 0.9.1 or later.