|
This article or section has multiple issues. Please help improve the article or discuss these issues on the talk page.
|
In software engineering, performance analysis, more commonly known today as profiling, is the investigation of a program's behavior using information gathered as the program executes (i.e. it is a form of dynamic program analysis, as opposed to static code analysis). The usual goal of performance analysis is to determine which sections of a program to optimize — usually either to increase its speed or decrease its memory requirement (or sometimes both).
Contents |
A profiler is a performance analysis tool that measures the behavior of a program as it executes, particularly the frequency and duration of function calls. (An instruction set simulator which is also — by necessity — a profiler, can measure the totality of a program's behaviour from invocation to termination.) The output may be a stream of recorded events (a trace) or a statistical summary of the events observed (a profile) or an ongoing interaction with the hypervisor. Profilers use a wide variety of techniques to collect data, including hardware interrupts, code instrumentation, instruction set simulation, operating system hooks, and performance counters. The usage of profilers is 'called out' in the performance engineering process.
As the summation in a profile often is done related to the source code positions where the events happen, the size of measurement data is linear to the code size of the program. In contrast, the size of a (full) trace is linear to the program's execution time, making it somewhat impractical. For sequential programs, a profile is usually enough, but performance problems in parallel programs (waiting for messages or synchronization issues) often depend on the time relationship of events, thus requiring the full trace to get an understanding of the problem.
Performance analysis tools existed on IBM/360 and IBM/370 platforms from the early 1970s, usually based on timer interrupts which recorded the Program status word (PSW) at set timer intervals to detect "hot spots" in executing code. This was an early example of sampling (see below). In early 1974, Instruction Set Simulators permitted full trace and other performance monitoring features.
Profiler-driven program analysis on Unix dates back to at least 1979, when Unix systems included a basic tool "prof" that listed each function and how much of program execution time it used. In 1982, gprof extended the concept to a complete call graph analysis [1]
In 1994, Amitabh Srivastava and Alan Eustace of Digital Equipment Corporation published a paper describing ATOM [2]. ATOM is a platform for converting a program into its own profiler. That is, at compile time, it inserts code into the program to be analyzed. That inserted code outputs analysis data. This technique, modifying a program to analyze itself, is known as "instrumentation".
In 2004, both the Gprof and ATOM papers appeared on the list of the 50 most influential PLDI papers of all time. [3]
Flat profilers compute the average call times, from the calls, and do not breakdown the call times based on the callee or the context.
Call graph profilers show the call times, and frequencies of the functions, and also the call-chains involved based on the callee. However context is not preserved.
The programming languages listed here have event-based profilers:
Some profilers operate by sampling. A sampling profiler probes the target program's program counter at regular intervals using operating system interrupts. Sampling profiles are typically less numerically accurate and specific, but allow the target program to run at near full speed.
The resulting data are not exact, but a statistical approximation. The actual amount of error is usually more than one sampling period. In fact, if a value is n times the sampling period, the expected error in it is the square-root of n sampling periods. [4]
In practice, sampling profilers can often provide a more accurate picture of the target program's execution than other approaches, as they are not as intrusive to the target program, and thus don't have as many side effects (such as on memory caches or instruction decoding pipelines). Also since they don't affect the execution speed as much, they can spot issues that would otherwise be hidden. They are also relatively immune to over-evaluating the cost of small, often called routines or tight loops. They can show the relative amount of time spent in user mode versus interruptible kernel mode such as system call processing.
Still, kernel code to handle the interrupts entails a minor loss of cpu cycles, diverted cache usage, and is unable to distinguish the various tasks occurring in uninterruptible kernel code (microsecond-range activity).
Dedicated hardware can go beyond this: some recent MIPS processors JTAG interface have a PCSAMPLE register, which samples the program counter in a truly undetectable manner.
Some of the most commonly used statistical profilers are AMD CodeAnalyst, Apple Inc. Shark, gprof, Intel VTune and Parallel Amplifier (part of Intel Parallel Studio).
Some profilers instrument the target program with additional instructions to collect the required information.
Instrumenting the program can cause changes in the performance of the program, potentially causing inaccurate results and heisenbugs. Instrumenting will always have some impact on the program execution, typically it slows the execution. However, instrumentation can be very specific and be carefully controlled to have a minimal impact. The impact on a particular program depends on the placement of instrumentation points and the mechanism used to capture the trace. Hardware support for trace capture means that on some targets, instrumentation can be one machine instruction. The impact of instrumentation can often be deducted from the results.
gprof is an example of a profiler that uses both instrumentation and sampling. Instrumentation is used to gather caller information and the actual timing values are obtained by statistical sampling.
stock | retire | vm
Why are we here?
All text is available under the terms of the GNU Free Documentation License
This page is cache of Wikipedia. History