- #1547: Native lock profiling
- #1566: Filter cpu/wall profiles by latency
- #1568: Expose async-profiler metrics in Prometheus format
- #1628: async-profiler.jar as Java agent; remote control via JMX
- #1140: FlameGraph improvements: legend, hot keys, new toolbar icons
- #1530: Timezone switcher between Local and UTC time in Heatmaps
- #1582: Support `--include`/`--exclude` options for JFR to Heatmap/OTLP/pprof conversion
- #1624: Compatibility with OTLP v1.9.0
- #1629: Harden crash protection in StackWalker
- #1277: New `timeSpan` field in WallClockSample events
- #1518: Deprecate `check` command
- #1590: Support compilation on modern JDKs. Drop JDK 7 support
- #1599: Workaround for the kernel PERF_EVENT_IOC_REFRESH bug
- #1596: Do not block any signals during execution of a custom crash handler
- #1584: JfrReader loops on corrupted recordings
- #1555: Parse FlameGraph title from HTML input
- #1621: `loop` and `timeout` options do not work together
- #1641: Unwind vDSO correctly on Linux-ARM64
- #1648: Fix stop sequence in Profiler::start
- #1575: Fix CodeCache memory leak in lock profiling while looping
- #1558: Fix record-cpu bug when kernel stacks are not available
- #1651: Do not record CPU frame for non-perf samples
- #1614, #1615, #1617, #1623: Fix races related to VM termination
- Java Method Tracing and Latency Profiling
- #1421: Latency profiling
- #1435: Allow wildcards in Instrument profiling engine
- #1499: `--trace` option with per-method latency threshold
- System-wide process sampling on Linux
- #1411: `--proc` option to record `profiler.ProcessSample` events
- VMStructs stack walker by default
- #1539: Use VMStructs stack walking mode by default
- #1537: Support `comptask` and `vtable` features
- #1517: Use JavaFrameAnchor to find top Java frame
- #1449: Special handling of prologue and epilogue of compiled methods
- #1475: Add `CPUTimeSample` event support to jfrconv
- #1414: Per-thread flamegraph option in JFR heatmap converter
- #1526: Expose JfrReader dictionary that maps osThreadId to javaThreadId
- #1448: Thread name in OpenTelemetry output
- #1413: Add `time_nanos` and `duration_nanos` to OTLP profiles
- #1450: Unwind dylib stubs as empty frames on macOS
- #1416: Add synthetic symbols for Mach-O stubs/trampolines
- Allow cross-compilation for 32-bit platforms
- #1515: Fix UnsatisfiedLinkError when tmpdir is set to a relative path
- #1500: Detect if `calloc` calls `malloc` for nativemem profiling
- #1427: Re-implement SafeAccess crash protection
- #1417: Two wall-clock profilers interfere with each other
- #1527: GHA: replace macos-13 with macos-15-intel
- #1510: Add option to retry tests
- #1508: Add more GHA jobs to cover JDK versions on ARM
- #1502: Fix job dependencies between integration tests and builds
- #1466: Add Liberica JDK on Alpaquita Linux to the CI
- Made integration tests more stable overall
- Experimental support for the OpenTelemetry profiling signal
- #1188: OTLP output format and `dumpOtlp` Java API
- #1336: JFR to OTLP converter
- JDK 25 support
- #1222: Update VMStructs for JDK 25
- Productize native memory profiling
- #1193: Full `nativemem` support on macOS
- #1254: Fixed Nativemem tests on Alpine
- #1269: Native memory profiling now works with `jemalloc`
- #1323: `nativemem` shows allocations inside async-profiler itself
- #1174: Detect JVM in non-Java application and attach to it
- #1223: Native API to add custom events in JFR recording
- #1259: `--all` option to collect all possible events simultaneously
- #1286: Record which CPU a sample was taken on
- #1299: Skip last 10% allocations for leak detection
- #1300: Allow profiling kprobes/uprobes with `--fdtransfer`
- #1366: Rewrite `jfrconv` executable to shell
- #1400: Unwind checksum and digest intrinsics on ARM64
- #1357, #1389: VMStructs-based stack unwinding for `alloc` and `nativemem` profiling
- #1251: `--ttsp` option does not work on Alpine
- #1264: Guard hook installation with dlopen/dlclose
- #1319: SIGSEGV in PerfEvents::walk
- #1350: Disable JFR OldObjectSample event in jfrsync mode
- #1358: Do not dereference jmethodIDs on JDK 26
- #1374: Correctly check if profiler is preloaded
- #1380: Workaround clang type promotion bug
- #1387: JFR writer crashes when using cstack=vmx
- #1393: Improve stack walking termination logic: no endless `unknown` frames
- Stack unwinding fixes for ARM64
- #1129: Command-line option to filter tests
- #1262: Include `asprof.h` in async-profiler release package
- #1271: Release additional binaries with debug symbols
- #1274: Add Corretto 8 to the test matrix
- #1246, #1226: Run tests on Amazon Linux and Alpine Linux
- #1360: Auto-generated clang-tidy review comments
- #1373: Save all generated test logs for debug purposes
- Fixed flaky tests (#1282, #1307, #1376)
- #895, #905: `jfrconv` binary and numerous converter enhancements
- #944: Interactive Heatmap
- #1064: Native memory leak profiler
- #1002: An option to display instruction addresses
- #1007: Optimize wall clock profiling
- #1073: Productize VMStructs-based stack walker: `--cstack vm/vmx`
- #1169: C API for accessing thread-local profiling context
- #923: Support JDK 23+
- #952: Solve musl and glibc compatibility issues; link `libstdc++` statically
- #955: `--libpath` option to specify path to `libasyncProfiler.so` in a container
- #1018: `--grain` converter option to coarsen flame graphs
- #1046: `--nostop` option to continue profiling outside `--begin`/`--end` window
- #1178: `--inverted` option to flip flame graphs vertically
- #1009: Allows collecting allocation and live object traces at the same time
- #925: An option to accumulate JFR events in memory instead of flushing to a file
- #929: Load symbols from debuginfod cache
- #982: Sample contended locks by overflowing interval bucket
- #993: Filter native frames in allocation profile
- #896: FlameGraph: `Alt+Click` to remove stacks
- #1097: FlameGraph: `N`/`Shift+N` to navigate through search results
- #1182: Retain by-thread grouping when reversing FlameGraph
- #1167: Log when no samples are collected
- #1044: Fall back to `ctimer` for CPU profiling when perf_events are unavailable
- #1068: Count missed samples when estimating total CPU time in `ctimer` mode
- #1142: Use counter-timer register for timestamps on ARM64
- #1123: Support `clock=tsc` without a JVM
- #1070: Demangle Rust v0 symbols
- #1007: Use `ExecutionSample` event for CPU profiling and `WallClockSample` for Wall clock profiling
- #1011: Obtain `can_generate_sampled_object_alloc_events` JVMTI capability only when needed
- #1013: Intercept java.util.concurrent locks more efficiently
- #759: Discover available profiling signal automatically
- #884: Record event timestamps early
- #885: Print error message if JVM fails to load libasyncProfiler
- #892: Resolve tracepoint id in `asprof`
- Suppress dynamic attach warning on JDK 21+
- #1143: Crash on macOS when using thread filter
- #1125: Fixed parsing concurrently loaded libraries
- #1095: jfr print fails when a recording has empty pools
- #1084: Fixed Logging related races
- #1074: Parse both .rela.dyn and .rela.plt sections
- #1003: Support both tracefs and debugfs for kernel tracepoints
- #986: Profiling output respects loglevel
- #981: Avoid JVM crash by deleting JNI refs after `GetMethodDeclaringClass`
- #934: Fix crash on Zing in a native thread
- #843: Fix race between parsing and concurrent unloading of shared libraries
- #1147, #1151: Deadlocks with jemalloc and tcmalloc profilers
- Stack walking fixes for ARM64
- Converter fixes for `jfrsync` profiles
- Fixed parsing non-PIC executables and shared objects with non-standard section layout
- Fixed recursion in `pthread_create` when using native profiling API
- Fixed crashes on Alpine when profiling native apps
- Fixed warnings with `-Xcheck:jni`
- Fixed "Unsupported JVM" on OpenJ9 JDK 21
- Fixed DefineClass crash on OpenJ9
- JfrReader should handle custom events properly
- Handle truncated JFRs
- Restructure and update documentation
- Implement test framework; add new integration tests
- Unit test framework for C++ code
- Run CI on all supported platforms
- Test multiple JDK versions in CI
- Add GHA to validate license headers
- Add Markdown checker and formatter
- Add Issue and Pull Request templates
- Add Contributing Guidelines and Code of Conduct
- Run static analyzer and fix found issues (#1034, #1039, #1049, #1051, #1098)
- Provide Dockerfile for building async-profiler release packages
- Publish nightly builds automatically
- #724: Binary launcher `asprof`
- #751: Profile non-Java processes
- #795: AsyncGetCallTrace replacement
- #719: Classify execution samples into categories in JFR converter
- #855: `ctimer` mode for accurate profiling without perf_events
- #740: Profile CPU + Wall clock together
- #736: Show targets of vtable/itable calls
- #777: Show JIT compilation task
- #644: RISC-V port
- #770: LoongArch64 port
- #733: Make the same `libasyncProfiler` work with both glibc and musl
- #734: Support raw PMU event descriptors
- #759: Configure alternative profiling signal
- #761: Parse dynamic linking structures
- #723: `--clock` option to select JFR timestamp source
- #750: `--jfrsync` may specify a list of JFR events
- #849: Parse concatenated multi-chunk JFRs
- #833: Time-to-safepoint JFR event
- #832: Normalize names of hidden classes / lambdas
- #864: Reduce size of HTML Flame Graph
- #783: Shutdown asprof gracefully on SIGTERM
- Better demangling of C++ and Rust symbols
- DWARF unwinding for ARM64
- `JfrReader` can parse in-memory buffer
- Support custom events in `JfrReader`
- An option to read JFR file by chunks
- Record `GCHeapSummary` events in JFR
- Workaround macOS crashes in SafeFetch
- Fixed attach to OpenJ9 on macOS
- Support `UseCompressedObjectHeaders` aka Lilliput
- Fixed allocation profiling on JDK 20.0.x
- Fixed context-switches profiling
- Prefer ObjectSampler to TLAB hooks for allocation profiling
- Improved accuracy of ObjectSampler in `--total` mode
- Make Flame Graph status line and search results always visible
- `loop` and `timeout` options did not work in some modes
- Restart interrupted poll/epoll_wait syscalls
- Fixed stack unwinding issues on ARM64
- Workaround for stale jmethodIDs
- Calculate ELF base address correctly
- Do not dump redundant threads in a JFR chunk
- `check` action prints result to a file
- Annotate JFR unit types with `@ContentType`
- Java Heap leak profiler
- `meminfo` command to print profiler's memory usage
- Profiler API with embedded agent as a Maven artifact
- `--include`/`--exclude` options in the FlameGraph converter
- `--simple` and `--dot` options in jfr2flame converter
- An option for aggressive recovery of `[unknown_Java]` stack traces
- Do not truncate signatures in collapsed format
- Display inlined frames under a runtime stub
- Profiler did not work with Homebrew JDK
- Fixed allocation profiling on Zing
- Various `jfrsync` fixes
- Symbol parsing fixes
- Attaching to a container on Linux 3.x could fail
- Support virtualized ARM64 macOS
- A switch to generate auxiliary events by async-profiler or FlightRecorder in jfrsync mode
- Could not recreate perf_events after the first failure
- Handle different versions of Zing properly
- Do not call System.loadLibrary when libasyncProfiler is preloaded
- The same .so works with glibc and musl
- dlopen hook did not work on Arch Linux
- Fixed JDK 7 crash
- Fixed CPU profiling on Zing
- Mark interpreted frames with `_[0]` in collapsed output
- Double click selects a method name on a flame graph
- JFR to pprof converter (contributed by @NeQuissimus)
- JFR converter improvements: time range, collapsed output, pattern highlighting
- `%n` pattern in file names; limit number of output files
- `--lib` to customize profiler library path in a container
- `profiler.sh list` command now works without PID
- Fixed crashes related to continuous profiling
- Fixed Alpine/musl compatibility issues
- Fixed incomplete collapsed output due to weird locale settings
- Workaround for JDK-8185348
- Mark top methods as interpreted, compiled (C1/C2), or inlined
- JVM TI based allocation profiling for JDK 11+
- Embedded HTTP management server
- Re-implemented stack recovery for better reliability
- Add `loglevel` argument
- Do not mmap perf page in `--all-user` mode
- Distinguish runnable/sleeping threads in OpenJ9 wall-clock profiler
- `--cpu` converter option to extract CPU profile from the wall-clock output
- Experimental support for OpenJ9 VM
- DWARF stack unwinding
- Better handling of VM threads (fixed missing JIT threads)
- More reliable recovery from `not_walkable` AGCT failures
- Do not accept unknown agent arguments
- Continuous profiling; `loop` and `timeout` options
- Reliability improvements: avoid certain crashes and deadlocks
- Smaller and faster agent library
- Minor `jfr` and `jfrsync` enhancements (see the commit log)
- Prevent early unloading of libasyncProfiler.so
- Read kernel symbols only for perf_events
- Escape backslashes in flame graphs
- Avoid duplicate categories in `jfrsync` mode
- Fixed stack overflow in RedefineClasses
- Fixed deadlock when flushing JFR
- Support OpenJDK C++ Interpreter (aka Zero)
- Allow reading incomplete JFR recordings
- macOS/ARM64 (aka Apple M1) port
- PPC64LE port (contributed by @ghaug)
- Profile low-privileged processes with perf_events (contributed by @Jongy)
- Raw PMU events; kprobes & uprobes
- Dump results in the middle of profiling session
- Chunked JFR; support JFR files larger than 2 GB
- Integrate async-profiler events with JDK Flight Recordings
- Use RDTSC for JFR timestamps when possible
- Show line numbers and bci in Flame Graphs
- jfr2flame can produce Allocation and Lock flame graphs
- Flame Graph title depends on the event and `--total`
- Include profiler logs and native library list in JFR output
- Lock profiling no longer requires JVM symbols
- Better container support
- Native function profiler can count the specified argument
- An option to group threads by scheduling policy
- An option to prepend library name to native symbols
- macOS build is provided as a fat binary that works both on x86-64 and ARM64
- 32-bit binaries are no longer shipped. It is still possible to build them from sources
- Dropped JDK 6 support (may still work though)
- Profile multiple events together (cpu + alloc + lock)
- HTML 5 Flame Graphs: faster rendering, smaller size
- JFR v2 output format, compatible with FlightRecorder API
- JFR to Flame Graph converter
- Automatically turn profiling on/off at `--begin`/`--end` functions
- Time-to-safepoint profiling: `--ttsp`
- Unlimited frame buffer. Removed `-b` option and 64K stack traces limit
- Additional JFR events: OS, CPU, and JVM information; CPU load
- Record bytecode indices / line numbers
- Native stack traces for Java events
- Improved CLI experience
- Better error handling; an option to log warnings/errors to a dedicated stream
- Reduced the amount of unknown stack traces
- Removed non-ASL code. No more CDDL license
- Smaller and faster agent library
- Fixed JDK 7 crash during wall-clock profiling
- libasyncProfiler.dylib symlink on macOS
- Fixed possible deadlock on non-HotSpot JVMs
- Gracefully stop profiler when terminating JVM
- Fixed GetStackTrace problem after RedefineClasses
- AArch64 build is now provided out of the box
- Compatibility with JDK 15 and JDK 16
- More careful native stack walking in wall-clock mode
- `resume` command is not compatible with JFR format
- Wrong allocation sizes on JDK 8u262
- Possibility to specify application name instead of `pid` (contributed by @yuzawa-san)
- Fixed long attach time and slow class loading on JDK 8
- `UnsatisfiedLinkError` during Java method profiling
- Avoid reading `/proc/kallsyms` when `--all-user` is specified
- Converters between different output formats:
- JFR -> nflx (FlameScope)
- Collapsed stacks -> HTML 5 Flame Graph
- `profiler.sh` no longer requires bash (contributed by @cfstras)
- Fixed long attach time and slow class loading on JDK 8
- Fixed deadlocks in wall-clock profiling mode
- Per-thread reverse Flame Graph and Call Tree
- ARM build now works with ARM and THUMB flavors of JDK
- Release package is extracted into a separate folder
- LBR call stack support (available since Haswell)
- `--filter` to profile only specified thread IDs in wall-clock mode
- `--safe-mode` to disable selected stack recovery techniques
- Profile invocations of arbitrary Java methods
- Filter stack traces by the given name pattern
- Java API to filter monitored threads
- `--cstack`/`--no-cstack` option
- Thread names and Java thread IDs in JFR output
- Wall clock profiler distinguishes RUNNABLE vs. SLEEPING threads
- Stable profiling interval in wall clock mode
- C++ function names as events, e.g. `-e VMThread::execute`
- `check` command to test event availability
- Allow shading of AsyncProfiler API
- Enable CPU profiling on WSL
- Enable allocation profiling on Zing
- Reduce the amount of `unknown_Java` samples
- Pause/resume profiling
- Allocation profiling support for JDK 12, 13 (contributed by @rraptorr)
- Include all AsyncGetCallTrace failures in the profile
- Parse symbols of JNI libraries loaded in runtime
- The agent autodetects output format by the file extension
- Output file name patterns: `%p` and `%t`
- `-g` option to print method signatures
- `-j` can increase the maximum Java stack depth
- Allocation sampling rate can be adjusted with `-i`
- Improved reliability on macOS
- `-f` file names are now relative to the current shell directory
- Wall-clock profiler: `-e wall`
- `-e itimer` mode for systems that do not support perf_events
- Native stack traces on macOS
- Support for Zing runtime, except allocation profiling
- `--all-user` option to allow profiling with restricted `perf_event_paranoid` (contributed by @jpbempel)
- `-a` option to annotate method names
- Improved attach to containerized and chroot'ed JVMs
- Native function profiling now accepts non-public symbols
- Better mapping of Java thread names (contributed by @KirillTim)
- Changed default profiling engine on macOS
- Fixed the order of stack frames in JFR format
- Interactive Call tree and Backtrace tree in HTML format (contributed by @rpulle)
- Experimental support for Java Flight Recorder (JFR) compatible output
- Added units: `ms`, `us`, `s` and multipliers: `K`, `M`, `G` for interval argument
- API and command-line option `-v` for profiler version
- Allow profiling containerized JVMs on older kernels
- Default CPU sampling interval reduced to 10 ms
- Changed the text format of flat profile
- Profiling of native functions, e.g. malloc
- JDK 9, 10, 11 support for heap profiling with accurate stack traces
- `root` can now profile Java processes of any user
- `-j` option for limiting Java stack depth
- Produce SVG files out of the box; flamegraph.pl is no longer needed
- Profile ReentrantLock contention
- Java API
- Allocation and Lock profiler now works on JDK 7, too
- Faster dumping of results
- `total` counter of allocation profiler now measures heap pressure (like JMC)
- Linux Perf Events profiling: CPU cycles, cache misses, branch misses, page faults, context switches etc.
- Kernel tracepoints support
- Contended monitor (aka intrinsic lock) profiling
- Individual thread profiles
- Profiler can engage at JVM start and automatically dump results on exit
- `list` command-line option to list supported events
- Automatically find target process ID with `jps` tool
- An option to include counter value in `collapsed` output
- Friendly class names in allocation profile
- Split allocations in new TLAB vs. outside TLAB
- Replaced `-m` modes with `-e` events
- Interval changed from `int` to `long`
- CPU profiler without Safepoint bias
- Lightweight Allocation profiler
- Java, native and kernel stack traces
- FlameGraph compatible output