github lttng/lttng-tools v2.16.0-rc1

pre-release · 5 hours ago

LTTng 2.16 "Question de Perspective" Release Candidate

The LTTng team is proud to unveil the first release candidate of LTTng 2.16, codenamed "Question de Perspective"!

This release introduces maps, a brand new way to aggregate live event counters directly within a recording session, along with BLOB fields, trigger persistence across session save/load, and other thrilling refinements.

  • Maps: aggregate live event counters without recording a single event record
  • BLOB fields: record arbitrary binary data tagged with an IANA media type
  • Trigger save/load: serialize and restore triggers alongside recording session configurations
  • Up-front consumer daemons: start the consumer daemons immediately with --spawn-consumers
  • liblttng-ctl C API: new functions covering all of the above

Aggregate live event counters with the new maps feature

Maps are the main feature of LTTng 2.16: they add a new live aggregation path alongside event recording.

Maps let you count event activity live, such as how often each system call fires or which events your application emits the most, without recording a single event record. You get a high-level view of the observed system at a fraction of the resource use of recording every event:

┏━━━━━━━━┯━━━━━━━━┓
┃  Key   │ Value  ┃
┣━━━━━━━━┿━━━━━━━━┫
┃ read   │ 48,201 ┃
┃ write  │ 31,944 ┃
┃ openat │  8,127 ┃
┃ close  │  7,930 ┃
┃ mmap   │  2,615 ┃
┗━━━━━━━━┷━━━━━━━━┛

A map channel configures a set of per-CPU counters: named integer values, keyed by strings, which accumulate over time.

Unlike an event record channel (new terminology for what used to be simply called a "channel"), a map channel doesn't record individual events to ring buffers; instead, it maintains running totals that you can read at any moment.

Map channels and event record channels are independent and may coexist within the same recording session: you don't need a dedicated recording session to start counting events.
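
To picture the per-CPU counters described above, here's a minimal Python sketch. It's a conceptual model only, not how LTTng implements map channels: the `MapChannelModel` class and its methods are hypothetical names made up for this illustration.

```python
from collections import defaultdict


class MapChannelModel:
    """Toy model of a map channel: one string-keyed counter table per CPU."""

    def __init__(self, cpu_count):
        # One independent table per CPU, so that increments which happen on
        # different CPUs never touch the same entry.
        self._per_cpu = [defaultdict(int) for _ in range(cpu_count)]

    def increment(self, cpu_id, key, amount=1):
        """Add `amount` to the counter named `key` on CPU `cpu_id`."""
        self._per_cpu[cpu_id][key] += amount

    def value(self, key, cpu_id=None):
        """Return the running total of `key`: one CPU, or summed over all."""
        if cpu_id is not None:
            return self._per_cpu[cpu_id].get(key, 0)
        return sum(table.get(key, 0) for table in self._per_cpu)


channel = MapChannelModel(cpu_count=4)
channel.increment(0, "read")
channel.increment(3, "read", 2)
channel.increment(3, "write")
print(channel.value("read"))            # total over all CPUs
print(channel.value("read", cpu_id=3))  # CPU 3 only
```

Nothing is appended to a buffer here: each increment only updates a running total, which you can read at any moment.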

A live tally like this is a great way to guide your event recording setup: once you know the busiest events, you can pick which ones to actually record, size the buffers of your event record channels accordingly, and pinpoint the real sources of tracing overhead instead of recording blindly and sorting it out later.

Add a map channel to a recording session with the new lttng add-map-channel command:

lttng add-map-channel --type=user my-counters

Then, populate its counters with the new incr-map-value action of the lttng add-trigger command.

The example below increments a separate counter for each user space tracepoint event name, deriving the key from the event name:

lttng add-trigger --condition=event-rule-matches \
                  --type=user --name='*' \
                  --action=incr-map-value \
                  --session=my-session \
                  --channel=my-counters --type=user \
                  --key='{event_name}:count'

The value of the --key option above is actually a template: LTTng replaces the {event_name} and {provider_name} placeholders with the corresponding names of each matching event, so that a single trigger can feed many counters.
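
As a rough illustration of this key template mechanism, the Python sketch below performs the same kind of substitution. The `expand_key()` function is hypothetical, and the sketch assumes that {event_name} stands for the full event name, including its provider prefix, which is what the sample outputs of this document suggest; the real expansion rules may differ.

```python
def expand_key(template, provider_name, event_name):
    """Replace the two placeholders named in the text with concrete names."""
    return (template
            .replace("{provider_name}", provider_name)
            .replace("{event_name}", event_name))


# Assumption: the event name includes the provider prefix.
print(expand_key("{event_name}:count", "myapp", "myapp:request"))
print(expand_key("{provider_name}:total", "mydb", "mydb:query"))
```

With a template which only uses {provider_name}, every event of a given provider would feed the same counter.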

Read the current counter values with the new lttng show-maps command:

$ lttng show-maps

Recording session `my-session`:
User space map channel `my-counters`:
┏━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┯━━━━━━━━━━━┓
┃         Key         │   Value   │ Overflow? ┃
┣━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━┿━━━━━━━━━━━┫
┃ myapp:request:count │    12,043 │           ┃
┃ myapp:error:count   │        87 │           ┃
┃ mydb:query:count    │ 5,920,377 │           ┃
┗━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┷━━━━━━━━━━━┛

Sort, filter, and slice the output in many ways, for example to show the five highest-valued counters of some process:

$ lttng show-maps --channel=counters --per=owner --pid=2860 \
                  --sort-by=value --sort-order=desc --limit=5

Recording session `my-session`:
User space map channel `counters`, process ID 2860 (`myapp`)
(64-bit int. values):
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┯━━━━━━━━━━━┓
┃            Key            │   Value   │ Overflow? ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━┿━━━━━━━━━━━┫
┃ myapp:request:count       │ 2,481,005 │           ┃
┃ myapp:cache_hit:count     │ 1,938,712 │           ┃
┃ myapp:db_query:count      │   842,330 │           ┃
┃ myapp:cache_miss:count    │   220,118 │           ┃
┃ myapp:error:count         │     1,204 │           ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┷━━━━━━━━━━━┛

Or to break a single counter down for CPUs 3 and 9 of Unix user paul:

$ lttng show-maps --channel=count --per=cpu \
                  --uid=paul --cpu-id=3 --cpu-id=9 \
                  --key=myapp:db_query:count

Recording session `my-session`:
User space map channel `count`, user ID 1002 (`paul`)
(64-bit int. values), CPU 3:
┏━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━┓
┃          Key         │  Value  │ Overflow? ┃
┣━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━━━┫
┃ myapp:db_query:count │ 612,489 │           ┃
┗━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━┛

Recording session `my-session`:
User space map channel `count`, user ID 1002 (`paul`)
(64-bit int. values), CPU 9:
┏━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━┓
┃          Key         │  Value  │ Overflow? ┃
┣━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━━━┫
┃ myapp:db_query:count │ 229,581 │           ┃
┗━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━┛
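
If it helps to reason about the sorting options used above (--sort-by=value, --sort-order=desc, and --limit), the following Python sketch applies the same idea to the counters of the first lttng show-maps example. The `top_counters()` function is a hypothetical helper, not part of LTTng.

```python
def top_counters(counters, limit):
    """Return the `limit` highest-valued (key, value) pairs, largest first."""
    ordered = sorted(counters.items(), key=lambda item: item[1], reverse=True)
    return ordered[:limit]


# Values taken from the first `lttng show-maps` example of this document.
counters = {
    "myapp:request:count": 12_043,
    "myapp:error:count": 87,
    "mydb:query:count": 5_920_377,
}
for key, value in top_counters(counters, limit=2):
    print(f"{key:<22}{value:>12,}")
```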

Power users can go further. When you'd rather query the counters yourself instead of relying on the fixed, human-readable presentation of lttng show-maps, export every counter of every map channel as a self-contained SQLite script with the new lttng export-maps command. Where show-maps filters and aggregates for you, export-maps does neither: it hands you the raw data so that you can slice, filter, and aggregate it afterwards with the full power of SQL.

Pipe the script into sqlite3 and query the convenient vmap view. For example, with a shared resolver library feeding resolver:cache_* counters across every process which resolves names, rank the processes by their DNS cache hit rate, combining two different counters into a ratio that show-maps can't express:

{ lttng export-maps; cat << EOF
  SELECT owner_name AS command,
         ROUND(100.0 *
               SUM(CASE WHEN key = 'resolver:cache_hit'
                        THEN value
                   END) /
               SUM(value), 1) AS hit_rate_pct
  FROM vmap
  WHERE channel_name = 'my-counters'
    AND group_type = 'user-per-process'
    AND key LIKE 'resolver:cache_%'
  GROUP BY owner_id
  ORDER BY hit_rate_pct;
EOF
} | sqlite3 -box :memory:
╭──────────────┬──────────────╮
│   command    │ hit_rate_pct │
╞══════════════╪══════════════╡
│ nginx        │         54.8 │
│ chromium     │         71.2 │
│ postfix      │         83.0 │
│ redis-server │         94.6 │
│ sshd         │         99.1 │
╰──────────────┴──────────────╯

See the lttng export-maps manual page to discover the full SQL schema.
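
To experiment with the hit rate query above without a running session, you could try it against stand-in data with the Python `sqlite3` module. In this sketch, the `vmap` table is a hypothetical substitute for the real view: it only has the columns which the query uses, its column types are guesses, and its rows are made up.

```python
import sqlite3

# Same query as in the shell example above.
HIT_RATE_QUERY = """
    SELECT owner_name AS command,
           ROUND(100.0 *
                 SUM(CASE WHEN key = 'resolver:cache_hit'
                          THEN value
                     END) /
                 SUM(value), 1) AS hit_rate_pct
    FROM vmap
    WHERE channel_name = 'my-counters'
      AND group_type = 'user-per-process'
      AND key LIKE 'resolver:cache_%'
    GROUP BY owner_id
    ORDER BY hit_rate_pct;
"""

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for the real `vmap` view (assumed columns and types).
conn.execute("""
    CREATE TABLE vmap (channel_name TEXT, group_type TEXT, owner_id INTEGER,
                       owner_name TEXT, key TEXT, value INTEGER)
""")
conn.executemany(
    "INSERT INTO vmap VALUES ('my-counters', 'user-per-process', ?, ?, ?, ?)",
    [
        # Made-up counter values for two processes.
        (101, "sshd", "resolver:cache_hit", 991),
        (101, "sshd", "resolver:cache_miss", 9),
        (202, "nginx", "resolver:cache_hit", 548),
        (202, "nginx", "resolver:cache_miss", 452),
    ],
)

hit_rates = conn.execute(HIT_RATE_QUERY).fetchall()
for command, hit_rate_pct in hit_rates:
    print(command, hit_rate_pct)
```

Each process gets its hits divided by its hits plus misses, since the LIKE filter keeps both resolver:cache_* counters in the denominator.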

Other map highlights

  • A map channel may hold 32-bit or 64-bit signed integer counters, cap its number of keys, and follow a per-user or per-process buffer ownership model for user space tracing with, respectively, the --value-type, --max-key-count, and --buffer-ownership options of lttng add-map-channel.

  • With a per-process buffer ownership model, the --dead-process-policy option of lttng add-map-channel controls what happens to the counters of an instrumented process which terminates while the map channel still exists: either discard them or, by default, fold them into the shared counters of the map channel so that you don't lose any contributions.

  • The lttng list and lttng status commands now list map channels and their configuration; filter them with the new --map-channel option if needed.

  • The lttng clear command resets all the counters of a recording session to 0, without removing any allocated keys.
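
The two dead process policies described in the list above can be pictured with a small Python model. This is only a sketch of the described behaviour: the `reap_process()` function and the "fold" and "discard" policy names are hypothetical, and the actual values which --dead-process-policy accepts may be named differently.

```python
from collections import Counter

POLICIES = ("fold", "discard")  # hypothetical names for this sketch


def reap_process(shared, per_process, pid, policy="fold"):
    """Handle the counters of the terminated process `pid`.

    "fold" adds them to the shared counters; "discard" drops them.
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    counters = per_process.pop(pid, Counter())
    if policy == "fold":
        shared.update(counters)  # Counter.update() adds values key by key


shared = Counter()
per_process = {
    2860: Counter({"myapp:request:count": 10, "myapp:error:count": 1}),
    2861: Counter({"myapp:request:count": 5}),
}
reap_process(shared, per_process, 2860)             # folded (default)
reap_process(shared, per_process, 2861, "discard")  # contributions lost
print(dict(shared))
```

Only the contributions of process 2860 survive in the shared counters; those of process 2861 are gone.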


Save and load your triggers

The lttng save and lttng load commands now serialize and restore triggers.

Saving a recording session configuration also writes every trigger which exists at saving time, and loading it recreates those triggers.

This matters for maps: because the new incr-map-value action targets a map channel, saving and loading triggers is the only way to preserve the configuration which populates the counters of a recording session. Without it, the saved configuration would have no record of which triggers feed which map channels.

Skip this behaviour with the new --no-triggers option of both commands:

lttng save my-session
lttng load my-session --no-triggers

lttng load doesn't restore the original owner of a loaded trigger: the Unix user running the command becomes the owner of every loaded trigger, regardless of which Unix user owned it at saving time.
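
The ownership rule above is easy to overlook, so here's a tiny Python model of it. The `load_triggers()` function and the dictionary layout are hypothetical: this isn't the format which lttng save writes.

```python
def load_triggers(saved_triggers, loading_user, no_triggers=False):
    """Return the triggers which a load operation would recreate."""
    if no_triggers:
        # Models the --no-triggers option: no trigger is restored.
        return []
    # The saved owner is ignored: the loading user owns every trigger.
    return [{**trigger, "owner": loading_user} for trigger in saved_triggers]


saved = [
    {"name": "count-events", "owner": "paul"},
    {"name": "count-errors", "owner": "root"},
]
print(load_triggers(saved, loading_user="alice"))
print(load_triggers(saved, loading_user="alice", no_triggers=True))
```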


Record raw binary data with BLOB fields

The new BLOB field instrumentation support of LTTng-UST records an arbitrary sequence of bytes tagged with an IANA media type, for example image/jpeg. Previously, LTTng-UST could only capture binary data as an array of integers, with no way to tag its type.

The new lttng_ust_field_fixed_length_blob() and lttng_ust_field_variable_length_blob() field macros do this:

lttng_ust_field_variable_length_blob(payload, buf, size_t, len,
                                     "image/jpeg")

LTTng records the bytes as is, without interpreting them: think raw network data, memory regions, binary protocol data, or whole files.

CTF 2 supports the BLOB field class as is, including the IANA media type. For CTF 1.8, which has no equivalent field class, LTTng records the bytes as a best-effort array field of 8-bit unsigned integers; this loses the media type as CTF 1.8 has no way to encode it.
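
The difference between the two trace formats can be sketched in Python as follows. This models the information which each format keeps, not the actual CTF encodings: both functions and the dictionary layout are hypothetical.

```python
def blob_as_ctf2(data, media_type):
    """CTF 2 side of the model: the bytes stay tagged with their media type."""
    return {"media-type": media_type, "data": bytes(data)}


def blob_as_ctf1_8(data):
    """CTF 1.8 side of the model: one 8-bit unsigned integer per byte."""
    return list(bytes(data))


jpeg_start = b"\xff\xd8\xff\xe0"  # first four bytes of a typical JFIF file
print(blob_as_ctf2(jpeg_start, "image/jpeg"))
print(blob_as_ctf1_8(jpeg_start))
```

In the second form, nothing tells a trace reader that the integers are JPEG data.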

📖 To learn more, read the lttng-ust(3) manual page.


Start the consumer daemons up front

Pass the new --spawn-consumers option to the lttng-sessiond(8) command to start the consumer daemons immediately instead of having the session daemon spawn them on first need.

This removes the one-time delay of spawning them on first need from your first recording session, which is handy when the earliest events matter or when you're measuring startup latency.


Control the new features from the liblttng-ctl C API

The liblttng-ctl C API gains functions for all of the above:

  • Add and enumerate map channels with lttng_session_add_map_channel() and lttng_session_list_map_channels().

  • Configure and inspect map channels and their keys with the lttng_map_channel_*(), lttng_map_group_*(), and lttng_map_key_*() function families.

  • Build the new "increment map value" trigger action with the lttng_action_increment_map_value_*() function family, along with string key template utilities.

  • Capture and filter on BLOB event fields from triggers with the new LTTNG_EVENT_FIELD_VALUE_TYPE_BLOB event field value type. Get the IANA media type, length, and data of a captured BLOB field with lttng_event_field_value_blob_get_media_type(), lttng_event_field_value_blob_get_length(), and lttng_event_field_value_blob_get_data().

  • lttng_register_consumer() is deprecated and now returns LTTNG_ERR_NOT_SUPPORTED.

📖 To learn more, read the liblttng-ctl C API documentation.


Version Name

This release is named after "Question de Perspective", a stout brewed by Ayawan in Val-Morin that the team adopted for its name. It is a thoroughly standard stout: dark and roasty, faithful to the expectations of the style, and remarkable mostly for how unremarkably it conforms to them. An std::stout, if you will. Whether that counts for or against it is, fittingly, a question of perspective.


Important Links

Resources

Resource                  Link
LTTng website             https://lttng.org/
LTTng 2.16 documentation  https://lttng.org/docs/v2.16
Mailing list              https://lists.lttng.org/
IRC channel               #lttng on irc.oftc.net
Bug tracker               https://bugs.lttng.org/projects/lttng/
GitHub organization       https://github.com/lttng
