Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Error Reporting System

nvlink's diagnostic infrastructure routes every warning, error, and fatal message through a two-function pipeline: diag_emit (sub_467460) is the variadic entry point called from ~200 sites across the binary, and diag_format (sub_467A70) is the heavy formatter that renders severity prefixes, source locations, ANSI color tokens, multi-line alignment, and optional source-code snippets. Both functions dispatch through a per-thread state block obtained via tls_get_state (sub_44F410), a pthread TLS system that allocates a 0x118-byte structure per thread. Fatal-severity messages terminate the process via longjmp back to the pipeline's error-recovery point, or through fatal_exit (sub_44A440) which calls abort() under --trap-into-debugger and exit(1) otherwise. Diagnostic descriptors are statically allocated in a BSS table of 88 confirmed 16-byte entries spanning 0x2A5B530--0x2A5BB70, each encoding a severity level and a printf-style format string pointer. An additional 5 BSS objects in the same region serve as error-status return values or suppression bitmaps rather than diagnostic descriptors.

Key Functions

AddressNameSizeRole
sub_467460diag_emit1,552 BVariadic entry point; dispatches descriptor + va_args to formatter
sub_467A70diag_format13,105 BFull formatter with location, color, source snippets, output routing
sub_44F410tls_get_state432 BPthread TLS; allocates 0x118-byte per-thread diagnostic/sync state
sub_44A030diag_fprintf48 BSimple vfprintf to diagnostic stream (stderr or redirected)
sub_44A010diag_vfprintf48 Bvfprintf variant taking pre-built va_list
sub_44A0E0diag_write_buffered336 BBuffered output with per-line callback dispatch
sub_44A260diag_printf48 BVariadic wrapper around diag_write_buffered
sub_44A440fatal_exit16 BFatal handler: abort() if debugger trap, else exit(1)
sub_44A410fatal_dispatch8 BIndirect call through fatal handler function pointer
sub_44A420exit_dispatch8 BIndirect call through exit handler function pointer
sub_44A3F0install_handlers32 BInstalls custom fatal/exit handler function pointers
sub_44A000set_diag_stream8 BSets qword_2A5F3A0 (diagnostic output FILE*)
sub_458000assertion_failure64 BPrints "Assertion failure at %s, line %d: " and message
sub_457FF0set_assert_location16 BStores file and line for next assertion failure
sub_42FA60set_trap_flag8 BSets byte_2A5F358 = 1 (--trap-into-debugger)
sub_45CAC0fatal_alloc16 BEmits fatal via &unk_2A5BB70 (out-of-memory descriptor)
sub_4684A0has_color_support16 BReturns tls_state[51] (ANSI color enable flag)
sub_44FB20strbuf_create96 BAllocates 40-byte string buffer for message assembly
sub_44FDC0strbuf_finalize160 BFlattens string buffer chain into a single C string
sub_44FE60strbuf_append_str288 BAppends a string to the buffer chain
sub_44FF90strbuf_append_char304 BAppends a single character to the buffer
sub_4500A0strbuf_vsprintf496 Bvsnprintf into string buffer (1024-byte stack fast path)
sub_450280strbuf_sprintf32 BVariadic wrapper around strbuf_vsprintf
sub_4504A0strbuf_length8 BReturns *(qword*)(buf + 8) -- current buffer length
sub_44FD40strbuf_snapshot112 BSnapshots buffer content without consuming it
sub_45B7B0channel_write192 BMulti-mode output: string buffer, FILE*, sprintf, stdout
destr_functiontls_destructor212 BPthread key destructor; tears down per-thread state

Severity Levels

The first DWORD of each diagnostic descriptor (*a1) encodes the severity. diag_emit reads this field and dispatches accordingly. Six levels are defined:

LevelValuePrefix StringANSI TokenOutput ChannelEffect
Note0(none)----Silent; diag_emit returns immediately
Info1"" (empty)@I@qword_2A5F8A0[0] (info)Informational; suppressed by --disable-infos
Info (labeled)2"info "@O@qword_2A5F8A0[0] (info)Same as Info but with explicit prefix
Warning3"warning "@W@qword_2A5F8A0[1] (warning)May be promoted to error by -Werror; suppressed by -w
Error (soft)4"error* "off_1D3B7A8qword_2A5F8A0[1] (warning)Records error flag but does not terminate
Error5"error "off_1D3B7A8qword_2A5F8A0[2] (error)Records error flag; sets "had errors" state
Fatal6"fatal "off_1D3B7A8qword_2A5F8A0[4] (fatal)Terminates via longjmp or fatal_exit

The off_1D3B7A8 pointer references the ANSI color-reset/error-color token (red), shared by error* / error / fatal.

Severity Dispatch Logic

diag_emit(descriptor, ...):
    severity = descriptor->level          // DWORD at offset 0
    if descriptor->suppressed:            // BYTE at offset 4
        return

    if severity == 3 (warning):
        if tls_state[49]:                 // warnings suppressed (-w)
            return
        if tls_state[50]:                 // -Werror active
            severity = 5                  // promote to error
            prefix = "error   "

    if severity == 2 (info):
        if tls_state[48]:                 // infos suppressed (--disable-infos)
            return

    if severity == 0 (note):
        return

Warning-as-Error Promotion

When -Werror is active (tls_state[50] != 0), any warning (severity 3) is promoted to severity 5 (hard error). The prefix changes from "warning " to "error ". This applies in both diag_emit and diag_format. The promotion is per-emission, not per-descriptor -- the descriptor itself remains severity 3 so that the same diagnostic can be a warning in one invocation and an error in another.

Diagnostic Descriptor Table

Each diagnostic site calls diag_emit with a pointer to a static descriptor object as the first argument. These descriptors live in the BSS region at addresses 0x2A5B530 through 0x2A5BB70, spaced at 16-byte intervals. 88 unique diagnostic descriptors have been confirmed across 380+ call sites in the decompiled codebase, referenced from 257 distinct functions.

Descriptor Structure

DiagDescriptor (16 bytes)
=======================================================
Offset  Size  Field         Description
-------------------------------------------------------
  0      4    level         Severity level (0-6, see table above)
  4      1    suppressed    If non-zero, diag_emit returns immediately
  5      3    padding
  8      8    format_str    Pointer to printf-style format string

The level field is read as *a1 (the first DWORD). The suppressed field is checked as *(BYTE*)(a1 + 4). The format string at offset 8 is passed as the format argument to vsnprintf/vfprintf during message assembly.

Complete Descriptor Catalog

98 unique descriptor addresses have been identified across the BSS range 0x2A5B500--0x2A5BFE0. Of these, 88 are true diagnostic descriptors passed to diag_emit/diag_format, 3 are error-status objects used as return values (unk_2A5BFC0, unk_2A5BFD0, unk_2A5BFE0), and the remainder serve as BSS anchors for __cxa_atexit or warning-suppression bitmaps (unk_2A5B500, unk_2A5BB80).

The tables below are organized by subsystem. Severity is inferred from diag_emit dispatch logic: descriptors whose callers continue execution after the call are Error (5) or Warning (3); descriptors whose callers never reach the next instruction (the call site is the function's last reachable point) are Fatal (6). The Sites column counts the number of distinct sub_467460 call sites referencing each descriptor.

Input Processing and File I/O (sub_427A10, sub_42AF40, main)

AddressSevSitesUsage ContextExample Message
unk_2A5B55052Input file not found (LTO module add)"%s" (file path)
unk_2A5B5B031Input not a device ELF and not cudadevrt"%s" (file name)
unk_2A5B70053Required input file cannot be opened"%s" (file path)
unk_2A5B710511Output file cannot be opened for writing (fopen failure)"%s" (file path)
unk_2A5B73032Input file fails magic/format validation"%s" (file path)
unk_2A5B89052Timing CSV file cannot be opened"%s" (file path)

Architecture Validation (sub_426570, sub_42A2D0)

AddressSevSitesUsage ContextExample Message
unk_2A5B5C031Input ELF ISA version mismatch (non-fatal, older ABI)"%s" (input file)
unk_2A5B61051Architecture mismatch between input files"%s" (file), %s (arch)
unk_2A5B62031sm_90 required but input built for sm <= 0x77"%s" (input file)
unk_2A5B63031sm_50 required but input built for sm <= 0x40"%s" (input file)
unk_2A5B64032Input ISA class exceeds max supported by target arch"%s" (file), %d (class), %d (max)
unk_2A5B66053Unsupported architecture for non-cudadevrt input%s (arch), "%s" (file)
unk_2A5B68034Address size mismatch (e_machine != EM_CUDA)"%s" (input file)
unk_2A5B6903132/64-bit address mode mismatch between files"%s" (file), "%s" (-m32/-m64)
unk_2A5B6A031Target arch not supported by input ELF"%s" (file), %s (arch)
unk_2A5B6B031ELF class mismatch (e.g., ELFCLASS32 vs ELFCLASS64)"%s" (file), %d (class), %d (expected)
unk_2A5B6E031Arch requires split-compile but was disabled; forces mode%d (arch)

CLI Option Parsing (sub_427AE0, sub_429BA0, sub_42BC30--sub_42DBC0)

AddressSevSitesUsage ContextExample Message
unk_2A5B53031Arch > 0x48 with 32-bit address mode unsupported%d (arch code)
unk_2A5B54032Conflicting split-compile options (e.g., -emit-ptx + -split-compile-extended)"%s", "%s", "%s" (option names)
unk_2A5B56033Ignored option due to conflict (e.g., -use-host-info with -kernels-used)"%s" (message)
unk_2A5B57031Missing required prerequisite flag for arch"%s" (arch string)
unk_2A5B58031-fdcmpt option no longer supported"%s" (option)
unk_2A5B59031--preserve-relocs option used in non-applicable context"%s" (option)
unk_2A5B5A032Deprecated option used (-fdcmpt, -m32)"%s" (option)
unk_2A5B5D031LTO option mismatch across input files; disabling LTO"%s" (context)
unk_2A5B5E031Conflicting -maxrregcount values across files"%s" (option), "%s" (option), %d (value1), %d (value2)
unk_2A5B5F031Conflicting -lineinfo settings across files"%s" (option), "%s" (option)
unk_2A5B60036Per-file option conflict (-ftz, -prec-div, -prec-sqrt, -fmad, -maxrregcount, -split-compile)"%s" (option name)
unk_2A5B65035Option requires -dlto or -g not present"%s" (option), "%s" (required flag)
unk_2A5B6D031Arch code too low (sm <= 0x59, pre-Volta)%d (arch code)
unk_2A5B6F031Target GPU ISA version too old (below 0x13)%s (arch string)
unk_2A5B72035Option value overridden or conflicting"%s" (option name)
unk_2A5B74051No output file specified and not in dry-run mode"%s" (option context)
unk_2A5B75052Final output file cannot be written or linked"%s" (file path)
unk_2A5B76031Implicit arch selection when explicit arch given"%s" (arch string)
unk_2A5B77031No output path specified"%s" (context)

Option Value Parsing (sub_42BC30--sub_42DBC0)

AddressSevSitesUsage ContextExample Message
unk_2A5B780518Numeric parse failure for option value"32-bit integer", "64-bit integer", "32-bit hex", "64-bit hex", "32-bit unsigned integer", "64-bit unsigned integer"
unk_2A5B79031Response file nesting exceeds limit (>14 levels)"%s" (file path)
unk_2A5B7A031Unrecognized option in linker script"%s" (option name)
unk_2A5B7B051Response file cannot be opened for reading"%s" (file path)
unk_2A5B7C052Duplicate symbol definition in global scope"%s" (symbol name)
unk_2A5B7D056Option value out of allowed range"%s" (option), %d (value)
unk_2A5B7E031Option value not in allowed set"%s" (value), "%s" (option)
unk_2A5B7F031Linker script command argument count mismatch%d (count)
unk_2A5B800510Trailing characters after numeric option value"%s" (raw value string)
unk_2A5B81052Option type mismatch (expected different value kind)"%s" (value)
unk_2A5B82051Library search path validation failure"%s" (path)
unk_2A5B83051Missing sysroot for library resolution"%s" (library name)
unk_2A5B84032Symbol visibility conflict in linker script"%s" (symbol)
unk_2A5B85034Symbol redefined in linker script version map"%s" (symbol)
unk_2A5B86031Symbol binding override (GLOBAL to LOCAL or vice versa)"%s" (symbol)
unk_2A5B87031Unresolved symbol in linker script export list"%s" (symbol)

Internal Errors / Assertions (sub_4275C0, sub_4298C0, sub_42A2D0, main)

AddressSevSitesUsage ContextExample Message
unk_2A5B670617Internal assertion failure; catch-all for "should never happen""Internal error", "cubin not an elf?", "cubin not a device elf?", "fatbin wrong format?", "should never see bc files", "error in LTO callback", "unexpected cpuArch"
unk_2A5B6C031Mercury/finalizer phase returned error for function"%s" (function name)
unk_2A5B9906230Internal invariant violation (by far the most used descriptor)See full table below
unk_2A5BAE062Out-of-memory during TLS state allocation (sub_44F410, sub_44F670)-- (format string is the OOM message)
unk_2A5BB7065Catch-all fatal OOM (fatal_alloc); recognized by pointer comparison-- (bypass formatter, direct stderr)

ELF / Section Merging (sub_432870, sub_4325A0--sub_445000)

AddressSevSitesUsage ContextExample Message
unk_2A5B8A051Architecture state initialization failed"%s" (arch name)
unk_2A5B8B034Duplicate symbol across input files during merge"%s" (symbol name)
unk_2A5B8C032Unknown symbol referenced during merge"%s" (symbol name or "unknown")
unk_2A5B8D031Undefined external symbol encountered in section scan"%s" (symbol name)
unk_2A5B8E031Target arch (sm <= 0x78) does not support feature--
unk_2A5B8F031Relocation target mismatch during merge"%s" (symbol)
unk_2A5B90031ISA_CLASS attribute exceeds maximum (arch-specific limit, e.g., > 0x7F for sm_100a)%d (arch), "%s" ("ISA_CLASS"), "%s" (symbol)
unk_2A5B91031Missing .nv.compat section in input ELF".nv.compat"
unk_2A5B92031Register count exceeds architecture maximum"%s" (symbol), %d (count), %d (max)
unk_2A5B93032Duplicate parameter bank on weak entry symbol"%s" (symbol name)
unk_2A5B94031Conflicting callgraph edge during DCE (dead code elimination)"%s" (callee), "%s" (caller)
unk_2A5B95032Common symbol replaced by global.init with smaller size"%s" (symbol), "%s" (file)
unk_2A5B96031Call target symbol resolution conflict"%s" (target), "%s" (caller), "%s" (callee)
unk_2A5B97031Function has address taken but no direct call found"%s" (function name)
unk_2A5B98031Section type mismatch during merge index lookup"%s" (section name)
unk_2A5B9A032Section reference from unresolved external symbol%d (section index)
unk_2A5B9B031Weak function replacement failed (old data not freed)"%s" (function name)
unk_2A5B9C031Section index out of bounds during relocation"%s" (symbol), "%s" (source section), "%s" (target section)
unk_2A5B9D032Multiple definitions of same symbol (non-weak)"%s" (symbol), "%s" (file1), "%s" (file2)
unk_2A5B9E031Register count for callee exceeds caller's allocation"%s" (callee), %d (callee regs), "%s" (caller), %d (caller regs)
unk_2A5B9F031Symbol size changed between definitions"%s" (symbol), "%s" (file1), "%s" (file2)
unk_2A5BA0032Symbol binding mismatch (e.g., LOCAL vs GLOBAL)"%s" (symbol), "%s" (file1), "%s" (file2)
unk_2A5BA1033Symbol type mismatch between definitions"%s" (symbol), "%s" (file1), "%s" (file2)
unk_2A5BA2032Undefined symbol reference during final resolution"%s" (symbol), "%s" (file)
unk_2A5BA3031Symbol flagged as dead but still referenced"%s" (symbol name)
unk_2A5BA4035Resource limit exceeded (textures, samplers, surfaces)%d (count), "%s" (resource type: "textures", "samplers", "surfaces")
unk_2A5BA5013Verbose link-time info dump (function properties, section details)(formatted multi-line)
unk_2A5BA6031Common symbol max-count equals current architecture limit%d (count)
unk_2A5BA7031Undefined reference during symbol table finalization%d (index)
unk_2A5BA8033Section type conflict during merge (e.g., "local" qualifier)%d (type code), "%s" (qualifier)

NVINFO / Attribute Parsing (sub_42F6C0, sub_42F760, sub_42F850)

AddressSevSitesUsage ContextExample Message
unk_2A5BAA031Unrecognized attribute value in nvinfo section"%s" (attribute name), "%s" (value)
unk_2A5BAB031Unknown relocation attribute in nvinfo section"Relocation", %d (code), "%s" (type name)
unk_2A5BAC033Completely unknown attribute or usage code"unknown attribute", "unknown usage"

External Tool Invocation (sub_42FA70, sub_42FCB0)

AddressSevSitesUsage ContextExample Message
unk_2A5BB0051External tool (ptxas/fatbinary) terminated by signal"%s" (tool), %d (signal), "%s" (signal name)
unk_2A5BB1052Response-file parse error: unexpected character"%s" (context)
unk_2A5BB2052Response-file parse error: missing closing bracket"%s" (context)
unk_2A5BB3056Response-file parse error: malformed token"%s" (context)
unk_2A5BB4051External tool produced core dump (signal had 0x80 flag)"%s" (tool)
unk_2A5BB5051External tool path not found or not executable"%s" (tool path)

Non-Descriptor BSS Objects

AddressSizeRole
unk_2A5B500--BSS module base (used as __dso_handle argument in __cxa_atexit calls)
unk_2A5BB80512 BWarning-suppression bitmap array (one bit per descriptor, used for --Wno-* granular suppression)
unk_2A5BFC016 BError-status object: "empty/null input" returned as *a4 in file-open helpers
unk_2A5BFD016 BError-status object: "file read error" returned as *a4
unk_2A5BFE016 BError-status object: "file stat error" returned as *a4

unk_2A5B990: Internal Assertion Messages

This single descriptor accounts for 230 of the 380+ total diag_emit call sites. It serves as the universal internal-error descriptor, always severity Fatal (6). Every call site passes a literal string describing the violated invariant. The following table catalogs the 40 unique assertion messages observed:

MessageSubsystemSites
"section not found"Section lookup (various)8
"expected to be finalized"ELF finalization checks7
"symbol not found"Symbol table lookup5
"overlapping non-identical data"Section data merge5
"bank SHT not CUDA_CONSTANT_?"Constant bank validation2
"overlapping data spans too much"Section data merge2
"entry data should have offset"Constant bank entry1
"local data should have offset"Local data layout1
"tail data node not found"Section data list1
"callgraph not complete"Call graph construction1
"callgraph not found"DCE / call graph lookup1
"no callgraph node"Call graph node missing1
"reference to deleted section"Post-DCE section reference1
"reference to deleted symbol"Post-DCE symbol reference1
"symbol already assigned"Symbol table construction1
"unallocated symbol"ELF output generation1
"adding global symbols of same name"Symbol merge conflict1
"alias has not been declared"Alias resolution1
"alias must be to function"Alias validation1
"alias to unknown symbol"Alias target lookup1
"couldn't initialize arch state"Architecture init1
"duplicate ids in uft.entry"UFT table construction1
"efh not found"ELF file header missing1
"entry data cannot be GLOBAL"Constant bank binding1
"entry_sym was null"Entry point resolution1
"invalid index"Array bounds check1
"Invalid Path"File path validation1
"malformed uidx input"UFT index parsing1
"missing nv.udt.entry"Missing UDT section1
"missing nv.uft.entry"Missing UFT section1
"missing sec strtab"Section string table1
"missing std sections"Standard ELF sections1
"Negative size encountered"Size validation1
"no regcount?"Register count lookup1
"no such new reg count"Register count update1
"no such original reg count"Register count lookup1
"no symbol for index?"Symbol index dereference1
"not uidx input"UFT type check1
"null entry_sym" / "null esym" / "null root_kernel sym"Null pointer guards3
"nv.uft not found"UFT section lookup1
"overlapped offset < full offset?"Merge offset validation1
"secidx not virtual"Section type check1
"section not mapped"Output section layout1
"should only reach here with no opt"Optimization state check1
"size of uidx window != nv.udt" / "size of uidx window != nv.uft"UFT/UDT size match2
"strsec not found" / "symsec not found"Section lookup2
"UFT stub match not found"UFT stub resolution1
"unexpected bindless type"Bindless texture dispatch1
"verbose before final"Verbose dump ordering1
"writing file"File output state check1

Sentinel Descriptor: unk_2A5BB70

The sentinel descriptor unk_2A5BB70 is special: it is the "catch-all fatal" used by fatal_alloc and is recognized by both diag_emit and diag_format via explicit pointer comparison (a1 == &unk_2A5BB70). When this descriptor is detected, the formatters skip the source-location and output-channel logic and fall through directly to the simple stderr path, ensuring that even an out-of-memory condition during formatting still produces output.

Per-Thread State (TLS)

Allocation

tls_get_state (sub_44F410) manages a 0x118-byte (280-byte) per-thread state block via POSIX pthread TLS. On first call from any thread:

  1. If qword_2A5F820 is NULL (global TLS system not initialized), initializes:

    • Creates a pthread_key with destr_function (0x44F260) as destructor
    • Creates a recursive mutex (PTHREAD_MUTEX_RECURSIVE, type 1)
    • Queries scheduling priority bounds via sched_get_priority_max/min(SCHED_RR)
    • Sets up linked-list head pointers for the thread registry
  2. Allocates 0x118 bytes via malloc (not the arena allocator -- this is one of the few malloc sites in the binary)

  3. Zeroes the entire block, then initializes:

    • pthread_cond_t at offset 128
    • pthread_mutex_t at offset 176
    • sem_t at offset 216
  4. Links the new block into a global doubly-linked list protected by the global mutex (offsets 256 and 264 are the prev/next pointers)

  5. Stores the block via pthread_setspecific

TLS State Layout

TlsState (0x118 = 280 bytes)
=======================================================
Offset  Size  Field              Description
-------------------------------------------------------
  0      1    had_errors         Set to 1 when severity > 2 (error/fatal)
  1      1    had_fatal          Set to 1 when severity > 4 (error/fatal)
  8      8    longjmp_buf_ptr    Pointer to jmp_buf for fatal recovery
 16      8    last_descriptor    Pointer to descriptor that triggered longjmp
 24      8    arena_ptr          Associated memory arena
 32      8    tool_name_ptr      Tool name string for prefixing messages
 40      8    suffix_ptr         Suffix string appended after messages
 48      1    infos_suppressed   Non-zero if --disable-infos active
 49      1    warnings_suppressed Non-zero if -w (--disable-warnings) active
 50      1    werror_active      Non-zero if -Werror (--warning-as-error) active
 51      1    color_enabled      Non-zero if ANSI color output is active
 52      1    single_line_mode   Non-zero to suppress continuation-line alignment
 56      8    line_callback      Function pointer for per-line output callback
 88      8    strbuf_ptr         Scratch string buffer for output assembly
128     48    cond               pthread_cond_t for thread synchronization
176     40    mutex              pthread_mutex_t for thread-local locking
216     32    semaphore          sem_t for thread pool coordination
248      8    sem_ptr            External semaphore pointer (thread pool join)
256      8    prev               Doubly-linked list: previous TLS block
264      8    next               Doubly-linked list: next TLS block
272      1    destroyed          Flag set by destructor to prevent double-free

Thread Safety

The TLS system ensures that diagnostic output from concurrent threads (e.g., during split-compile LTO) does not interleave mid-message. Each thread assembles its complete diagnostic into its own string buffer before writing to the output channel. The global mutex at mutex protects only the thread registry linked list, not the per-thread state itself.

The longjmp_buf_ptr at offset 8 is set by the pipeline's error-recovery code (typically in main around the merge/relocate/finalize phases). When a fatal diagnostic fires, diag_emit calls longjmp through this pointer, unwinding to the recovery point. If the pointer is NULL (no recovery point set), execution falls through to fatal_dispatch which calls the installed fatal handler.

ANSI Color Tokens

When color output is enabled (tls_state[51] != 0), diag_format prepends ANSI color escape sequences via token strings that are expanded by the output channel:

TokenMeaningUsed For
@W@Warning color (typically yellow/bold)Warning-severity messages
@I@Info color (typically default/dim)Info-severity messages (level 1)
@O@Other/note color (typically cyan)Info-severity messages (level 2, labeled)
off_1D3B7A8Error/fatal color (typically red/bold)Error*, error, and fatal messages

Color support is detected by has_color_support (sub_4684A0), which simply reads the TLS flag. The flag is set during initialization based on whether stderr is a terminal (isatty) and whether NO_COLOR or similar environment variables are present.

The strbuf_append_str (sub_44FE60) function appends these tokens into the output buffer. The actual ANSI escape expansion happens in the output channel layer, not in the formatter itself -- the tokens are literal strings like "@W@" that get translated to "\033[1;33m" (or similar) at the final write stage.

Message Formatting Pipeline

diag_emit (sub_467460)

The primary entry point. Every diagnostic call site in nvlink invokes this function:

void diag_emit(DiagDescriptor *desc, ...) {
    va_list args;
    va_start(args, desc);

    // Early exit if this descriptor is suppressed
    if (desc->suppressed)
        return;

    int severity = desc->level;

    // Severity-specific suppression checks
    if (severity == 3) {                     // Warning
        if (tls_state->warnings_suppressed)
            return;
        if (tls_state->werror_active) {
            if (desc != &unk_2A5BB70)        // Not the catch-all fatal
                severity = 5;                // Promote to hard error
        }
    }
    if (severity == 2) {                     // Info
        if (tls_state->infos_suppressed)
            return;
    }
    if (severity == 0)                       // Note (silent)
        return;

    // For the catch-all fatal descriptor (unk_2A5BB70), use simple stderr path
    if (desc == &unk_2A5BB70) {
        prefix = severity_prefix_table[severity];
        // Print directly via diag_fprintf: "<tool> <prefix>: <message>\n"
        ...
        goto check_fatal;
    }

    // Normal path: assemble message in string buffer
    strbuf *buf = strbuf_create();
    strbuf_vsprintf(buf, desc->format_str, args);
    char *message = strbuf_finalize(buf);

    strbuf *out = strbuf_create();

    // Select color token based on severity
    color = select_color(severity, tls_state->color_enabled);

    // Append color reset/prefix
    strbuf_append_str(out, color);

    // Append tool name if present
    if (tls_state->tool_name_ptr)
        strbuf_sprintf(out, "%s ", tls_state->tool_name_ptr);

    // Append severity prefix ("warning ", "error   ", etc.)
    char *prefix_start = strbuf_snapshot(out);
    strbuf_sprintf(out, "%s%s", "", severity_prefix);
    int indent = strbuf_length(out) - strlen(prefix_start);
    strbuf_sprintf(out, ": ");

    // Append message body with multi-line alignment
    for (char *p = message; *p; p++) {
        strbuf_append_char(out, *p);
        if (*p == '\n' && !tls_state->single_line_mode) {
            strbuf_append_str(out, prefix_start);
            for (int i = 0; i < indent; i++)
                strbuf_append_char(out, ' ');
            strbuf_append_str(out, ". ");    // continuation marker
        }
    }

    // Append suffix if present
    if (tls_state->suffix_ptr)
        strbuf_sprintf(out, " %s", tls_state->suffix_ptr);

    strbuf_append_char(out, '\n');
    char *final = strbuf_finalize(out);

    // Route to output channel
    FILE *channel = output_channels[channel_index_for_severity];
    if (channel)
        channel_write(channel, "%s%s", "", final);
    else
        diag_printf("%s%s", "", final);

    // Set error/fatal flags
check_fatal:
    if (severity > 2)
        tls_state->had_errors = 1;
    if (severity > 4)
        tls_state->had_fatal = 1;
    if (severity > 5) {
        // Fatal: attempt longjmp recovery
        if (tls_state->longjmp_buf_ptr) {
            tls_state->last_descriptor = desc;
            longjmp(tls_state->longjmp_buf_ptr, 1);
        }
        fatal_dispatch();  // No recovery point; terminate
    }
}

diag_format (sub_467A70)

The extended formatter adds source-location awareness. It takes an additional SourceLocation * parameter:

void diag_format(DiagDescriptor *desc, SourceLocation *loc, ...) {
    // ... same suppression/promotion logic as diag_emit ...

    // Extract source location metadata
    if (loc && loc->file_info && loc->file_info->diag_context) {
        uint32_t flags = *loc->file_info->diag_context;
        collect_for_index = flags & 1;        // bit 0: collect into diagnostic index
        show_source       = (flags >> 1) & 1; // bit 1: display source snippet
        embed_in_output   = (flags >> 2) & 1; // bit 2: embed in structured output
    }

    // ... same message assembly ...

    // Insert source location before severity prefix
    if (loc && loc->file_info) {
        if (loc->line != 0x0FFFFFFF)  // sentinel for "no line"
            strbuf_sprintf(out, "%s, line %d; ", loc->file_info->filename, loc->line);
    }

    // ... severity prefix, message body, suffix ...

    // Source snippet display (when show_source flag is set)
    if (show_source) {
        // Open source file (cached per file_info)
        if (current_file != loc->file_info) {
            // Close previous file
            if (line_index)
                hash_table_free(line_index);
            fclose(cached_fp);

            // Open new file and build line-offset index
            fp = fopen(loc->file_info->filename, "r");
            cached_fp = fp;
            line_index = hash_table_create(identity_hash, int_equal, 0x400);

            // Scan file, recording byte offset every 10 lines
            uint32_t line_count = 0;
            while ((ch = fgetc(fp)) != EOF) {
                if (ch == '\n') {
                    line_count++;
                    if (line_count % 10 == 0)
                        hash_insert(line_index, line_count / 10, ftell(fp));
                }
                // skip non-newline chars in tight loop
            }
        }

        // Seek to target line using index (nearest 10-line boundary)
        uint32_t target = loc->line - 1;
        long offset = hash_lookup(line_index, target / 10);
        fseek(cached_fp, offset, SEEK_SET);

        // Read and display remaining lines to reach target
        uint32_t remaining = target % 10;
        do {
            if (feof(cached_fp)) {
                snippet = NULL;
            } else {
                strbuf *line_buf = strbuf_create();
                strbuf_append_str(line_buf, "# ");  // source line prefix
                int ch = fgetc(cached_fp);
                while (ch != '\n' && ch != EOF) {
                    strbuf_append_char(line_buf, ch);
                    ch = fgetc(cached_fp);
                }
                strbuf_append_char(line_buf, '\n');
                snippet = strbuf_finalize(line_buf);
            }
        } while (remaining-- > 0);

        // Prepend snippet to output
        if (snippet)
            output = snippet + output;
    }

    // ... output routing, error flags, longjmp ...

    // If collect_for_index flag is set, store diagnostic record
    if (collect_for_index) {
        DiagRecord *rec = arena_alloc(24);
        rec->location = loc;
        rec->descriptor = desc;
        rec->message = message;
        list_append(loc->file_info->diag_list, rec);
    }
}

Source Snippet Display

When diag_format is invoked with a source location that has the "show source" flag set, the formatter opens the referenced source file and displays the offending line prefixed with # . The file is kept open and cached between diagnostic emissions for the same file. A hash table index maps every 10th line number to a byte offset in the file, enabling efficient seeking to arbitrary lines without scanning from the start.

Example output with source snippet:

# __global__ void kernel(int *data) {
input.cu, line 5; error   : undefined reference to 'missing_symbol'

The 0x0FFFFFFF sentinel in the line field means "no line number available" and suppresses the ", line %d; " portion of the location string.

Output Channels

Stream Selection

Diagnostic output is routed through a 5-element channel table at qword_2A5F8A0:

qword_2A5F8A0[0]  --  Info channel (severity 1-2)
qword_2A5F8A0[1]  --  Warning channel (severity 3-4)
qword_2A5F8A0[2]  --  Error channel (severity 5)
qword_2A5F8A0[3]  --  (unused)
qword_2A5F8A0[4]  --  Fatal channel (severity 6)

The channel index is determined by the byte_1D3C728 severity-to-channel mapping table:

Severitybyte_1D3C728[severity]Channel
0 (note)0-- (never reached)
1 (info)0qword_2A5F8A0[0]
2 (info labeled)0qword_2A5F8A0[0]
3 (warning)1qword_2A5F8A0[1]
4 (error*)1qword_2A5F8A0[1]
5 (error)2qword_2A5F8A0[2]
6 (fatal)4qword_2A5F8A0[4]

If the selected channel pointer is NULL, output falls back to diag_printf which writes to qword_2A5F3A0 (the diagnostic stream) or stderr if that pointer is also NULL.

channel_write (sub_45B7B0)

A polymorphic output function that dispatches based on the channel object's type field at offset 0:

TypeBehavior
0, 1Formats into a string buffer, passes to sub_45B6D0 (string accumulator)
2Appends directly to a strbuf object at channel + 32
3vfprintf to a FILE* at channel + 32 (or stdout if NULL)
4vsprintf into buffer at channel + 32, advances pointer

Default Stream

The diagnostic stream qword_2A5F3A0 defaults to NULL, causing output to go to stderr. It can be redirected via set_diag_stream (sub_44A000). Both diag_fprintf and diag_vfprintf check this pointer before each write:

int diag_fprintf(const char *fmt, ...) {
    FILE *f = qword_2A5F3A0;
    if (!f)
        f = stderr;
    return vfprintf(f, fmt, args);
}

Fatal Error Handling

Recovery via longjmp

The linker pipeline sets up a setjmp/longjmp recovery point around error-prone phases. The jmp_buf pointer is stored in tls_state->longjmp_buf_ptr (offset 8). When a fatal diagnostic fires (severity > 5):

  1. diag_emit checks if longjmp_buf_ptr is non-NULL
  2. If set, stores the triggering descriptor in tls_state->last_descriptor (offset 16)
  3. Calls longjmp(longjmp_buf_ptr, 1), unwinding to the recovery point
  4. The recovery code in main can inspect last_descriptor to determine what went wrong
  5. If longjmp_buf_ptr is NULL, falls through to fatal_dispatch

fatal_dispatch (sub_44A410) and fatal_exit (sub_44A440)

fatal_dispatch calls through off_2A5BA98, a function pointer that defaults to fatal_exit. The handler installation function install_handlers (sub_44A3F0) allows replacing both the no-arg fatal handler and the exit-with-code handler:

void install_handlers(void (*fatal_handler)(void), void (*exit_handler)(uint8_t)) {
    if (fatal_handler)
        off_2A5BA98 = fatal_handler;   // fatal_dispatch target
    if (exit_handler)
        off_2A5BA90 = exit_handler;    // exit_dispatch target
}

fatal_exit (sub_44A440) implements the final termination:

_Noreturn void fatal_exit(void) {
    if (byte_2A5F358)      // --trap-into-debugger flag
        abort();           // Generate SIGABRT for debugger attachment
    exit_dispatch(1);      // Call exit handler with code 1
}

The --trap-into-debugger flag (byte_2A5F358) is set by set_trap_flag (sub_42FA60), which is invoked when the -_trap_ / --trap-into-debugger CLI option is present. This causes abort() instead of exit(1), generating a core dump and SIGABRT for post-mortem debugging.

Assertion Failures

The assertion system uses a two-step process:

  1. set_assert_location (sub_457FF0) stores the file path and line number in globals qword_2A5F880 and dword_2A5F878
  2. assertion_failure (sub_458000) formats and prints the failure message:
void assertion_failure(const char *fmt, va_list args) {
    if (has_color_support())
        diag_printf(ERROR_COLOR);     // off_1D3B7A8
    diag_printf("Assertion failure at %s, line %d: ",
                qword_2A5F880, dword_2A5F878);
    diag_write_buffered(fmt, args);
    diag_printf("\n");
}

This is invoked by assertion macros scattered throughout the codebase. The location globals are set immediately before the assertion check so that the file/line information is always fresh.

Exit Codes

CodeMeaningSet By
0SuccessNormal completion in main
1Errorfatal_exit default, or main detecting tls_state->had_errors
2Usage errorOption parsing failures in nvlink_parse_options
-1Internal errorReturned by some internal functions on unexpected failure

CLI Options Affecting Diagnostics

OptionFlagEffect
-w / --disable-warningstls_state[49]Suppresses all severity-3 (warning) messages
-Werror / --warning-as-errortls_state[50]Promotes warnings to hard errors (severity 5)
--disable-infostls_state[48]Suppresses all severity-2 (info) messages
--extra-warnings(separate flag)Enables additional diagnostic descriptors
-_trap_ / --trap-into-debuggerbyte_2A5F358Fatal errors call abort() instead of exit(1)

String Buffer Subsystem

The diagnostic formatter does not use fixed-size char[] buffers. Instead, it assembles messages through a linked-list string buffer (strbuf) that grows dynamically via the arena allocator.

strbuf Layout

strbuf (40 bytes)
=======================================================
Offset  Size  Field        Description
-------------------------------------------------------
  0      8    arena_ptr    Arena for allocations (copied from TLS state)
  8      8    total_len    Total bytes appended so far
 16      8    chain_head   Pointer to first chunk in linked list
 24      8    chain_tail   Pointer to last chunk's next-pointer slot
 32      8    last_chunk   Pointer to current (last) chunk for fast append

Each chunk is a 24-byte header followed by a data buffer:

strbuf_chunk (24 bytes + data)
=======================================================
Offset  Size  Field        Description
-------------------------------------------------------
  0      8    capacity     Total capacity of this chunk's data buffer
  8      8    remaining    Bytes remaining in this chunk
 16      8    data_ptr     Pointer to the data buffer

The fast path in strbuf_vsprintf uses a 1024-byte stack buffer. If the formatted message exceeds 1024 bytes, it allocates from the arena.

strbuf_finalize walks the chunk chain, copies all data into a single contiguous allocation, and frees the chunks. strbuf_snapshot does the same but leaves the buffer intact for further appending.

Global State Summary

AddressTypeNameDescription
qword_2A5F3A0FILE*diag_streamOutput stream (NULL = stderr)
qword_2A5F820TlsState*tls_headHead of per-thread state linked list
qword_2A5F8A0Channel*[5]output_channelsPer-severity output channel pointers
byte_2A5F358uint8_ttrap_into_debuggerSet by --trap-into-debugger
qword_2A5F880char*assert_fileFile path for assertion failure messages
dword_2A5F878uint32_tassert_lineLine number for assertion failure messages
off_2A5BA98fn_ptrfatal_handlerInstalled fatal handler (default: fatal_exit)
off_2A5BA90fn_ptrexit_handlerInstalled exit handler (default: exit)
qword_1D3C740char*[7]prefix_tableSeverity-to-prefix-string lookup table
byte_1D3C728uint8_t[7]channel_mapSeverity-to-channel-index mapping
0x2A5B530..0x2A5BB70DiagDescriptor[88]diag_descriptorsStatic diagnostic descriptor table (16 bytes each)

Cross-References

Internal (nvlink wiki):

  • Memory Arenas -- Arena allocator used by string buffer functions (sub_44FB20, sub_4307C0) for message assembly
  • Environment Variables -- __NVLINK_STDERR_REDIRECT and NVLINK_DEBUG env vars that affect diagnostic output routing
  • CLI Flags -- --disable-warnings, -w, -Werror, --trap-into-debugger flags that modulate diagnostic severity
  • elfLink Error Codes -- elfLink subsystem error codes (0--13) that route through diag_emit for user-visible messages
  • Thread Pool -- Per-thread TLS state (sub_44F410) shared between the thread pool and diagnostic infrastructure
  • Pipeline Entry -- main() error recovery via setjmp/longjmp that fatal_exit and diag_emit severity 6 trigger

Sibling wikis:

  • ptxas: Threading -- ptxas-side threading infrastructure that shares the TLS pattern
  • cicc: Diagnostics -- cicc diagnostic subsystem for comparison with nvlink's error reporting architecture
  • cicc: Knobs -- cicc environment variable controls analogous to nvlink's NVLINK_DEBUG

Confidence Assessment

ClaimConfidenceEvidence
diag_emit at sub_467460 is the variadic entry pointHIGHDecompiled sub_467460 starts with va_start(arg, a1) and checks *(BYTE*)(a1 + 4) for suppression -- matches descriptor layout
Severity levels 0-6 with described prefixesHIGHStrings "error " at 0x1d3c672 and "fatal " at 0x1d3c690 confirmed in strings JSON; both 8 chars with trailing spaces
fatal_exit at sub_44A440 checks byte_2A5F358 then abort/exitHIGHDecompiled exactly: if (byte_2A5F358) abort(); sub_44A420(1u);
Assertion failure format stringHIGHsub_458000 decompiled: "Assertion failure at %s, line %d: " with qword_2A5F880 / dword_2A5F878 -- matches string at 0x1d3b7b0
--trap-into-debugger flag at byte_2A5F358HIGHsub_44A440 reads byte_2A5F358; string "trap-into-debugger" at 0x1d3294f; "Trap into debugger upon assertion failures" at 0x1d33c80
-Werror / -w / --disable-infos CLI optionsHIGHStrings "Werror" at 0x1d3261e, "disable-warnings" at 0x1d325f0, "disable-infos" at 0x1d32654 all confirmed
88 diagnostic descriptors in BSS 0x2A5B530--0x2A5BB70MEDIUMDescriptor count derived from systematic xref analysis of sub_467460 call sites; individual descriptors verified at representative addresses
unk_2A5B990 accounts for 230 call sites (internal assertions)HIGHThis is the most-referenced descriptor; assertion messages like "section not found" (at 0x1f45d28), "symbol not found" (at 0x1d38978), "overlapping non-identical data" (at 0x1d387d8) all confirmed in strings
Sentinel descriptor unk_2A5BB70 used by fatal_allocHIGHsub_45CAC0 decompiled as one-liner: return sub_467460(&unk_2A5BB70, ...)
TLS state block is 0x118 bytes (280) via pthread TLSHIGHsub_44F410 decompiled references pthread_getspecific/pthread_setspecific and allocates via malloc(0x118)
Per-thread state includes longjmp_buf_ptr at offset 8MEDIUMOffset derived from decompiled sub_467460 code paths that read *(QWORD*)(tls + 8) before longjmp call
Source snippet display with line-index hash tableMEDIUMInferred from decompiled sub_467A70 (13,105 B) which contains fopen/fgetc/fseek patterns; exact hash-table indexing logic complex to verify
Output channel table qword_2A5F8A0[5]MEDIUMChannel array referenced in decompiled sub_467460 via severity-indexed access; exact array size inferred from severity count
channel_write at sub_45B7B0 dispatches by type fieldMEDIUMFunction exists (192 B decompiled) with switch-like structure on first field
Warning-to-error promotion when tls_state[50] != 0HIGHDecompiled sub_467460 shows conditional severity override when -Werror flag is set
ANSI color tokens @W@, @I@, @O@MEDIUMToken strings inferred from decompiled string builder operations; not directly found as standalone strings (embedded in format logic)
has_color_support at sub_4684A0 reads TLS flagHIGHFunction exists in decompiled output (16 B), returns a byte from the TLS state