R_MERCURY Relocations

nvlink defines 65 Mercury-specific ELF relocation types for the capsule Mercury (capmerc) binary format, used on sm100+ architectures (Blackwell and later). These types are stored in .rela.* sections of capmerc ELF files and are consumed by the same relocation application engine that handles R_CUDA types, but through a separate descriptor table. Each R_MERCURY type in the ELF is encoded as the table index plus 0x10000 -- the relocation engine subtracts this offset at dispatch time to index into the Mercury-specific descriptor table.

Mercury relocations are structurally simpler than their R_CUDA counterparts. Where R_CUDA has 119 types covering six generations of SASS instruction encoding (each with different bit-field positions for the same logical operation), R_MERCURY has 65 types that target a single 128-bit instruction format. The R_MERCURY set eliminates the per-bit-position variants (R_CUDA_ABS32_20, R_CUDA_ABS32_23, R_CUDA_ABS32_26, R_CUDA_ABS32_32) in favor of position-independent types (R_MERCURY_ABS32) -- the bit-field position is encoded in the 64-byte descriptor rather than in the type name.

Key Facts

| Property | Value |
|---|---|
| Machine type | EM_CUDA (190) with ELF class byte 0x41 ('A') |
| Total unique type names | 65 (indices 0--64) |
| ELF type encoding | Table index + 0x10000 |
| Mercury name table | off_1D371E0 (65 entries, pointers at string addresses 0x1D35A17--0x1D35F4C) |
| Mercury descriptor table | off_1D3CBE0 (65 entries, 64 bytes each = 4,160 bytes) |
| CUDA descriptor table | off_1D3DBE0 (117 entries, 64 bytes each = 7,488 bytes) |
| Descriptor size | 64 bytes per type (12-byte header + 3 actions x 16 bytes + 4-byte sentinel) |
| Application engine | sub_468760 at 0x468760 (14,322 bytes, shared with R_CUDA) |
| Relocation phase entry | sub_469D60 at 0x469D60 (performs table selection and symbol resolution) |
| Relocation record builder | sub_469B50 at 0x469B50 (creates reloc linked-list entries) |
| Validation function | sub_42F6C0 at 0x42F6C0 (shared with R_CUDA) |
| Attribute validation | sub_42F760 at 0x42F760 (three-way dispatch: warn/error/ignore) |
| Bitfield extract helper | sub_468670 at 0x468670 (multi-word recursive extraction) |
| Bitfield write helper | sub_4685B0 at 0x4685B0 (multi-word bit splicing) |
| Table limit check | Index < 0x41 (65) for Mercury; < 0x75 (117) for CUDA |
| Third name table copy | 0x2459160 (capmerc uplift / self-check path) |
| Architecture scope | sm100 (Blackwell), sm103 (Blackwell Ultra), sm110, sm120, sm121 |

ELF Type Encoding and Dispatch

When the linker loads a relocation entry from a capmerc ELF file, the r_info type field contains the Mercury type index plus 0x10000. The dispatch logic in the relocation phase function (sub_469D60) detects this offset and routes to the Mercury descriptor table. The code from the decompiled sub_469D60 at lines 188--215 shows the exact branching:

// sub_469D60 at 0x469D60 -- relocation phase dispatcher
// v2 = linker context, v5 = rela entry (16-byte __m128i), v9 = reloc type

v7 = 1;
v8 = rela_entry->type_and_sym;          // r_info from Elf64_Rela
if (*(byte*)(context + 7) != 0x41)       // check ELF class byte at offset +7
    v7 = 0x80000000;                     // non-Mercury flag
v9 = rela_entry->type_and_sym;

if ((v7 & *(uint32*)(context + 48)) != 0) {  // Mercury path
    if ((uint32_t)v8 != 0) {
        if ((uint32_t)v8 <= 0x10000) {
            fatal("unexpected reloc");        // Mercury types must be > 0x10000
        }
        descriptor_table = &off_1D3CBE0;      // Mercury descriptor table
        type_index = v9 - 0x10000;            // subtract offset to get raw index
    } else {
        descriptor_table = &off_1D3CBE0;
        type_index = 0;                       // R_MERCURY_NONE
    }
} else {                                      // CUDA path
    type_index = rela_entry->type;
    descriptor_table = &off_1D3DBE0;          // CUDA descriptor table
}

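The 0x10000 namespace separation ensures that R_MERCURY and R_CUDA type numbers never collide. Nominally, a raw ELF type value of 0x10000 corresponds to Mercury index 0 (R_MERCURY_NONE), 0x10001 to index 1 (R_MERCURY_G64), and so on through 0x10040 for index 64 (R_MERCURY_NONE_LAST). The dispatcher as decompiled above is stricter at the low end: it accepts a raw type of 0 as R_MERCURY_NONE and rejects any nonzero value <= 0x10000 with "unexpected reloc", so 0x10001 is the smallest offset-encoded value it lets through.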
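The index-to-type mapping can be sketched as a pair of helpers. This is a minimal illustration of the nominal "index + 0x10000" encoding, not a reproduction of the decompiled branches; the names mercury_encode_type, mercury_decode_type, MERCURY_TYPE_BASE, and MERCURY_TYPE_COUNT are hypothetical and do not come from the binary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MERCURY_TYPE_BASE  0x10000u  /* offset added to the table index (hypothetical name) */
#define MERCURY_TYPE_COUNT 65u       /* table indices 0..64 */

/* Encode a Mercury table index as the raw ELF relocation type. */
uint32_t mercury_encode_type(uint32_t index)
{
    assert(index < MERCURY_TYPE_COUNT);
    return index + MERCURY_TYPE_BASE;
}

/* Decode a raw ELF relocation type into a table index.
   Returns false when the value lies outside the Mercury range. */
bool mercury_decode_type(uint32_t raw, uint32_t *index_out)
{
    if (raw < MERCURY_TYPE_BASE || raw - MERCURY_TYPE_BASE >= MERCURY_TYPE_COUNT)
        return false;
    *index_out = raw - MERCURY_TYPE_BASE;
    return true;
}
```

Under these assumptions, 0x10001 decodes to index 1 and 0x10041 is rejected as out of range.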

The ELF class byte at offset +7 of the linker context distinguishes Mercury from CUDA. The value 0x41 (ASCII 'A') indicates a Mercury ELF. When this byte is 0x41, the dispatcher sets the Mercury flag to 1; otherwise it sets 0x80000000 (the CUDA/traditional cubin flag). The AND with the e_flags word at context offset +48 determines which path is taken.
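The path selection reduces to a mask test, sketched below. The function name uses_mercury_table is hypothetical; the masks and the comparison mirror the decompiled snippet shown earlier, where a nonzero AND result selects the Mercury table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the masked e_flags bit is set, which is the condition
   the decompiled dispatcher uses to take the Mercury path. */
bool uses_mercury_table(uint8_t elf_class_byte, uint32_t e_flags)
{
    /* Class byte 0x41 ('A') tests bit 0; any other class byte tests bit 31. */
    uint32_t mask = (elf_class_byte == 0x41) ? 1u : 0x80000000u;
    return (mask & e_flags) != 0;
}
```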

Relocation Record Builder: sub_469B50

Before the relocation engine runs, the record builder at sub_469B50 creates linked-list entries for each relocation. This function also performs the table selection and index normalization:

// sub_469B50 at 0x469B50 -- builds relocation records
v6 = 0x80000000;
if (*(byte*)(context + 7) == 0x41)       // Mercury ELF?
    v6 = 1;

// Validate the relocation type against the target architecture
result = sub_42F6C0(reloc_type, arch_class, (v6 & e_flags) != 0, ...);
if (!result) return;

// Select descriptor table for action-type checking
if ((v6 & e_flags) != 0) {               // Mercury path
    normalized_index = reloc_type - 0x10000;
    table = &off_1D3CBE0;
    if (normalized_index <= 0x3F) {        // index < 64
        // Check if action type indicates a special reloc (codes 12-15)
        if ((uint32_t)(table[8 * normalized_index + 2].hi - 12) <= 3)
            sub_44C010(context, section_idx);   // mark section for special handling
    }
} else {                                  // CUDA path
    normalized_index = reloc_type;
    table = &off_1D3DBE0;
    if (normalized_index <= 0x73) {        // index < 116
        // Same action-type check for CUDA
        ...
    }
}

The action-type range check (action_type - 12) <= 3 catches types with action codes 12 through 15, which correspond to special relocation operations that require the target section to be marked for deferred processing.
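The single-comparison range check relies on unsigned wraparound. A minimal sketch (the function name is_special_action is hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

/* True only for action codes 12, 13, 14, and 15. */
bool is_special_action(uint32_t action_type)
{
    /* For action_type < 12 the subtraction wraps to a very large unsigned
       value, so one comparison enforces both the lower and upper bound. */
    return (uint32_t)(action_type - 12u) <= 3u;
}
```

This is the usual compiler idiom for `12 <= x && x <= 15`, which is why it shows up in decompiled output in this form.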

Naming Convention

R_MERCURY type names follow a simpler pattern than R_CUDA:

R_MERCURY_<category>[<bits>][_<bit_offset>][_LO|_HI]

Because Mercury targets a single 128-bit instruction format (not the variable 64/128-bit encodings of previous SASS generations), there is no need for per-bit-position suffixes. The components are:

  • Category: the semantic class of the relocation (ABS, G, PROG_REL, FUNC_DESC, UNIFIED, TEX, SAMP, SURF)
  • Bits: the width of the relocated value (8, 16, 32, 64)
  • Bit offset: for byte-level variants, the bit offset of the patched byte within the 64-bit word (_0, _8, _16, _24, _32, _40, _48, _56)
  • LO/HI: for split relocations, which half of a 32-bit value is patched (_LO, _HI)

Complete Type Catalog

Sentinel Types

| Index | ELF Type | Name | Description |
|---|---|---|---|
| 0 | 0x10000 | R_MERCURY_NONE | No relocation (placeholder / deleted entry) |
| 64 | 0x10040 | R_MERCURY_NONE_LAST | Sentinel marking end of valid type range |

R_MERCURY_NONE at index 0 serves the same purpose as R_CUDA_NONE: it is a no-op relocation used for entries that have been eliminated by dead code removal or relocation merging. The application engine returns immediately when it encounters a descriptor with all-zero action types. R_MERCURY_NONE_LAST at index 64 is a bounds sentinel -- the validation function rejects any type index >= 65.

Global Address Relocation

| Index | ELF Type | Name | Bits | Description |
|---|---|---|---|---|
| 1 | 0x10001 | R_MERCURY_G64 | 64 | 64-bit global memory address |

R_MERCURY_G64 is the primary relocation for global memory address references. It computes S + A (symbol value plus addend) and writes the full 64-bit result. Used for references to symbols in .nv.global and similar global data sections. This is the Mercury equivalent of R_CUDA_G64.

Absolute Data Relocations

| Index | ELF Type | Name | Bits | Description |
|---|---|---|---|---|
| 2 | 0x10002 | R_MERCURY_ABS64 | 64 | 64-bit absolute address |
| 3 | 0x10003 | R_MERCURY_ABS32 | 32 | 32-bit absolute address |
| 4 | 0x10004 | R_MERCURY_ABS16 | 16 | 16-bit absolute address |
| 5 | 0x10005 | R_MERCURY_ABS32_LO | 16 | Low 16 bits of 32-bit absolute |
| 6 | 0x10006 | R_MERCURY_ABS32_HI | 16 | High 16 bits of 32-bit absolute |

These relocations apply to data sections and instruction immediate fields using absolute addressing. The engine computes S + A and writes the result (or a 16-bit portion of it for _LO/_HI variants).

The ABS32_LO / ABS32_HI pair is used when a 32-bit absolute address must be split across two instruction immediates. One instruction loads the low 16 bits, the other loads the high 16 bits. The linker resolves both from the same symbol.

Unlike R_CUDA, which has separate types for each bit-position within the instruction word (R_CUDA_ABS32_20, R_CUDA_ABS32_23, R_CUDA_ABS32_26, R_CUDA_ABS32_32), the Mercury types are position-independent -- the bit-field offset is stored in the 64-byte descriptor entry.

PC-Relative Relocations

| Index | ELF Type | Name | Bits | Description |
|---|---|---|---|---|
| 7 | 0x10007 | R_MERCURY_PROG_REL64 | 64 | 64-bit PC-relative offset |
| 8 | 0x10008 | R_MERCURY_PROG_REL32 | 32 | 32-bit PC-relative offset |
| 9 | 0x10009 | R_MERCURY_PROG_REL32_LO | 16 | Low 16 bits of 32-bit PC-relative |
| 10 | 0x1000A | R_MERCURY_PROG_REL32_HI | 16 | High 16 bits of 32-bit PC-relative |

PC-relative relocations compute (S + A) - PC, where PC is the address of the instruction being patched. These are used for branch instructions (BRA, CALL, BRX) in Mercury's 128-bit instruction format.

The 64-bit variant (PROG_REL64) supports the full address space. The 32-bit variant limits branch distances to +/- 2 GB, sufficient for all practical kernel sizes. The _LO/_HI split variants handle cases where the PC-relative offset must be encoded in two separate instruction fields.

In the application engine, PC-relative relocations use action type 0x10 (pc_rel), which computes (int32_t)(S + A) - section_offset. The relocation phase function (sub_469D60) validates at line 409--410 that PC-relative branch targets reside in the same section:

if (descriptor[5] == 16 && rela_entry.section_idx != target_section_idx)
    fatal("PC relative branch address should be in the same section");
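The PC-relative computation and the same-section rule can be sketched together. This is an illustrative model only: the struct pcrel_input and the function pcrel_value are hypothetical, the result is kept at 64 bits rather than reproducing the engine's (int32_t) cast, and a false return stands in for the linker's fatal error.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inputs for one PC-relative relocation. */
struct pcrel_input {
    uint64_t symbol_value;    /* S */
    int64_t  addend;          /* A */
    uint64_t place;           /* PC: address of the instruction being patched */
    uint32_t reloc_section;   /* section holding the patched instruction */
    uint32_t target_section;  /* section holding the target symbol */
};

/* Computes (S + A) - PC. Fails when the target lies in another section. */
bool pcrel_value(const struct pcrel_input *in, int64_t *out)
{
    if (in->reloc_section != in->target_section)
        return false;  /* the real linker aborts with a fatal error here */
    *out = (int64_t)(in->symbol_value + (uint64_t)in->addend - in->place);
    return true;
}
```

For example, S = 0x1000, A = 0x10, and PC = 0x1200 give a backward offset of -0x1F0.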

Texture, Sampler, and Surface Relocations

| Index | ELF Type | Name | Description |
|---|---|---|---|
| 11 | 0x1000B | R_MERCURY_TEX_HEADER_INDEX | Texture header table index |
| 12 | 0x1000C | R_MERCURY_SAMP_HEADER_INDEX | Sampler header table index |
| 13 | 0x1000D | R_MERCURY_SURF_HEADER_INDEX | Surface header table index |

These relocations resolve bindable resource references -- texture, sampler, and surface objects -- by patching the merged header table index into instruction fields. During linking, individual per-module header tables are merged into a single global table; these relocations update the instruction operands to reference the correct entry in the merged table.

These are direct equivalents of R_CUDA_TEX_HEADER_INDEX, R_CUDA_SAMP_HEADER_INDEX, and R_CUDA_SURF_HEADER_INDEX. Mercury drops the SLOT, HW_DESC, and HW_SW_DESC variants that exist in R_CUDA, consolidating texture resource binding into the header index mechanism alone.

Clear Relocation

| Index | ELF Type | Name | Bits | Description |
|---|---|---|---|---|
| 14 | 0x1000E | R_MERCURY_UNUSED_CLEAR64 | 64 | Zero 64 bits (clear unused field) |

Writes zeros to a 64-bit field. Used to clear placeholder entries in merged data sections or nullify unused instruction fields after linking. Uses action types 0x13 or 0x14 (clear), which simply AND-zero the target bit-field without computing any symbol value. Equivalent to R_CUDA_UNUSED_CLEAR64.

Function Descriptor Relocation

| Index | ELF Type | Name | Bits | Description |
|---|---|---|---|---|
| 15 | 0x1000F | R_MERCURY_FUNC_DESC_64 | 64 | 64-bit function descriptor reference |

Resolves a reference to a function descriptor entry. Function descriptors are used for indirect calls, virtual function tables, and device-side function pointers. The relocation writes the 64-bit address of the descriptor entry.

Byte-Level Relocations (R_MERCURY_8_*)

| Index | ELF Type | Name | Byte offset | Description |
|---|---|---|---|---|
| 16 | 0x10010 | R_MERCURY_8_0 | 0 | Byte at bit offset 0 |
| 17 | 0x10011 | R_MERCURY_8_8 | 1 | Byte at bit offset 8 |
| 18 | 0x10012 | R_MERCURY_8_16 | 2 | Byte at bit offset 16 |
| 19 | 0x10013 | R_MERCURY_8_24 | 3 | Byte at bit offset 24 |
| 20 | 0x10014 | R_MERCURY_8_32 | 4 | Byte at bit offset 32 |
| 21 | 0x10015 | R_MERCURY_8_40 | 5 | Byte at bit offset 40 |
| 22 | 0x10016 | R_MERCURY_8_48 | 6 | Byte at bit offset 48 |
| 23 | 0x10017 | R_MERCURY_8_56 | 7 | Byte at bit offset 56 |

Byte-granularity relocations that patch a single 8-bit byte at a specific offset within a 64-bit data word. The eight types cover all byte positions within a 64-bit value. These are used for patching descriptor tables, attribute sections, and other data structures where individual bytes must be resolved separately.

Direct equivalents of R_CUDA_8_0 through R_CUDA_8_56.

Global Byte-Level Relocations (R_MERCURY_G8_*)

| Index | ELF Type | Name | Byte offset | Description |
|---|---|---|---|---|
| 24 | 0x10018 | R_MERCURY_G8_0 | 0 | Global byte at bit offset 0 |
| 25 | 0x10019 | R_MERCURY_G8_8 | 1 | Global byte at bit offset 8 |
| 26 | 0x1001A | R_MERCURY_G8_16 | 2 | Global byte at bit offset 16 |
| 27 | 0x1001B | R_MERCURY_G8_24 | 3 | Global byte at bit offset 24 |
| 28 | 0x1001C | R_MERCURY_G8_32 | 4 | Global byte at bit offset 32 |
| 29 | 0x1001D | R_MERCURY_G8_40 | 5 | Global byte at bit offset 40 |
| 30 | 0x1001E | R_MERCURY_G8_48 | 6 | Global byte at bit offset 48 |
| 31 | 0x1001F | R_MERCURY_G8_56 | 7 | Global byte at bit offset 56 |

Byte-granularity relocations for global memory addresses. These are the byte-level counterparts of R_MERCURY_G64 -- instead of writing a full 64-bit global address, they write a single byte of the address at a specific position. Used when a global address must be assembled byte-by-byte in a data structure.

Direct equivalents of R_CUDA_G8_0 through R_CUDA_G8_56.

Function Descriptor Byte-Level Relocations (R_MERCURY_FUNC_DESC_8_*)

| Index | ELF Type | Name | Byte offset | Description |
|---|---|---|---|---|
| 32 | 0x10020 | R_MERCURY_FUNC_DESC_8_0 | 0 | Descriptor byte at bit offset 0 |
| 33 | 0x10021 | R_MERCURY_FUNC_DESC_8_8 | 1 | Descriptor byte at bit offset 8 |
| 34 | 0x10022 | R_MERCURY_FUNC_DESC_8_16 | 2 | Descriptor byte at bit offset 16 |
| 35 | 0x10023 | R_MERCURY_FUNC_DESC_8_24 | 3 | Descriptor byte at bit offset 24 |
| 36 | 0x10024 | R_MERCURY_FUNC_DESC_8_32 | 4 | Descriptor byte at bit offset 32 |
| 37 | 0x10025 | R_MERCURY_FUNC_DESC_8_40 | 5 | Descriptor byte at bit offset 40 |
| 38 | 0x10026 | R_MERCURY_FUNC_DESC_8_48 | 6 | Descriptor byte at bit offset 48 |
| 39 | 0x10027 | R_MERCURY_FUNC_DESC_8_56 | 7 | Descriptor byte at bit offset 56 |

Byte-level relocations for function descriptor addresses in data sections. These patch individual bytes of a 64-bit function descriptor reference, mirroring the R_MERCURY_FUNC_DESC_64 / byte-level split pattern used by the other relocation families.

Direct equivalents of R_CUDA_FUNC_DESC_8_0 through R_CUDA_FUNC_DESC_8_56.

Absolute PC-Relative Relocations

| Index | ELF Type | Name | Bits | Description |
|---|---|---|---|---|
| 40 | 0x10028 | R_MERCURY_ABS_PROG_REL32_LO | 16 | Low 16 bits of absolute PC-relative 32-bit |
| 41 | 0x10029 | R_MERCURY_ABS_PROG_REL32_HI | 16 | High 16 bits of absolute PC-relative 32-bit |
| 60 | 0x1003C | R_MERCURY_ABS_PROG_REL32 | 32 | 32-bit absolute PC-relative offset |
| 61 | 0x1003D | R_MERCURY_ABS_PROG_REL64 | 64 | 64-bit absolute PC-relative offset |

These are hybrid relocations that combine absolute and PC-relative semantics. The computation is |S + A - PC| or a variant that uses the absolute value of the PC-relative offset, used in instruction encodings that require an unsigned distance rather than a signed offset.
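One plausible reading of the |S + A - PC| formula is an unsigned distance between the target and the patched instruction. The sketch below assumes that reading; the function name abs_pcrel_distance is hypothetical, and the engine may implement a different variant.

```c
#include <stdint.h>

/* Unsigned distance between the relocation target (S + A) and PC,
   computed without relying on signed overflow. */
uint64_t abs_pcrel_distance(uint64_t s, int64_t a, uint64_t pc)
{
    uint64_t target = s + (uint64_t)a;
    return (target >= pc) ? target - pc : pc - target;
}
```

With this definition a forward and a backward reference of the same length yield the same value, so the direction would have to be carried by the instruction encoding itself.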

The _LO/_HI split variants (indices 40--41) appear earlier in the table than the full-width variants (indices 60--61), suggesting they were added in an earlier revision and the full-width types were appended later.

No direct R_CUDA equivalent exists for these types. They appear to be Mercury-specific additions for the 128-bit instruction format's distance-based addressing modes.

PC-Relative Byte-Level Relocations (R_MERCURY_PROG_REL8_*)

| Index | ELF Type | Name | Byte offset | Description |
|---|---|---|---|---|
| 42 | 0x1002A | R_MERCURY_PROG_REL8_0 | 0 | PC-relative byte at bit offset 0 |
| 43 | 0x1002B | R_MERCURY_PROG_REL8_8 | 1 | PC-relative byte at bit offset 8 |
| 44 | 0x1002C | R_MERCURY_PROG_REL8_16 | 2 | PC-relative byte at bit offset 16 |
| 45 | 0x1002D | R_MERCURY_PROG_REL8_24 | 3 | PC-relative byte at bit offset 24 |
| 46 | 0x1002E | R_MERCURY_PROG_REL8_32 | 4 | PC-relative byte at bit offset 32 |
| 47 | 0x1002F | R_MERCURY_PROG_REL8_40 | 5 | PC-relative byte at bit offset 40 |
| 48 | 0x10030 | R_MERCURY_PROG_REL8_48 | 6 | PC-relative byte at bit offset 48 |
| 49 | 0x10031 | R_MERCURY_PROG_REL8_56 | 7 | PC-relative byte at bit offset 56 |

Byte-level PC-relative relocations. These compute (S + A) - PC and then extract a single byte at the specified offset from the result. Used for data structures that assemble PC-relative offsets byte-by-byte.

No direct R_CUDA equivalent exists. These are Mercury-specific additions for use in data tables (jump tables, exception tables) where PC-relative distances are stored in byte-addressable format.

Unified Table Relocations (R_MERCURY_UNIFIED_*)

| Index | ELF Type | Name | Bits | Description |
|---|---|---|---|---|
| 50 | 0x10032 | R_MERCURY_UNIFIED | special | Unified table reference (generic marker) |
| 51 | 0x10033 | R_MERCURY_UNIFIED_32 | 32 | 32-bit unified table offset |
| 52 | 0x10034 | R_MERCURY_UNIFIED_8_0 | 8 | Unified byte at bit offset 0 |
| 53 | 0x10035 | R_MERCURY_UNIFIED_8_8 | 8 | Unified byte at bit offset 8 |
| 54 | 0x10036 | R_MERCURY_UNIFIED_8_16 | 8 | Unified byte at bit offset 16 |
| 55 | 0x10037 | R_MERCURY_UNIFIED_8_24 | 8 | Unified byte at bit offset 24 |
| 56 | 0x10038 | R_MERCURY_UNIFIED_8_32 | 8 | Unified byte at bit offset 32 |
| 57 | 0x10039 | R_MERCURY_UNIFIED_8_40 | 8 | Unified byte at bit offset 40 |
| 58 | 0x1003A | R_MERCURY_UNIFIED_8_48 | 8 | Unified byte at bit offset 48 |
| 59 | 0x1003B | R_MERCURY_UNIFIED_8_56 | 8 | Unified byte at bit offset 56 |
| 62 | 0x1003E | R_MERCURY_UNIFIED32_LO | 16 | Low 16 bits of 32-bit unified offset |
| 63 | 0x1003F | R_MERCURY_UNIFIED32_HI | 16 | High 16 bits of 32-bit unified offset |

Unified table relocations handle references to the Unified Descriptor Table (UDT) and Unified Function Table (UFT). These tables are used for CUDA Dynamic Parallelism and indirect function calls.

R_MERCURY_UNIFIED (index 50) is a generic marker type, not a data-patching relocation. During the relocation phase, unified relocations targeting synthetic symbols (__UFT_OFFSET, __UDT_OFFSET, __UFT_CANONICAL, __UDT_CANONICAL, __UDT, __UFT, __UFT_END, __UDT_END) are resolved to type 0 (no-op) because the unified table manager computes final offsets before the relocation engine runs. The code in sub_469D60 lines 377--395 explicitly checks for the __UFT_OFFSET symbol by string comparison and emits a debug trace when matched:

// Check if symbol name is "__UFT_OFFSET" (13-byte comparison)
if (strncmp(symbol_name, "__UFT_OFFSET", 13) == 0) {
    if (debug_flags & 4)
        fwrite("ignore reloc on UFT_OFFSET\n", 1, 0x1B, stderr);
    rela_entry.type = 0;   // convert to R_MERCURY_NONE
    ...
}

The UNIFIED32_LO and UNIFIED32_HI types (indices 62--63) appear after the ABS_PROG_REL types in the table rather than adjacent to the other unified types (50--59). This suggests they were added in a later revision.

Two additional trailing-space variants ("R_MERCURY_UNIFIED_8_0 " and "R_MERCURY_UNIFIED_8_8 ") appear at separate string addresses (0x1D3CB71 and 0x1D3CB88) in the Mercury descriptor table region. These are duplicate name strings used by the descriptor table entries themselves, distinct from the primary name table entries (which lack the trailing space). The trailing space is a formatting artifact in the binary's .rodata, not a distinct relocation type.

The Sub-Byte Relocation Mechanism

Mercury's byte-level relocation families (R_MERCURY_8_*, R_MERCURY_G8_*, R_MERCURY_FUNC_DESC_8_*, R_MERCURY_PROG_REL8_*, R_MERCURY_UNIFIED_8_*) each provide eight variants covering every byte position within a 64-bit word. Four of the five families have direct R_CUDA counterparts; only R_MERCURY_PROG_REL8_* is new in Mercury. The mechanism works as follows:

How Sub-Byte Patching Works

  1. Full address computation: The engine first computes the full relocation value (S + A for absolute, (S + A) - PC for PC-relative).

  2. Byte extraction: The descriptor's action type encodes which byte of the 64-bit result to extract. The action types 0x16--0x1D and 0x2F--0x36 (masked_shift operations) use paired SSE constant vectors loaded from xmmword_1D3F8E0--xmmword_1D3F930. These contain 64-bit masks and shift counts indexed by (action_type - 22):

// sub_468760, case 0x16..0x1D / 0x2F..0x36 (masked_shift)
int idx = action_type - 22;
uint64_t mask  = mask_table[idx];   // from xmmword_1D3F920/xmmword_1D3F930
uint64_t shift = shift_table[idx];  // from xmmword_1D3F8E0..xmmword_1D3F910
value = (value & mask) >> shift;

  3. Bit-field insertion: The extracted byte is written into the target using the descriptor's bit_offset and bit_width fields, via the sub_4685B0 helper.
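The three steps above can be sketched end to end for an absolute byte-level relocation. This is a simplified model under stated assumptions: the mask and shift are computed inline rather than loaded from the SSE constant tables, the insertion handles a single 64-bit word only, and the names extract_byte, insert_field, and patch_abs_byte are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Step 2: isolate the byte that starts at bit `bit_off` (0, 8, ..., 56). */
uint64_t extract_byte(uint64_t value, unsigned bit_off)
{
    assert(bit_off < 64 && bit_off % 8 == 0);
    uint64_t mask = 0xFFull << bit_off;
    return (value & mask) >> bit_off;
}

/* Step 3: splice `field` into bits [dst_off, dst_off + width) of `word`. */
uint64_t insert_field(uint64_t word, uint64_t field, unsigned dst_off, unsigned width)
{
    assert(width >= 1 && width <= 64 && dst_off + width <= 64);
    uint64_t ones = (width == 64) ? ~0ull : ((1ull << width) - 1);
    return (word & ~(ones << dst_off)) | ((field & ones) << dst_off);
}

/* Steps 1-3 for a type such as R_MERCURY_8_16 (src_bit_off = 16). */
uint64_t patch_abs_byte(uint64_t word, uint64_t s, int64_t a,
                        unsigned src_bit_off, unsigned dst_off)
{
    uint64_t value = s + (uint64_t)a;                    /* step 1: S + A */
    uint64_t byte  = extract_byte(value, src_bit_off);   /* step 2 */
    return insert_field(word, byte, dst_off, 8);         /* step 3 */
}
```

For a PC-relative family the only change in this model would be step 1, which would subtract PC before the byte is extracted.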

Multi-Word Boundary Handling

Both the extraction (sub_468670) and write (sub_4685B0) helpers handle bit-fields that span 64-bit word boundaries -- a common situation with 128-bit Mercury instructions. The extraction function is recursive:

// sub_468670 -- bitfield_extract (simplified)
int64_t bitfield_extract(uint64_t* words, int bit_offset, int bit_width) {
    if (bit_offset >= 64) {
        words += bit_offset / 64;
        bit_offset %= 64;
    }
    int end_bit = bit_offset + bit_width;
    if (end_bit <= 64)
        return *words << (64 - end_bit) >> (64 - bit_width);

    // Recursive split: low part from this word, high part from next
    int64_t low  = bitfield_extract(words, bit_offset, 64 - bit_offset);
    int64_t high = words[1] << (128 - end_bit) >> (64 - (end_bit - 64));
    return low | (high << (64 - bit_offset));
}

The write function (sub_4685B0) similarly handles multi-word spans with an iterative loop that processes one 64-bit word at a time, shifting the value right after each partial write.
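A word-at-a-time write loop of the kind described could look like the following. It is a sketch written from the description, not a transcription of sub_4685B0; the function name bitfield_write and its parameter order are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Write the low `bit_width` bits of `value` into `words`, starting at
   absolute bit `bit_offset`, crossing 64-bit word boundaries as needed. */
void bitfield_write(uint64_t *words, unsigned bit_offset, unsigned bit_width,
                    uint64_t value)
{
    assert(bit_width >= 1 && bit_width <= 64);
    words += bit_offset / 64;
    bit_offset %= 64;

    while (bit_width > 0) {
        unsigned room  = 64 - bit_offset;                    /* bits left in this word */
        unsigned chunk = bit_width < room ? bit_width : room;
        uint64_t ones  = (chunk == 64) ? ~0ull : ((1ull << chunk) - 1);

        /* Clear the destination bits, then OR in the next chunk of the value. */
        *words = (*words & ~(ones << bit_offset)) | ((value & ones) << bit_offset);

        value = (chunk == 64) ? 0 : value >> chunk;          /* drop the bits just written */
        bit_width -= chunk;
        bit_offset = 0;                                      /* next word starts at bit 0 */
        ++words;
    }
}
```

Writing an 8-bit value at bit offset 60, for instance, places its low nibble in the top four bits of the first word and its high nibble in the bottom four bits of the second.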

Why Sub-Byte Relocations Exist

The 8-bit relocations serve data sections where 64-bit addresses are stored in structures with byte-aligned fields. Rather than requiring the compiler to emit aligned 64-bit relocations, the linker can patch individual bytes. This is particularly useful for:

  • Descriptor tables: Hardware descriptor entries that pack multiple fields into bytes
  • Jump tables: Array entries stored as byte-packed offsets
  • Constant buffer initialization: Per-byte patching of constant bank data
  • Debug section data: DWARF entries with byte-level address references

The PC-relative byte-level family (R_MERCURY_PROG_REL8_*) is unique to Mercury and has no R_CUDA counterpart. It enables byte-by-byte assembly of PC-relative offsets in data tables, which previous SASS generations did not support.

Descriptor Table Structure

The Mercury descriptor table at off_1D3CBE0 has the same 64-byte-per-entry format as the CUDA table at off_1D3DBE0. Each entry encodes up to three patching actions:

Entry (64 bytes):
  +0   Header (12 bytes)
       +0   uint32_t  field_0;      // extraction spec for --preserve-relocs
       +4   uint32_t  field_1;      // extraction spec
       +8   uint32_t  field_2;      // extraction spec / action count hint
  +12  Action 0 (16 bytes)
       +12  uint32_t  bit_offset;   // start bit in instruction/data word
       +16  uint32_t  bit_width;    // number of bits to patch
       +20  uint32_t  action_type;  // operation code (see table below)
       +24  uint32_t  reserved;     // flags / padding
  +28  Action 1 (16 bytes)
       +28  uint32_t  bit_offset;
       +32  uint32_t  bit_width;
       +36  uint32_t  action_type;
       +40  uint32_t  reserved;
  +44  Action 2 (16 bytes)
       +44  uint32_t  bit_offset;
       +48  uint32_t  bit_width;
       +52  uint32_t  action_type;
       +56  uint32_t  reserved;
  +60  Sentinel (4 bytes, marks end of action array)

The application engine (sub_468760) indexes into the table and sets up its action pointer and sentinel:

descriptor_base = table + (type_index << 6);  // type_index * 64
action_ptr = descriptor_base + 12;             // first action at byte +12
end_ptr    = descriptor_base + 60;             // sentinel at byte +60
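The layout above maps naturally onto a pair of C structs, with compile-time checks confirming the 64-byte size and the +12 / +60 offsets. The type and function names (reloc_action, reloc_descriptor, descriptor_at) are hypothetical; only the field offsets come from the layout shown.

```c
#include <stddef.h>
#include <stdint.h>

struct reloc_action {
    uint32_t bit_offset;   /* start bit in the instruction/data word */
    uint32_t bit_width;    /* number of bits to patch */
    uint32_t action_type;  /* operation code */
    uint32_t reserved;     /* flags / padding */
};

struct reloc_descriptor {
    uint32_t header[3];              /* +0:  field_0, field_1, field_2 */
    struct reloc_action actions[3];  /* +12: up to three actions */
    uint32_t sentinel;               /* +60: end of action array */
};

_Static_assert(sizeof(struct reloc_descriptor) == 64, "descriptor must be 64 bytes");
_Static_assert(offsetof(struct reloc_descriptor, actions) == 12, "actions start at +12");
_Static_assert(offsetof(struct reloc_descriptor, sentinel) == 60, "sentinel sits at +60");

/* Same arithmetic as the engine: table base + (type_index << 6). */
const struct reloc_descriptor *descriptor_at(const struct reloc_descriptor *table,
                                             uint32_t type_index)
{
    const unsigned char *base = (const unsigned char *)table;
    return (const struct reloc_descriptor *)(base + ((size_t)type_index << 6));
}
```

Because every field is a 32-bit integer, no padding is inserted and the shift by 6 is equivalent to ordinary array indexing.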

Action Types Used by Mercury Descriptors

The action types are identical between CUDA and Mercury -- the engine is shared. Mercury descriptors use the following subset:

| Code | Name | Computation | Mercury Usage |
|---|---|---|---|
| 0x00 | end | Skip slot, terminate at sentinel | All types (terminates unused slots) |
| 0x01 | abs_full | S + A -- store all bits | ABS64, ABS32, G64, FUNC_DESC_64 |
| 0x06 | abs_lo | (S + A) & mask -- low bits | ABS32_LO, UNIFIED32_LO |
| 0x07 | abs_hi | ((S + A) >> 32) & mask -- high bits | ABS32_HI, UNIFIED32_HI |
| 0x09 | abs_shifted | (S + A) >> 2 -- 4-byte aligned | PROG_REL32 (shifted branch offset) |
| 0x10 | pc_rel | (int32_t)(S + A) - PC | PROG_REL32, PROG_REL64 |
| 0x12 | abs_full (alias) | Same as 0x01 | UNIFIED, UNIFIED_32 |
| 0x13 | clear | Zero the bit-field | UNUSED_CLEAR64 |
| 0x14 | clear (alias) | Same as 0x13 | UNUSED_CLEAR64 (secondary action) |
| 0x16--0x1D | masked_shift_0..7 | (value & mask) >> shift | 8_*, G8_*, FUNC_DESC_8_* |
| 0x2E | abs_full (alias) | Same as 0x01 | UNIFIED_8_* |
| 0x2F--0x36 | masked_shift_8..15 | (value & mask) >> shift | UNIFIED_8_*, PROG_REL8_* |
| 0x37 | abs_lo (alias) | Same as 0x06 | PROG_REL32_LO, ABS_PROG_REL32_LO |
| 0x38 | abs_hi (alias) | Same as 0x07 | PROG_REL32_HI, ABS_PROG_REL32_HI |

Mercury does not use action types 0x08 (abs_size), 0x0A (sec_type_lo), or 0x0B (sec_type_hi), which are R_CUDA-specific operations for section-type patching and symbol-size computation.

Difference from R_CUDA Descriptors

Although the table format is identical, the descriptor contents differ from R_CUDA because Mercury's 128-bit instruction words have different field layouts. A relocation that places a 32-bit absolute address at bit position 20 in a 64-bit CUDA instruction will place it at a different bit position in a 128-bit Mercury instruction -- and this difference is encoded in the descriptor's bit_offset field, not in the type name.

The shared application engine sub_468760 is type-agnostic: it simply reads the descriptor, executes the action sequence, and patches the bit fields. The distinction between CUDA and Mercury is entirely in which descriptor table is selected at dispatch time.
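The type-agnostic nature of the engine can be illustrated with a heavily reduced interpreter loop. This sketch models only four action codes, patches a single 64-bit word, and omits the (int32_t) cast used by the real pc_rel action; the names apply_actions, put_bits, and the ACT_* constants are hypothetical labels for codes taken from the table above.

```c
#include <stdbool.h>
#include <stdint.h>

enum { ACT_END = 0x00, ACT_ABS_FULL = 0x01, ACT_PC_REL = 0x10, ACT_CLEAR = 0x13 };

struct action { uint32_t bit_offset, bit_width, action_type, reserved; };

/* Replace bits [off, off + width) of *word; single-word fields only. */
static void put_bits(uint64_t *word, uint32_t off, uint32_t width, uint64_t v)
{
    uint64_t ones = (width >= 64) ? ~0ull : ((1ull << width) - 1);
    *word = (*word & ~(ones << off)) | ((v & ones) << off);
}

/* Run up to three actions against one word. Returns false for any action
   code that this sketch does not model. */
bool apply_actions(const struct action acts[3], uint64_t *word,
                   uint64_t s, int64_t a, uint64_t pc)
{
    for (int i = 0; i < 3; ++i) {
        const struct action *act = &acts[i];
        uint64_t sa = s + (uint64_t)a;
        switch (act->action_type) {
        case ACT_END:      continue;  /* unused slot */
        case ACT_ABS_FULL: put_bits(word, act->bit_offset, act->bit_width, sa);      break;
        case ACT_PC_REL:   put_bits(word, act->bit_offset, act->bit_width, sa - pc); break;
        case ACT_CLEAR:    put_bits(word, act->bit_offset, act->bit_width, 0);       break;
        default:           return false;
        }
    }
    return true;
}
```

Nothing in the loop refers to a relocation type name: the same code serves CUDA and Mercury, and only the action array passed in differs.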

Mercury Relocations and FNLZR Finalization

Mercury relocations interact with the FNLZR (Finalizer) subsystem at two critical points in the pipeline.

Pre-Link Mode

In pre-link mode (sub_4275C0 with a5 == 0), FNLZR processes individual cubins before they enter the merge phase. Mercury relocations at this stage are stored in .nv.merc.rela sections and reference the Mercury-private symbol table. The pre-link pass:

  1. Validates the ELF class byte is 0x41 (Mercury)
  2. Checks that finalization has not already been applied ((flags >> 2) & 1 must be 0)
  3. Passes the cubin through the sub_4748F0 engine, which may rewrite relocations during opex expansion
  4. Outputs a transformed cubin with potentially modified relocation entries

Post-Link Mode: Final Application

In post-link mode (sub_4275C0 with a5 == 1), FNLZR runs after the merge phase has applied all standard relocations. The post-link path:

  1. Confirms the SASS-present or capmerc bit is set in e_flags
  2. Invokes sub_4748F0 for the capmerc-to-SASS transformation
  3. The engine re-resolves any remaining Mercury relocations against the final symbol table
  4. Outputs the final SASS binary with all relocations fully applied

The FNLZR diagnostic output shows the relocation interaction:

FNLZR: Input ELF: <filename>
FNLZR: Post-Link Mode
FNLZR: Flags [ 1 | 0 ]           // capmerc=1, sass-only=0
FNLZR: Starting <filename>
  ... mercury relocation application ...
FNLZR: Ending <filename>

Merge Phase: Section Skipping

During the merge phase (sub_45E7D0), Mercury-specific sections are conditionally skipped and deferred to FNLZR. The function emits a debug trace when skipping:

// sub_45E7D0 at 0x45F624
fprintf(stderr, "skip mercury section %i\n", section_index);

This skip logic is gated by the 0x10000000 flag in the section's sh_flags field, which marks Mercury-specific sections that should not be merged by the standard section merger.

Relocation Phase: Dead Code and YIELD Handling

The relocation phase (sub_469D60) contains Mercury-specific dead-code and YIELD-conversion logic:

  1. Dead function elimination: When a relocation targets a dead function (binding type 1 with no name), the reloc type is zeroed and a debug trace is emitted:

    fprintf(stderr, "ignore reloc on dead func %s\n", symbol_name);
    rela_entry.type = 0;   // convert to R_MERCURY_NONE
    
  2. YIELD instruction conversion: Relocation types 68--69 (0x10044--0x10045, which map to R_CUDA YIELD types when using Mercury indexing) trigger special handling. When the forward-progress requirement flag (context + 94) is set, the linker ignores the YIELD-to-NOP conversion relocation:

    fwrite("Ignoring the reloc to convert YIELD to NOP due to forward progress requirement.\n",
           1, 0x50, stderr);
    

Capmerc Self-Check and Uplift

Self-Check Infrastructure

nvlink includes a self-check mode activated by --self-check (registered in sub_4AC380 at 0x4AC380). When active, the linker verifies that Mercury relocation application produces correct results by comparing the patched output against expected values. The diagnostic messages and their binary addresses:

| String Address | Message | Triggered By |
|---|---|---|
| 0x1D41EC8 | "Self check for capsule mercury (capmerc)" | sub_4AC380 (option registration) |
| 0x2458F38 | "Self check for capsule mercury text section failed" | Self-check comparison, .text mismatch |
| 0x2458F70 | "Self check for capsule mercury debug section failed" | Self-check comparison, debug section mismatch |
| 0x2458FA8 | "Self check for capsule mercury relocation section failed" | Self-check comparison, .rela section mismatch |
| 0x2458FE8 | "Invalid elf provided for mercury uplift." | Bad ELF input for Mercury conversion |
| 0x1F44288 | "Failure of '%s' section in self-check for capsule mercury..." | Generic section failure (Jira MERCSW-125) |

The self-check strings at 0x2458F38--0x2458FE8 are organized as a pointer array at 0x24590A0--0x24590B8, suggesting a table-driven diagnostic approach where section indices map to error messages.

Mercury Uplift

The "mercury uplift" path converts non-Mercury binaries to Mercury format. The sub_4AC380 function also registers related options:

| CLI Flag | Description |
|---|---|
| --binary-kind | mercury, capmerc, or sass -- specifies target binary format |
| --cap-merc | Force Capsule Mercury generation |
| --self-check | Enable self-check verification |
| --out-sass | Generate reconstituted SASS through self-check |
| --compile-as-at-entry-patch | Compile patch as at-entry fragment |
| --opportunistic-finalization-lvl | Control finalization aggressiveness (default 0) |

Third Name Table at 0x2459160

The R_MERCURY_NONE string at 0x1D35A17 has three xrefs: the primary name table (0x1D371E0), the descriptor table (0x1D3CBE0), and a third pointer at 0x2459160. This third reference is in the capmerc uplift/self-check code region and represents a separate copy of the name table used by the Mercury-to-SASS reconstitution path for relocation name lookup during verification.

R_MERCURY vs R_CUDA Comparison

The following table maps each R_MERCURY type to its closest R_CUDA equivalent. Mercury's type set largely overlaps R_CUDA's semantic categories: it omits the instruction-encoding-specific types and adds a small number of Mercury-specific ones, marked "--" in the R_CUDA column.

| R_MERCURY | Index | R_CUDA Equivalent | Notes |
|---|---|---|---|
| R_MERCURY_NONE | 0 | R_CUDA_NONE | Identical semantics |
| R_MERCURY_G64 | 1 | R_CUDA_G64 | Identical semantics |
| R_MERCURY_ABS64 | 2 | R_CUDA_64 | R_CUDA uses plain _64 |
| R_MERCURY_ABS32 | 3 | R_CUDA_32 | R_CUDA uses plain _32 |
| R_MERCURY_ABS16 | 4 | R_CUDA_ABS16_* | Mercury: position-independent; CUDA: per-bit-position |
| R_MERCURY_ABS32_LO | 5 | R_CUDA_ABS32_LO_* | Mercury: 1 type; CUDA: 4 types (per bit-position) |
| R_MERCURY_ABS32_HI | 6 | R_CUDA_ABS32_HI_* | Mercury: 1 type; CUDA: 4 types |
| R_MERCURY_PROG_REL64 | 7 | -- | No direct CUDA equivalent |
| R_MERCURY_PROG_REL32 | 8 | R_CUDA_PCREL_IMM24_* | CUDA uses 24-bit; Mercury uses 32-bit |
| R_MERCURY_PROG_REL32_LO | 9 | -- | Mercury-specific |
| R_MERCURY_PROG_REL32_HI | 10 | -- | Mercury-specific |
| R_MERCURY_TEX_HEADER_INDEX | 11 | R_CUDA_TEX_HEADER_INDEX | Identical semantics |
| R_MERCURY_SAMP_HEADER_INDEX | 12 | R_CUDA_SAMP_HEADER_INDEX | Identical semantics |
| R_MERCURY_SURF_HEADER_INDEX | 13 | R_CUDA_SURF_HEADER_INDEX | Identical semantics |
| R_MERCURY_UNUSED_CLEAR64 | 14 | R_CUDA_UNUSED_CLEAR64 | Identical semantics |
| R_MERCURY_FUNC_DESC_64 | 15 | R_CUDA_FUNC_DESC_64 | Identical semantics |
| R_MERCURY_8_* | 16--23 | R_CUDA_8_* | Identical semantics |
| R_MERCURY_G8_* | 24--31 | R_CUDA_G8_* | Identical semantics |
| R_MERCURY_FUNC_DESC_8_* | 32--39 | R_CUDA_FUNC_DESC_8_* | Identical semantics |
| R_MERCURY_ABS_PROG_REL32_LO | 40 | -- | Mercury-specific |
| R_MERCURY_ABS_PROG_REL32_HI | 41 | -- | Mercury-specific |
| R_MERCURY_PROG_REL8_* | 42--49 | -- | Mercury-specific (no CUDA byte-level PC-rel) |
| R_MERCURY_UNIFIED | 50 | R_CUDA_UNIFIED | Identical semantics |
| R_MERCURY_UNIFIED_32 | 51 | R_CUDA_UNIFIED_32 | Identical semantics |
| R_MERCURY_UNIFIED_8_* | 52--59 | R_CUDA_UNIFIED_8_* | Identical semantics |
| R_MERCURY_ABS_PROG_REL32 | 60 | -- | Mercury-specific |
| R_MERCURY_ABS_PROG_REL64 | 61 | -- | Mercury-specific |
| R_MERCURY_UNIFIED32_LO | 62 | R_CUDA_UNIFIED32_LO_32 | CUDA includes bit-position |
| R_MERCURY_UNIFIED32_HI | 63 | R_CUDA_UNIFIED32_HI_32 | CUDA includes bit-position |
| R_MERCURY_NONE_LAST | 64 | R_CUDA_NONE_LAST | Identical semantics |

Design Philosophy: Position-Independent Types

The central design difference between R_MERCURY and R_CUDA relocations is the elimination of bit-position encoding from the type name. R_CUDA has 119 types because the same semantic operation (e.g., "patch a 32-bit absolute address into an instruction") requires separate type numbers for each instruction generation's bit-field layout. Mercury consolidates these into single types, pushing the bit-position information into the descriptor table.

This design has several consequences:

  1. Type count reduction: 65 types vs 119, despite adding 15 new types (64-bit PC-relative, split PC-relative, byte-level PC-relative, absolute PC-relative)
  2. Forward compatibility: New Mercury instruction encodings with different bit-field positions need only new descriptor table entries, not new type numbers
  3. Simpler compiler output: The compiler emits generic relocation types and the linker's descriptor table provides the architecture-specific bit-field mapping
  4. Shared engine: The sub_468760 engine works identically for both CUDA and Mercury because the bit-field information is always in the descriptor, never in the type-based code path
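
The table-selection step behind the shared engine can be sketched as follows. This is a minimal model, not the decompiled logic: the function name `select_descriptor` and the returned tuple shape are illustrative, the exact boundary of the decompiled "unexpected reloc" guard is not reproduced, and no bounds check is modeled for the CUDA path. The constants (0x10000 offset, 65 entries, 64-byte descriptors, class byte 0x41) come from this page.

```python
MERCURY_TYPE_BASE = 0x10000   # ELF type = table index + 0x10000
MERCURY_TYPE_COUNT = 65       # Mercury table indices 0..64
MERCURY_CLASS_BYTE = 0x41     # ELF class byte 'A' marks a capmerc object


def select_descriptor(elf_class_byte: int, r_type: int) -> tuple:
    """Return (table_name, index, byte_offset_into_table) for a relocation type.

    Hypothetical helper: it only models the choice between the Mercury and
    CUDA descriptor tables and the `index << 6` descriptor addressing.
    """
    if elf_class_byte == MERCURY_CLASS_BYTE:
        index = r_type - MERCURY_TYPE_BASE
        if not 0 <= index < MERCURY_TYPE_COUNT:
            raise ValueError(f"unexpected reloc type {r_type:#x}")
        table = "mercury"
    else:
        # CUDA types are used as table indices directly.
        index = r_type
        table = "cuda"
    # Descriptors are 64 bytes each, so the byte offset is index << 6.
    return table, index, index << 6


print(select_descriptor(0x41, 0x10008))
```

Because the bit-field layout lives in the selected descriptor, the code after this step does not need to know which table the descriptor came from.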

Categories Present in R_CUDA but Absent from R_MERCURY

R_MERCURY omits the following R_CUDA relocation categories entirely:

| R_CUDA Category | Count | Reason for Omission |
|---|---|---|
| R_CUDA_ABS*_<bitpos> (instruction) | 25 | Mercury uses position-independent R_MERCURY_ABS* |
| R_CUDA_PCREL_IMM24_* | 2 | Replaced by R_MERCURY_PROG_REL32/64 |
| R_CUDA_CONST_FIELD* | 10 | Constant bank references handled differently in Mercury |
| R_CUDA_TEX_BINDLESSOFF* / R_CUDA_BINDLESSOFF* | 7 | Bindless mechanism redesigned for Mercury ISA |
| R_CUDA_TEX_SLOT / SAMP_SLOT / SURF_SLOT | 3 | Mercury uses header index only, no slot numbers |
| R_CUDA_SURF_HW_DESC / SURF_HW_SW_DESC | 2 | Hardware descriptor types eliminated |
| R_CUDA_INSTRUCTION64 / INSTRUCTION128 | 2 | Whole-instruction replacement not used in Mercury |
| R_CUDA_YIELD_* | 2 | YIELD conversion handled differently (or absent) |
| R_CUDA_UNUSED_CLEAR32 | 1 | Mercury only has 64-bit clear |
| R_CUDA_QUERY_DESC21_37 | 1 | Query descriptor mechanism absent |
| R_CUDA_6_31 / R_CUDA_2_47 | 2 | Narrow-field types not needed |
| R_CUDA_G32 | 1 | Mercury uses 64-bit global only |
| R_CUDA_FUNC_DESC_32 / FUNC_DESC32_* | 10 | Mercury uses 64-bit descriptor + byte-level only |

Categories Present in R_MERCURY but Absent from R_CUDA

| R_MERCURY Category | Count | Description |
|---|---|---|
| R_MERCURY_PROG_REL64 | 1 | 64-bit PC-relative (CUDA has only 24-bit) |
| R_MERCURY_PROG_REL32_LO/HI | 2 | Split PC-relative (no CUDA equivalent) |
| R_MERCURY_PROG_REL8_* | 8 | Byte-level PC-relative (entirely new) |
| R_MERCURY_ABS_PROG_REL* | 4 | Hybrid absolute/PC-relative (entirely new) |

Most Common Mercury Relocations

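Based on the type semantics and their positions in the table, the following relocations are expected to appear most frequently in typical Mercury binaries. The frequency ratings are inferred from those semantics rather than counted from a corpus of binaries.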

Kernel Code Relocations

| Relocation | Typical Use | Frequency |
|---|---|---|
| R_MERCURY_PROG_REL32 (8) | Branch and call targets within a kernel | Very high -- every BRA/CALL/BRX instruction |
| R_MERCURY_ABS32 (3) | Immediate constant addresses in instruction fields | High -- memory access instructions |
| R_MERCURY_ABS32_LO / ABS32_HI (5, 6) | Split 32-bit address loaded in two instructions (LO/HI pair) | High -- MOV32I + IADD3.X address materialization |
| R_MERCURY_G64 (1) | Global variable references | Medium -- global load/store |
| R_MERCURY_TEX_HEADER_INDEX (11) | Texture fetch instructions | Medium -- in texture-heavy kernels |
| R_MERCURY_FUNC_DESC_64 (15) | Indirect call targets and virtual function pointers | Low -- only in kernels using function pointers |

Data Section Relocations

| Relocation | Typical Use | Frequency |
|---|---|---|
| R_MERCURY_8_0 through 8_56 (16--23) | Patching individual bytes in constant bank initializers | High -- one per byte of initialized pointer |
| R_MERCURY_G8_0 through G8_56 (24--31) | Global address bytes in descriptor tables | Medium |
| R_MERCURY_FUNC_DESC_8_0 through 8_56 (32--39) | Function pointer bytes in vtable-like structures | Low |
| R_MERCURY_ABS64 (2) | Full 64-bit address in data section | Medium |
| R_MERCURY_UNUSED_CLEAR64 (14) | Zeroing unused descriptor entries after merge | Medium |

Unified Table Relocations

| Relocation | Typical Use | Frequency |
|---|---|---|
| R_MERCURY_UNIFIED_8_0 through 8_56 (52--59) | UFT/UDT table entries for dynamic parallelism | Low -- only with CDP |
| R_MERCURY_UNIFIED_32 (51) | 32-bit UFT/UDT offset | Low |
| R_MERCURY_UNIFIED (50) | Marker resolved to NONE before application | Low |

Worked Example: Branch Relocation

A Mercury BRA instruction targeting function _Z6kernelPf generates an R_MERCURY_PROG_REL32 relocation:

Section .rela.text, offset 0x180:
  r_offset = 0x00000050    (byte offset of the BRA instruction in .text)
  r_info   = 0x0000000300010008  (symbol index 3 in the high 32 bits, type 0x10008 = R_MERCURY_PROG_REL32 in the low 32 bits)
  r_addend = 0x00000000

The linker resolves:
  S = symbol_value(_Z6kernelPf) = 0x00000200
  A = r_addend = 0
  PC = section_base + r_offset = 0x0 + 0x50 = 0x00000050
  result = (S + A) - PC = 0x200 - 0x50 = 0x1B0

The descriptor for type 8 (PROG_REL32) specifies:
  Action 0: bit_offset=X, bit_width=32, action_type=0x10 (pc_rel)
  The engine writes 0x1B0 into bits [X:X+32) of the 128-bit instruction at offset 0x50.
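
The arithmetic in this example can be reproduced with a short sketch. It assumes the standard Elf64_Rela packing of r_info (symbol index in the high 32 bits, type in the low 32 bits). The helper names `split_r_info` and `splice_bits` are illustrative, and `HYPOTHETICAL_BIT_OFFSET` is a placeholder for the unknown X, not a value taken from the descriptor table.

```python
MERCURY_TYPE_BASE = 0x10000


def split_r_info(r_info: int) -> tuple:
    """Split an Elf64 r_info into (symbol index, relocation type)."""
    return r_info >> 32, r_info & 0xFFFFFFFF


def splice_bits(word: int, value: int, bit_offset: int, bit_width: int) -> int:
    """Overwrite bits [bit_offset, bit_offset + bit_width) of word with value."""
    mask = ((1 << bit_width) - 1) << bit_offset
    # Masking after the shift truncates value to bit_width bits, which also
    # yields the two's-complement encoding for negative displacements.
    return (word & ~mask) | ((value << bit_offset) & mask)


sym_index, r_type = split_r_info(0x0000000300010008)
type_index = r_type - MERCURY_TYPE_BASE          # 8 = R_MERCURY_PROG_REL32

S, A, PC = 0x200, 0x0, 0x50                      # values from the example
result = (S + A) - PC

HYPOTHETICAL_BIT_OFFSET = 32                     # stand-in for X (assumption)
instruction = 0                                  # 128-bit word, zeroed for the demo
patched = splice_bits(instruction, result, HYPOTHETICAL_BIT_OFFSET, 32)

print(sym_index, type_index, hex(result), hex(patched))
```

A backward branch would produce a negative result; the mask in `splice_bits` is what keeps it inside the 32-bit field.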

Worked Example: Byte-Level Data Patching

A global variable address 0x00007FFF12345678 must be written byte-by-byte into a descriptor table entry at offset 0x40:

8 relocation entries, one per byte:
  R_MERCURY_G8_0   at offset 0x40: writes 0x78 (byte 0)
  R_MERCURY_G8_8   at offset 0x41: writes 0x56 (byte 1)
  R_MERCURY_G8_16  at offset 0x42: writes 0x34 (byte 2)
  R_MERCURY_G8_24  at offset 0x43: writes 0x12 (byte 3)
  R_MERCURY_G8_32  at offset 0x44: writes 0xFF (byte 4)
  R_MERCURY_G8_40  at offset 0x45: writes 0x7F (byte 5)
  R_MERCURY_G8_48  at offset 0x46: writes 0x00 (byte 6)
  R_MERCURY_G8_56  at offset 0x47: writes 0x00 (byte 7)

Each relocation uses a masked_shift action:
  G8_0 -> action_type=0x16, mask=0x00000000000000FF, shift=0
  G8_8 -> action_type=0x17, mask=0x000000000000FF00, shift=8
  ...
  G8_56 -> action_type=0x1D, mask=0xFF00000000000000, shift=56
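
The eight patches above follow one pattern, sketched below. The mask, shift, and action-type progression (0x16 through 0x1D) are taken from the listing; the function name `byte_relocs` and the dictionary layout are illustrative only.

```python
ADDRESS = 0x00007FFF12345678
BASE_OFFSET = 0x40
FIRST_ACTION_TYPE = 0x16      # action type listed for G8_0


def byte_relocs(value: int, base_offset: int) -> list:
    """Describe the eight byte-level patches for a 64-bit value."""
    patches = []
    for n in range(8):
        shift = 8 * n
        mask = 0xFF << shift
        patches.append({
            "name": f"R_MERCURY_G8_{shift}",
            "offset": base_offset + n,
            "action_type": FIRST_ACTION_TYPE + n,
            "mask": mask,
            "shift": shift,
            "byte": (value & mask) >> shift,   # the masked_shift computation
        })
    return patches


patches = byte_relocs(ADDRESS, BASE_OFFSET)
for p in patches:
    print(f'{p["name"]:<16} offset={p["offset"]:#x} writes {p["byte"]:#04x}')
```

Concatenating the eight bytes in offset order gives the little-endian encoding of the address.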

ELF Attribute Relocations

Mercury attribute relocations use the same 0x10000 offset mechanism within the attribute relocation table at off_1D371E0. When the relocation engine encounters a type >= 0x10000 in an attribute section (.nv.info.*), it subtracts 0x10000 and indexes into this table.

The attribute table has 65 entries (indices 0--64), validated with the limit check type_index >= 0x41 in sub_42F6C0. The validation function at sub_42F760 handles attribute-specific compatibility with a three-way dispatch:

// sub_42F760 at 0x42F760 -- attribute validation
if (type_index > 0x60)       // > 96 -- guard against out-of-bounds
    error("unknown attribute");

if (dword_1D37D68[4 * type_index] > target_arch_class) {
    switch (dword_1D37D68[4 * type_index + 1]) {   // usage field
    case 0: warning("Attribute %s not supported on %s", name, class_name); break;
    case 1: error("Attribute %s not supported on %s", name, class_name); break;
    case 2: /* silent ignore */ break;
    default: error("unknown usage"); break;
    }
}

The EIATTR types specific to Mercury are:

| Attribute | Address | Description |
|---|---|---|
| EIATTR_MERCURY_ISA_VERSION | 0x1D36F31 | Mercury ISA version embedded in .nv.info |
| EIATTR_MERCURY_FINALIZER_OPTIONS | 0x1D37170 | Options passed to the Mercury finalizer (fnlzr) |

And the compatibility attributes:

| Attribute | Address | Description |
|---|---|---|
| EICOMPAT_ATTR_MERCURY_ISA_MAJOR_MINOR_VERSION | 0x245EF08 | Major.minor ISA version for compatibility checking |
| EICOMPAT_ATTR_MERCURY_ISA_PATCH_VERSION | 0x245EF38 | Patch-level ISA version |

These attributes are not relocation types but are processed alongside relocations during the .nv.info section handling phase. The EIATTR_MERCURY_FINALIZER_OPTIONS attribute is particularly important because it controls how the FNLZR processes relocations -- it can enable or disable specific finalization passes that affect relocation resolution.

Summary Table by Category

| Category | Indices | Count | Computation | Engine Action Types |
|---|---|---|---|---|
| Sentinel | 0, 64 | 2 | -- | 0x00 (end) |
| Global data | 1 | 1 | S + A | 0x01 (abs_full) |
| Absolute data | 2--6 | 5 | S + A (full/lo/hi) | 0x01, 0x06, 0x07 |
| PC-relative | 7--10 | 4 | (S + A) - PC (full/lo/hi) | 0x10 (pc_rel), 0x37, 0x38 |
| Texture/sampler/surface | 11--13 | 3 | Header index lookup | 0x01 with index computation |
| Clear | 14 | 1 | Write zeros | 0x13/0x14 (clear) |
| Function descriptor | 15 | 1 | S + A | 0x01 (abs_full) |
| Byte-level | 16--23 | 8 | byte_n(S + A) | 0x16--0x1D (masked_shift) |
| Global byte-level | 24--31 | 8 | byte_n(S + A) | 0x16--0x1D (masked_shift) |
| Func desc byte-level | 32--39 | 8 | byte_n(S + A) | 0x16--0x1D (masked_shift) |
| Abs PC-relative (split) | 40--41 | 2 | abs(S + A - PC) lo/hi | 0x37, 0x38 |
| PC-relative byte-level | 42--49 | 8 | byte_n((S + A) - PC) | 0x2F--0x36 (masked_shift) |
| Unified table | 50--59, 62--63 | 12 | S + A (full/lo/hi/byte) | 0x12, 0x2E, 0x06, 0x07 |
| Abs PC-relative (full) | 60--61 | 2 | abs(S + A - PC) (32/64) | 0x01 with abs computation |
| Total | -- | 65 | -- | -- |
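
As a consistency check on the summary table, the sketch below re-enters its index ranges and counts and verifies that they cover indices 0--64 exactly once. The data is transcribed from the table; the `SUMMARY` structure and helper name are illustrative.

```python
# (category, inclusive index ranges, count column from the summary table)
SUMMARY = [
    ("Sentinel",                [(0, 0), (64, 64)],   2),
    ("Global data",             [(1, 1)],             1),
    ("Absolute data",           [(2, 6)],             5),
    ("PC-relative",             [(7, 10)],            4),
    ("Texture/sampler/surface", [(11, 13)],           3),
    ("Clear",                   [(14, 14)],           1),
    ("Function descriptor",     [(15, 15)],           1),
    ("Byte-level",              [(16, 23)],           8),
    ("Global byte-level",       [(24, 31)],           8),
    ("Func desc byte-level",    [(32, 39)],           8),
    ("Abs PC-relative (split)", [(40, 41)],           2),
    ("PC-relative byte-level",  [(42, 49)],           8),
    ("Unified table",           [(50, 59), (62, 63)], 12),
    ("Abs PC-relative (full)",  [(60, 61)],           2),
]


def expand(ranges: list) -> list:
    """Expand inclusive (lo, hi) ranges into a flat list of indices."""
    return [i for lo, hi in ranges for i in range(lo, hi + 1)]


covered = []
for name, ranges, count in SUMMARY:
    indices = expand(ranges)
    if len(indices) != count:
        raise ValueError(f"{name}: table says {count}, ranges give {len(indices)}")
    covered.extend(indices)

total = len(covered)
is_partition = sorted(covered) == list(range(65))
print(total, is_partition)   # prints: 65 True
```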

Function Addresses

| Function | Address | Size | Description |
|---|---|---|---|
| sub_468760 | 0x468760 | 14,322 bytes | Relocation application engine (shared CUDA/Mercury) |
| sub_469D60 | 0x469D60 | ~4,800 bytes | Relocation phase dispatcher (table selection, symbol resolution) |
| sub_469B50 | 0x469B50 | ~600 bytes | Relocation record builder (linked-list entry creation) |
| sub_42F6C0 | 0x42F6C0 | 916 bytes | Relocation type validation (shared CUDA/Mercury) |
| sub_42F760 | 0x42F760 | 1,367 bytes | Attribute relocation validation (three-way dispatch) |
| sub_468670 | 0x468670 | 882 bytes | Bitfield extraction helper (recursive multi-word) |
| sub_4685B0 | 0x4685B0 | 961 bytes | Bitfield write helper (multi-word splice) |
| sub_4683D0 | 0x4683D0 | 536 bytes | Descriptor table swap (save/restore table pointers) |
| sub_4275C0 | 0x4275C0 | 3,989 bytes | FNLZR front-end dispatcher (pre/post-link mode) |
| sub_4748F0 | 0x4748F0 | 48,730 bytes | FNLZR core engine (10-phase pipeline) |
| sub_4AC380 | 0x4AC380 | ~2,000 bytes | Capmerc CLI option registration and self-check setup |
| sub_45E7D0 | 0x45E7D0 | 89,156 bytes | Merge function (Mercury section skip logic) |

Confidence Assessment

| Claim | Rating | Evidence |
|---|---|---|
| 65 unique R_MERCURY type names (indices 0--64) | HIGH | 71 R_MERCURY strings in nvlink_strings.json (65 unique names + 2 trailing-space duplicates + 4 EIATTR/EICOMPAT attributes). Exact count verified by string scan. |
| Name table at off_1D371E0 (65 entries, addr range 0x1D35A17--0x1D35F4C) | HIGH | All 65 type name strings verified at addresses within stated range. Xref from R_MERCURY_NONE at 0x1D35A17 confirms 0x1D371E0 is a pointer into this table. |
| Descriptor table at off_1D3CBE0 (65 entries, 64 bytes each = 4,160 bytes) | HIGH | Table address off_1D3CBE0 verified from decompiled sub_469D60 (line 202: v152 = &off_1D3CBE0) and sub_469B50 (line 76: v35 = &off_1D3CBE0). 64-byte entry size confirmed from type_index << 6 indexing in sub_468760. |
| ELF type encoding: table index + 0x10000 | HIGH | Verified from decompiled sub_469D60 at lines 197--203: v148 = v9 - 0x10000 with guard v8 <= 0x10000 producing fatal("unexpected reloc"). |
| ELF class byte 0x41 ('A') distinguishes Mercury from CUDA | HIGH | Verified from decompiled sub_469D60 (line 190: *(_BYTE *)(v2 + 7) != 65), sub_469B50 (line 40: *(_BYTE *)(a1 + 7) == 65), and sub_4275C0 (line 40: *(_BYTE *)(v14 + 7) != 65). |
| Application engine sub_468760 shared with R_CUDA | HIGH | Decompiled file sub_468760_0x468760.c (14,322 bytes). Called from sub_469D60 line 414 with both off_1D3CBE0 (Mercury) and off_1D3DBE0 (CUDA) tables. |
| Relocation phase dispatcher sub_469D60 performs table selection | HIGH | Decompiled code explicitly shows Mercury path (lines 196--208) selecting off_1D3CBE0 and subtracting 0x10000, vs CUDA path (lines 212--214) selecting off_1D3DBE0. |
| Record builder sub_469B50 validates and creates reloc entries | HIGH | Decompiled code at lines 40--76 shows ELF class check, sub_42F6C0 call, Mercury index normalization (v9 - 0x10000), and descriptor table action-type inspection. |
| Descriptor format: 12-byte header + 3x 16-byte actions + 4-byte sentinel | HIGH | 64-byte entry size verified from << 6 shift in sub_468760. Action pointer starts at +12 (line 130: v15 = (v12 + 12)) and sentinel at +60 (line 132: v100 = (v12 + 60)). Three action slots of 4 x uint32 = 16 bytes each confirmed by v15 += 4 increment per action. |
| Action type dispatch (0x00, 0x01, 0x06, 0x07, 0x10, 0x13, 0x16--0x1D, etc.) | HIGH | All action type codes verified from the switch statement in sub_468760 (lines 137--580). Case labels match the documented table exactly. |
| R_MERCURY vs R_CUDA comparison table | HIGH | Type-by-type comparison verified from both name tables in nvlink_strings.json. Semantic equivalence claims based on identical descriptor action patterns for matching types. |
| Categories omitted from R_MERCURY (25 instruction-specific, CONST_FIELD, bindless, etc.) | HIGH | Absence verified by complete enumeration of all 65 R_MERCURY names. No R_MERCURY_CONST_FIELD, R_MERCURY_BINDLESS, R_MERCURY_INSTRUCTION64/128, etc. exist in the string table. |
| Mercury-specific types: PROG_REL64, PROG_REL32_LO/HI, PROG_REL8_*, ABS_PROG_REL* | HIGH | All type names verified in nvlink_strings.json. No corresponding R_CUDA_PROG_REL64, R_CUDA_PROG_REL8_*, or R_CUDA_ABS_PROG_REL* exist. |
| Trailing-space variants at 0x1D3CB71 and 0x1D3CB88 | HIGH | Two strings with trailing spaces verified in string table at exact addresses. Located immediately before the Mercury descriptor table (both addresses are below off_1D3CBE0). |
| UFT_OFFSET symbol string comparison at sub_469D60 lines 377--395 | HIGH | Decompiled code shows 13-byte strncmp against "__UFT_OFFSET" with debug trace "ignore reloc on UFT_OFFSET\n". |
| PC-relative same-section validation at sub_469D60 lines 409--410 | HIGH | Decompiled code: descriptor[5] == 16 (action_type pc_rel) triggers "PC relative branch address should be in the same section". |
| Dead-function reloc elimination trace | HIGH | Decompiled code at sub_469D60 line 349: fprintf(stderr, "ignore reloc on dead func %s\n", ...). |
| YIELD conversion suppression for forward-progress | HIGH | Decompiled code at sub_469D60 lines 495--504: checks context + 94 flag, emits 80-byte YIELD message. |
| Self-check strings at 0x2458F38--0x2458FE8 | HIGH | All 4 self-check strings verified at exact addresses in nvlink_strings.json with xrefs to 0x24590A0--0x24590B8 pointer array. |
| Third name table copy at 0x2459160 | HIGH | R_MERCURY_NONE xref list includes 0x2459160 alongside 0x1D371E0 and 0x1D3CBE0. Located in capmerc uplift code region. |
| MERCSW-125 Jira reference | HIGH | String at 0x1F44288 contains full text including "Jira confluence page 'MERCSW-125'". |
| Attribute validation function sub_42F760 three-way dispatch | HIGH | Decompiled code shows dword_1D37D68[4 * a1 + 1] value 0=warning, 1=error, 2=ignore, with guard a1 > 0x60. |
| sub_4AC380 capmerc option registration | HIGH | Decompiled code shows sub_42F130 calls registering --self-check, --binary-kind, --cap-merc, --out-sass, --compile-as-at-entry-patch, --opportunistic-finalization-lvl. |
| R_MERCURY_UNIFIED32_LO/HI added later than main unified block | LOW | Inferred from non-contiguous index placement (indices 62--63 vs 50--59). No version history available to confirm chronology. |
| R_MERCURY_ABS_PROG_REL32_LO/HI added before full-width variants | LOW | Same reasoning: non-contiguous indices (40--41 vs 60--61). Chronological ordering is speculative. |
| Action types not used by Mercury (0x08, 0x0A, 0x0B) | MEDIUM | Inferred from the absence of CONST_FIELD and symbol-size relocation types in R_MERCURY. Not verified by exhaustive descriptor table dumping. |
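
The 64-byte descriptor layout cited in the evidence above (12-byte header at +0, three 16-byte action slots from +12, 4-byte sentinel at +60) can be sketched as a parser. Several things here are assumptions rather than verified facts: little-endian uint32 words, the function name `parse_descriptor`, and the field order inside an action slot (bit_offset, bit_width, action_type, one unnamed word). That order is inferred from the descriptor[5] == 16 check and the fields named in the branch example. The sample descriptor is synthetic, not dumped from the binary.

```python
import struct

DESCRIPTOR_SIZE = 64
HEADER_WORDS = 3      # 12-byte header
ACTION_SLOTS = 3
ACTION_WORDS = 4      # 4 x uint32 = 16 bytes per action slot


def parse_descriptor(raw: bytes) -> dict:
    """Split one 64-byte descriptor into header, action slots, and sentinel."""
    if len(raw) != DESCRIPTOR_SIZE:
        raise ValueError(f"expected {DESCRIPTOR_SIZE} bytes, got {len(raw)}")
    words = struct.unpack("<16I", raw)            # assumed little-endian
    actions = []
    for slot in range(ACTION_SLOTS):
        start = HEADER_WORDS + slot * ACTION_WORDS
        bit_offset, bit_width, action_type, extra = words[start:start + ACTION_WORDS]
        actions.append({
            "bit_offset": bit_offset,             # assumed field order
            "bit_width": bit_width,
            "action_type": action_type,
            "extra": extra,                       # meaning unknown
        })
    return {"header": words[:HEADER_WORDS], "actions": actions, "sentinel": words[15]}


# Synthetic descriptor: one pc_rel action (0x10), 32 bits wide, at a
# placeholder bit offset of 32.
sample_words = [0] * 16
sample_words[3:6] = [32, 32, 0x10]
parsed = parse_descriptor(struct.pack("<16I", *sample_words))
print(parsed["actions"][0])
```

With this layout, word index 5 of the descriptor is the action type of the first slot, which matches the descriptor[5] == 16 test used for the same-section check.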

Cross-References

Sibling Wikis