Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Function Map

Address-to-identity lookup for ~160 key functions across 20 subsystems in nvlink v13.0.88 (~37 MB, 40,532 total functions). Confidence: VERY HIGH = string/symbol evidence, HIGH = strong structural evidence, MEDIUM = inferred from context/callgraph.

Binary: /usr/local/cuda-13.0/bin/nvlink SHA256: see versions Entry point: 0x409800 (main)

Binary composition: Approximately 95% of the binary is embedded ptxas compiler backend (ISel, register allocation, scheduling, instruction encoding). Only about 5% is linker logic (~1,900 functions in 0x400000--0x4C0000). Of the 40,532 total functions, roughly 38,000 belong to the statically linked ptxas compiler -- these are documented in the ptxas wiki function-map. This page focuses on the ~2,000 linker-specific and infrastructure functions, plus the embedded-ptxas interfaces visible from the linker side.

See Binary Layout for the full address-range breakdown and composition table.


Top 20 Most-Called Functions

Functions with the highest cross-reference count in the binary. These form the backbone of every subsystem.

#AddressDecompiledProposed NameCallersSizeRole
10x530FB0sub_530FB0IRNode_GetOperand31,39916BReturn pointer to operand at index (32-byte stride)
20xA49150sub_A49150NVInst_getOperandField30,76860BQuery instruction attribute by field ID; dispatches to sub_A7DE70 + sub_A709F0
30x4307C0sub_4307C0arena_alloc~10K+10.7KBThread-safe arena allocator (small-block free-list + large-block pages)
40x431000sub_431000arena_free~10K+4.7KBArena deallocator, returns blocks to size-class free-lists
50x530E90sub_530E90IROperand_IsRegister~5K+16Breturn type_tag == 2
60x530FC0sub_530FC0IRNode_GetNumSrcOperands~5K+16Btotal_ops + 1 - first_src_index
70x530FD0sub_530FD0IRNode_GetNumDstOperands~5K+16Breturn *(a1 + 92)
80x530EA0sub_530EA0IROperand_IsImmediate~3K+16Breturn type_tag == 1
90x530E80sub_530E80IRNode_GetRegClass~3K+16BIdentity function / unsigned int extract
100xA50D10sub_A50D10encode_GPR~3K+tinyEncode register number for destination field
110x467460sub_467460error_emit~2K+~2KBVariadic error emission (dispatches to sub_467A70)
120x448360sub_448360elfw_get_section_header~2K+<2KBSection header accessor
130x44F410sub_44F410arena_get_metadata~2K+<2KBLook up allocation metadata for a pointer
140x45CAC0sub_45CAC0oom_handler~1K+tinyOut-of-memory handler, calls abort path
150x45CAE0sub_45CAE0arena_assert~1K+tinyArena validity assertion
160x4C28B0sub_4C28B0setBitfield~1K+smallCore bitfield insertion into instruction word
170x50C790sub_50C790getReuse~1K+smallRead 1-bit reuse flag from encoded instruction
180x530F80sub_530F80IRNode_GetDataType~1K+16BIdentity function for data type field at +20
190x4489C0sub_4489C0hash_table_create~500+smallCreate hash table for option/symbol lookup
200x464460sub_464460linked_list_append~500+smallAppend node to singly-linked list

1. Entry & CLI

Main Program Flow

AddressDecompiledProposed NameSizeConfidenceDescription
0x409800mainmain58KBVERY HIGHComplete nvlink entry point. Parses options, dispatches by input type (cubin/ptx/fatbin/nvvm/ltoir/bc/.o/.so/.a), drives merge/layout/relocate/finalize/write pipeline, handles Mercury post-link, host linker script generation.
0x427AE0sub_427AE0nvlink_parse_options30KBVERY HIGHRegisters ~60 CLI options via sub_42F130, extracts all into globals. Validates arch (sm > 19), Mercury mode (sm > 99), LTO constraints. String evidence: "suppress-stack-size-warning", "suppress-arch-warning".
0x4275C0sub_4275C0post_link_transform4KBVERY HIGHFNLZR (Finalizer) entry for Mercury/SASS. String evidence: "FNLZR: Input ELF: %s", "FNLZR: Pre-Link Mode".
0x45CCD0sub_45CCD0timing_starttinyHIGHBegin profiling timer for phase tracing.
0x4279C0sub_4279C0trace_phasetinyHIGHDebug trace with phase names: "init", "read", "merge", "layout", "relocate", "finalize", "write".

Details: Pipeline Entry, Pipeline Overview, Mode Dispatch

Command-Line Option Parsing

AddressDecompiledProposed NameSizeConfidenceDescription
0x42DFE0sub_42DFE0option_parser_create4.5KBHIGHAllocates 56-byte parser struct, creates two hash tables for option lookup.
0x42F130sub_42F130option_register4.9KBHIGHRegisters single option (120-byte entry): name, short name, type (1=bool/2=string/4=int/0=file), multiplicity, default, help. Called ~60 times from sub_427AE0.
0x42E5A0sub_42E5A0option_parse_argv9.5KBHIGHIterates argv, matches against registered options, handles --, =, response files @file.
0x42E390sub_42E390option_get_value2.9KBHIGHExtracts parsed option value (1/4/8 byte) into destination variable. Called ~80 times.
0x42D700sub_42D700option_format_help5.6KBMEDIUMFormats single option help entry with defaults, keywords, allowed values.
0x42DBC0sub_42DBC0option_validate_value5.1KBMEDIUMValidates option value against type constraints ("32-bit integer", "64-bit hex", etc.).

Details: CLI Options, CLI Flags

Static Initialization (Constructors)

AddressDecompiledProposed NameConfidenceDescription
0x40C4F0ctor_001ctor_thread_infraVERY HIGHpthread_key_create, mutex init, scheduler priority range. Sets up TLS infrastructure.
0x40C5C0ctor_002ctor_002HIGHAdditional initialization (registered after ctor_001).
0x410830ctor_003ctor_003HIGHRegisters atexit handler via __cxa_atexit.
0x410850ctor_004ctor_004HIGHPaired with ctor_003.
0x412750ctor_005ctor_knob_tableHIGHInitializes knob storage array via sub_44F670.
0x412790ctor_006ctor_006HIGHPaired with ctor_005.
0x426260ctor_008ctor_version_constantsVERY HIGHSets version constants: qword_2A74108 = 0x60000000, qword_2A74100 = 0x60000001.
0x426280ctor_009ctor_009HIGHAdditional version/capability setup.
0x4262B0ctor_010ctor_010HIGHAdditional initialization.
0x426330ctor_011ctor_011HIGHAdditional initialization.

Details: Pipeline Entry, Binary Layout


2. Pipeline Phases

Merge Engine (init + read + merge)

AddressDecompiledProposed NameSizeConfidenceDescription
0x45E7D0sub_45E7D0merge_elf89KBVERY HIGHHeart of the linker. Iterates all sections of input ELF, merges symbols/relocations/data. Handles .nv.global, .nv.shared, .nv.constant, .nv.info, DWARF debug. 450+ locals.
0x45D180sub_45D180merge_weak_function26.8KBHIGHResolves weak function conflicts by comparing register counts and PTX versions.
0x426570sub_426570validate_arch_and_merge7.4KBHIGHValidates cubin architecture matches target ("compute_%d%c", "sm_%d%c").
0x432B10sub_432B10merge_overlapping_global11.7KBHIGHValidates overlapping symbol definitions in .nv.global contain identical data.
0x437E20sub_437E20merge_overlapping_local11.6KBHIGHSame pattern for .nv.local.* sections.
0x4343C0sub_4343C0merge_overlapping_constant11.8KBHIGHSame pattern for .nv.constant* sections.
0x4339A0sub_4339A0optimize_constant_dedup13.2KBHIGHDeduplicates constant values: "found duplicate value 0x%x, alias %s to %s". Handles 32-bit and 64-bit.
0x438640sub_438640merge_constant_bank_data4.0KBHIGHMerges data into constant memory banks. Validates bank SHT not CUDA_CONSTANT_?.

Details: Pipeline Merge, Section Merging, Weak Symbols, Data Layout Optimization

Layout Phase

AddressDecompiledProposed NameSizeConfidenceDescription
0x439830sub_439830shared_memory_layout66KBVERY HIGHAllocates and lays out all shared memory: global, extern, local, reserved. Computes overlapping set analysis. Handles .nv.reservedSmem.*.
0x436BD0sub_436BD0shared_memory_optimizer15.7KBHIGHBuilds interference graph, groups non-overlapping shared variables to reduce total shared memory usage.
0x438DD0sub_438DD0process_bindless_references12.8KBHIGHHandles bindless texture/surface relocations. Creates $NVLINKBINDLESSOFF_%s synthetic symbols.

Details: Pipeline Layout, Bindless Relocations

Relocate Phase

AddressDecompiledProposed NameSizeConfidenceDescription
0x469D60sub_469D60apply_relocations26.6KBVERY HIGHComplete relocation resolution. Handles __UFT_OFFSET, __UDT_OFFSET, __UFT_CANONICAL, __UDT, __UFT. Processes .nv.resolvedrela.
0x46ADC0sub_46ADC0emit_resolved_relocations11.5KBHIGHCreates .nv.resolvedrela section when --preserve-relocs.
0x459640sub_459640reloc_vtable_create16.1KBHIGHCreates 632-byte vtable with ~70 handler slots, dispatched per arch generation (sm30..sm100+).

Details: Pipeline Relocate, R_CUDA Relocations, R_MERCURY Catalog

Finalize Phase

AddressDecompiledProposed NameSizeConfidenceDescription
0x445000sub_445000finalize_elf56KBVERY HIGHFinal relocation application and ELF finalization. Architecture-specific relocation encodings, symbol address resolution, final section content generation.
0x451D80sub_451D80compute_entry_properties98KBHIGHLargest function in the linker core. Computes per-kernel register counts, stack sizes, barrier counts. Processes unified function tables. Propagates through callgraph. 500+ locals.
0x450ED0sub_450ED0propagate_register_counts16KBHIGHPropagates register/barrier counts from callees to callers. Creates EIATTR_NUM_BARRIERS.

Details: Pipeline Finalize, ELF Serialization

Write Phase (ELF Output)

AddressDecompiledProposed NameSizeConfidenceDescription
0x45C920sub_45C920write_elf_to_filesmallHIGHWrapper calling sub_45BF00 to serialize ELF to file.
0x45C950sub_45C950write_elf_to_memorysmallHIGHWrapper calling sub_45BF00 to serialize ELF to buffer.
0x45BF00sub_45BF00write_elf_to_buffer13.3KBHIGHSerializes ELF header, program headers, section headers, section data. Validates sizes.
0x45BAA0sub_45BAA0write_elf_sectionsmallHIGHWrites individual section data to output buffer at computed offset.

Details: Pipeline Output, ELF Writer


3. Input Processing

ELF Structure Management (elfw)

AddressDecompiledProposed NameSizeConfidenceDescription
0x4438F0sub_4438F0elfw_create14.8KBHIGHCreates ELF wrapper with sections: .note.nv.cuinfo, .note.nv.tkinfo, .shstrtab, .strtab, .symtab. Creates "elfw memory space" arena.
0x440740sub_440740elfw_add_section5.4KBHIGHAdds new section to ELF wrapper.
0x440BE0sub_440BE0elfw_add_section_with_data7.0KBHIGHAdds section with initial data copy.
0x441AC0sub_441AC0elfw_add_reloc_section9.5KBHIGHCreates .rela%s / .rel%s relocation sections.
0x442CA0sub_442CA0elfw_add_symbol7.2KBHIGHAdds symbol (STB_GLOBAL/WEAK/LOCAL), updates callgraph for function symbols.
0x442820sub_442820elfw_merge_symbols5.4KBHIGHMerges unified symbols including __cuda_uf_stub_ / .nv.uft stubs.
0x4411F0sub_4411F0elfw_copy_section12.2KBHIGHDeep copy of section data, symbols, relocations between elfw objects.
0x4478F0sub_4478F0elfw_dump_structure15.1KBHIGHDebug dump of ELF wrapper state: sections, symbols, relocations.
0x448E70sub_448E70elfw_section_table_build14.6KBMEDIUMRebuilds section header table, computes offsets/sizes for final layout.
0x4475B0sub_4475B0elfw_destroy3.0KBHIGHDestroys ELF wrapper and frees associated arena.

Details: ELF Parsing, Device ELF Format, NVIDIA Sections

Fatbin Extraction & Input Dispatch

AddressDecompiledProposed NameSizeConfidenceDescription
0x42AF40sub_42AF40extract_fatbin_member11.1KBHIGHExtracts object from fatbin. Dispatches by type: 1=ptx, 8=nvvm, 16=mercury/capmerc, default=cubin.
0x42A680sub_42A680register_module11.9KBHIGHRegisters linked module with module_id extracted from ELF via sub_46F0C0.
0x4876A0sub_4876A0archive_signature_check2.1KBHIGHChecks "!" and "!" signatures.
0x487C20sub_487C20archive_open2.5KBHIGHCreates archive context from buffer. Detects thin archives.
0x487E10sub_487E10archive_iterate_members5.6KBHIGHIterates archive members. Handles "__.LIBDEP", long names, thin archive resolution.
0x462620sub_462620path_split3.6KBHIGHSplits file path into directory, basename, extension.
0x42FCB0sub_42FCB0create_temp_file4.0KBHIGHCreates /tmpxft_PPPPPPPP_CCCCCCCC temporary files.

Details: Fatbin Extraction, File Type Detection, Cubin Loading

PTX Input Processing

AddressDecompiledProposed NameSizeConfidenceDescription
0x4CE8C0sub_4CE8C0ptx_version_parse_validate29KBHIGHParses .version directive, validates PTX version compatibility with target.
0x4CFBD0sub_4CFBD0ptx_obfuscation_transform27KBHIGHPTX obfuscation transformation pass. "PTX Obfuscation".

Details: PTX Input


4. Symbol Resolution & Callgraph

Symbol Resolution

AddressDecompiledProposed NameSizeConfidenceDescription
0x442CA0sub_442CA0elfw_add_symbol7.2KBHIGHAdds global symbol to ELF wrapper's symbol table. STB_GLOBAL/WEAK/LOCAL binding.
0x442820sub_442820elfw_merge_symbols5.4KBHIGHMerges unified symbols; handles __cuda_uf_stub_ and .nv.uft stubs.
0x4489C0sub_4489C0hash_table_createsmallHIGHCreates open-addressing hash table for symbol/option lookup.

Details: Symbol Resolution, Hash Tables

Callgraph & Dead Code Elimination

AddressDecompiledProposed NameSizeConfidenceDescription
0x44AD40sub_44AD40dead_code_elimination22.5KBHIGHDFS reachability on callgraph, removes unreachable functions and associated .nv.local/.nv.shared sections. Keeps address-taken functions.
0x44A5D0sub_44A5D0callgraph_detect_recursion14.4KBHIGHDFS-based recursion detection for stack size requirements.
0x44C030sub_44C030callgraph_traverse10.2KBHIGHPropagates stack sizes and register counts through call chains.
0x44CCF0sub_44CCF0callgraph_dump_dotsmallHIGHWrites Graphviz DOT format via digraph callgraph { %s -> %s; }.
0x44D200sub_44D200build_callgraph_section8.5KBHIGHGenerates .nv.callgraph section in output ELF.

Details: Dead Code Elimination

Unified Table (UDT/UFT) Management

AddressDecompiledProposed NameSizeConfidenceDescription
0x4637B0sub_4637B0uft_reorder_entries10.1KBHIGHReorders unified function/descriptor table entries. UUID-based mapping: "map uid <%llx,%llx> to key=%llx".
0x463F70sub_463F70uft_setup_sections4.0KBHIGHCreates/validates .nv.udt, .nv.uft, .nv.uft.entry, .nv.udt.entry.

Details: Unified Function Tables


5. Relocation Engine

AddressDecompiledProposed NameSizeConfidenceDescription
0x469D60sub_469D60apply_relocations26.6KBVERY HIGHComplete relocation resolution. Handles __UFT_OFFSET, __UDT_OFFSET, __UFT_CANONICAL, __UDT, __UFT. Processes .nv.resolvedrela.
0x46ADC0sub_46ADC0emit_resolved_relocations11.5KBHIGHCreates .nv.resolvedrela section when --preserve-relocs.
0x459640sub_459640reloc_vtable_create16.1KBHIGHCreates 632-byte vtable with ~70 handler slots, dispatched per arch generation (sm30..sm100+).
0x4AF3C0sub_4AF3C0hrk_section_process8.8KBHIGHProcesses .nvHRKE / .nvHRKI (Hash Relocation Key External/Internal).
0x4B02A0sub_4B02A0hrc_hrd_section_process16.3KBHIGHProcesses .nvHRCE / .nvHRCI / .nvHRDE / .nvHRDI (Hash Relocation Code/Data).

Details: R_CUDA Relocations, R_MERCURY Catalog, Bindless Relocations


6. LTO Integration

AddressDecompiledProposed NameSizeConfidenceDescription
0x4BC6F0sub_4BC6F0nvvm_compile_and_extract13.6KBVERY HIGHCalls libNVVM API: nvvmCompileProgram, nvvmGetCompiledResult, nvvmGetProgramLog, nvvmDestroyProgram. References --force-device-c.
0x4BC4A0sub_4BC4A0nvvm_api_wrapper_init2.5KBHIGHLoads libnvvm.so via dlopen, resolves nvvmCreateProgram and other API symbols via dlsym.
0x426CD0sub_426CD0lto_collect_ir_modules7.0KBMEDIUMCollects IR modules from input list for LTO compilation.
0x426AE0sub_426AE0lto_mark_used_symbols2.2KBMEDIUMMarks symbols as used for dead-code elimination with LTO. Calls sub_44AD40.
0x43FDB0sub_43FDB0thread_pool_createsmallHIGHCreates pthread thread pool for split-compile.
0x43FC80start_routinethread_worker_entrysmallVERY HIGHNamed symbol: start_routine. Thread pool worker entry point for parallel compilation tasks.
0x4264B0sub_4264B0split_compile_dispatchsmallHIGHDispatches compilation units to thread pool workers.

Details: LTO Overview, LibNVVM Integration, Split Compilation


7. Mercury / FNLZR

Finalization / JIT Pipeline

AddressDecompiledProposed NameSizeConfidenceDescription
0x4748F0sub_4748F0nvlink_link_and_finalize49KBHIGHTop-level 25-parameter entry point. Handles --binary-kind (mercury/capmerc/sass), processes compilation options, calls sub_471700. This is what nvcc/driver calls into.
0x471700sub_471700nvlink_finalize_object79KBHIGHCore finalization orchestrator. 460+ locals. Parses "deviceDebug", "lineInfo", "optLevel", "IsCompute", "IsPIC". Allocates 656-byte compilation unit descriptor. Builds compiler flags.
0x491410sub_491410compilation_unit_initialize65KBHIGHInitializes compilation unit for code generation. Copies architecture info, sets PIC flags, calls backend init via sub_A4C620.

Details: FNLZR Pipeline, Mercury Overview

MercExpand Engine

The "MercExpand" instruction expansion pass -- NVIDIA's custom ISel/lowering for Mercury (sm100+). Confirmed by string "After MercExpand" at 0x5FF15E.

AddressDecompiledProposed NameSizeConfidenceDescription
0x5FDDB0sub_5FDDB0MercExpand_Dispatch25.5KBHIGHMain entry. Switch on IR opcode type: 0=generic, 5/8/9=reg width clamp, 11=complex (shared mem / surface), 12=extended, -1=terminator. Checks attr 200==1107 for MOV special case.
0x5F38E0sub_5F38E0MercExpand_HandleInstruction35KBHIGHPer-instruction handler. Looks up 184-byte target descriptor, applies resource constraints, handles scheduling hints, 8 constraint categories.
0x5E8710sub_5E8710MercExpand_BuildFullCFGMaps54KBMEDIUMLargest MercExpand function. Builds 3 FNV-1a hash maps (offsets 632/648/664). Iterates all basic blocks.
0x5E7B90sub_5E7B90MercExpand_BuildNodeMaps24KBMEDIUMBuilds hash maps for all basic blocks with RPO arrays.
0x5EA250sub_5EA250CFG_DumpDOTGraph2KBHIGHGraphviz dump: digraph f {, bix%u, bix%d(L%x).
0x5EA4F0sub_5EA4F0MercExpand_InvalidateRegState4.3KBHIGHBumps 15+ generation counters, resets dirty flags. Maps to GPU register file partitions.
0x5FC6B0sub_5FC6B0MercExpand_ExpandMOV8.3KBMEDIUMMOV expansion. Creates target node with opcode 346, sets attribute 227=1233.
0x5FCE20sub_5FCE20MercExpand_ExpandRETURN19KBMEDIUMReturn/exit expansion. Creates nodes with opcode 270, attribute 118=519.
0x5F60E0sub_5F60E0IRTree_Walk19KBHIGHRecursive tree walker with pre/post callbacks. Manually unrolled to 5 nesting levels.
0x5F8B60sub_5F8B60MercExpand_ApplyResConstraints16KBHIGHRegister resource accounting. Switch on 52 register types (byte_1DFE340 lookup).

Details: Mercury Overview, Compiler Passes

Mercury Instruction Scheduling

AddressDecompiledProposed NameSizeConfidenceDescription
0x4A4DC0sub_4A4DC0merc_war_process24KBHIGHMercury WAR (Write-After-Read) dependency handler. "After MercWARs".
0x4A8690sub_4A8690merc_opex_expand67KBHIGHMercury operand expansion pass. "After MercOpex". Expands Mercury IR operands into final encoding form.

Details: Scheduling


8. ELF Output / Serialization

AddressDecompiledProposed NameSizeConfidenceDescription
0x4438F0sub_4438F0elfw_create14.8KBHIGHCreates ELF wrapper with initial sections. (Also listed under Input Processing.)
0x445000sub_445000finalize_elf56KBVERY HIGHFinal relocation application and ELF finalization. (Also listed under Finalize Phase.)
0x45BF00sub_45BF00write_elf_to_buffer13.3KBHIGHSerializes ELF header, program headers, section headers, section data. Validates sizes.
0x45C920sub_45C920write_elf_to_filesmallHIGHWrapper calling sub_45BF00 to serialize ELF to file.
0x45C950sub_45C950write_elf_to_memorysmallHIGHWrapper calling sub_45BF00 to serialize ELF to buffer.
0x448E70sub_448E70elfw_section_table_build14.6KBMEDIUMRebuilds section header table, computes offsets/sizes for final layout.

Details: Pipeline Output, ELF Serialization, ELF Writer


9. Infrastructure

Memory Arena / Allocator

AddressDecompiledProposed NameSizeConfidenceDescription
0x4307C0sub_4307C0arena_alloc10.7KBHIGHThread-safe (per-arena mutex). Small path: 625 free-list buckets at arena+2128 (8-byte aligned). Large path: page pool. Falls back to mmap via sub_44ED60.
0x431000sub_431000arena_free4.7KBHIGHReturns small blocks to free-list, large blocks to page pool. Checks byte_2A5BAD0 debug flag.
0x432020sub_432020arena_create_named2.2KBHIGHCreates named arena. Called with "nvlink option parser", "nvlink memory space".
0x431C70sub_431C70arena_destroy3.6KBHIGHOptionally merges free-lists back into parent arena, or frees all pages via sub_431EC0.
0x431770sub_431770arena_dump_stats8.5KBHIGHPrints detailed arena statistics: page counts, block sizes, usage.
0x4882A0sub_4882A0ocg_memspace_alloc2.5KBHIGHOCG (On-Chip-Gen) slab/segregated-freelist allocator. 128 size-class buckets, 1MB page allocations.
0x489140sub_489140memspace_statistics_print4.4KBHIGHPrints "Memory space statistics for 'OCG mem space'".
0x45CAC0sub_45CAC0oom_handlertinyHIGHOut-of-memory handler, calls abort path.
0x45CAE0sub_45CAE0arena_asserttinyHIGHArena validity assertion.
0x44F410sub_44F410arena_get_metadata<2KBHIGHLook up allocation metadata for a pointer.

Details: Memory Arenas

Diagnostics / Error Reporting

AddressDecompiledProposed NameSizeConfidenceDescription
0x467460sub_467460error_emit~2KBVERY HIGHVariadic error emission entry. First arg is always &unk_2A5Bxxx (error descriptor table entry). Dispatches to sub_467A70.
0x467A70sub_467A70diagnostic_report13.1KBHIGHFormats and emits diagnostics with severity prefixes: "warning ", "info ", "error ", "error* ", "fatal ". Location format: "%s, line %d; ". Handles suppression and warning-as-error.
0x4B9E70sub_4B9E70allocation_failure_handler5.1KBHIGH"An allocation failure occurred; heap memory may be exhausted." Also handles "Multiple errors:".
0x4BC290sub_4BC290elflink_error_handler2.5KBHIGH"elfLink: unexpected error". Error wrapper for ELF linking subsystem.

Details: Error Reporting, Elflink Errors

Threading

AddressDecompiledProposed NameSizeConfidenceDescription
0x43FDB0sub_43FDB0thread_pool_createsmallHIGHCreates pthread thread pool for split-compile.
0x43FC80start_routinethread_worker_entrysmallVERY HIGHNamed symbol. Thread pool worker entry point.
0x44F260destr_functiontls_destructorsmallVERY HIGHNamed symbol. pthread TLS destructor for arena cleanup.
0x44EF80funcatexit_cleanupsmallVERY HIGHNamed symbol. Registered via __cxa_atexit for process exit cleanup.

Details: Thread Pool

Compression (LZ4)

AddressDecompiledProposed NameSizeConfidenceDescription
0x46EE00sub_46EE00LZ4_decompress_safe_extDict81KBHIGHLZ4 decompression with external dictionary. SSE2 copy optimization.
0x46C690sub_46C690LZ4_decompress_safe20KBHIGHBasic LZ4 safe decompression (no dictionary).
0x46FD50sub_46FD50LZ4_compress13.7KBHIGHLZ4 compression with hash table match finding.

Knobs / Configuration System

AddressDecompiledProposed NameSizeConfidenceDescription
0x49B1A0sub_49B1A0knobs_file_read_parse59KBHIGHReads knobsfile, parses "[knobs]" section header, processes key=value pairs. Source: generic_knobs_impl.h.
0x49D8A0sub_49D8A0parse_knob_value24KBHIGHParses single knob value: integer, integer_range, integer_list, double, float, opcode, when-string, value_pair_list.
0x49A0C0sub_49A0C0knob_decode_and_apply14KBMEDIUMDecodes and applies knobs at pipeline stages: "After Decode", "After Expansion", "After WAR post-expansion", "After Opex".
0x498FE0sub_498FE0knob_inject_string8.7KBHIGHInjects string value into knob system. "Invalid knob specified (%s)".

GPU Architecture Profiles

AddressDecompiledProposed NameSizeConfidenceDescription
0x484F50sub_484F50arch_profile_database_init54KBVERY HIGHRegisters all GPU architectures: sm_75 (Turing) through sm_121 (DGX Spark). Creates real/virtual/lto profiles. Sets capability vectors via XMM constants. Hash map at qword_2A5F8D8. Notable: sm_88 appears (new Ampere variant). "f" variants = forward-compatible.
0x486FF0sub_486FF0architecture_parse_name2.7KBHIGHParses "sm_%2d%s", "compute_%2d%s", "sass_%2d%s" to numeric arch ID.
0x487220sub_487220architecture_name_format2.4KBMEDIUMFormats arch number back to name string.
0x4709E0sub_4709E0can_finalize_arch_check2.6KBHIGHArchitecture compatibility for finalization. Maps 104->120, 130->107, 101->110. Returns error codes 24-30.
0x470DA0sub_470DA0can_finalize_capability2.1KBHIGHFinalization capability bitmask check. Maps target codes to bitmask: 'd'(100)=1, 'g'(103)=8, 'n'(110)=2, 'y'(121)=64.

Details: Architecture Profiles, SM100 Blackwell, SM103-SM121, Compatibility


10. Debug Info / DWARF Processing

AddressDecompiledProposed NameSizeConfidenceDescription
0x47CBC0sub_47CBC0debug_line_decode_replay33KBHIGHDWARF .debug_line decoder/replayer. Initializes state machine, reads include directories and file tables.
0x478A20sub_478A20debug_line_info_encode28KBHIGHDWARF .debug_line header encoder: version, prologue_length, opcode_base, include_directories[], file_names[].
0x4783C0sub_4783C0debug_line_program_serialize13KBHIGHSerializes DWARF line number program opcodes from individual CUs into combined section.
0x480FB0sub_480FB0debug_line_merge25KBHIGHMerges line number tables across compilation units using BST and "%llu_%llu_%llu" keys.
0x482850sub_482850debug_info_complex_merge36KBMEDIUMFull debug info section merge across CUs.
0x404827sub_404827debug_line_info_builder4.3KBHIGHGenerates DWARF line info for inline functions: "%s+%llu", ".L__$locationLabel$__%d".
0x4707D0sub_4707D0debug_info_set_prefix_suffixsmallHIGHSets prefix/suffix strings for debug info section naming.

Details: DWARF Processing, Line Tables, NVIDIA Debug Extensions


11. Embedded ptxas Backend (Compiler Functions)

Note: The following sections document functions from the embedded ptxas compiler backend, which constitutes ~95% of the nvlink binary by code size. These are the same compiler passes found in the standalone ptxas binary; see the ptxas wiki for comprehensive documentation of the full 40,000-function compiler. This section covers only the most prominent functions visible from the linker's perspective.

IR Node Primitives

The fundamental API for accessing IR instruction fields. sub_530FB0 alone has 31,399 callers.

AddressDecompiledProposed NameSizeTagDescription
0x530FB0sub_530FB0IRNode_GetOperand16B--return *(a1+32) + 32 * index (operand array, 32-byte stride)
0x530FC0sub_530FC0IRNode_GetNumSrcOperands16B--total_ops + 1 - first_src_index
0x530FD0sub_530FD0IRNode_GetNumDstOperands16B--return *(a1 + 92)
0x530E80sub_530E80IRNode_GetRegClass16B--Identity extract (unsigned int)
0x530F80sub_530F80IRNode_GetDataType16B--Identity extract for data type field
0x530E90sub_530E90IROperand_IsRegister16Btag=2return type == 2
0x530EA0sub_530EA0IROperand_IsImmediate16Btag=1return type == 1
0x530EB0sub_530EB0IROperand_IsMemRef16Btag=6return type == 6
0x530EC0sub_530EC0IROperand_IsAddress16Btag=10return type == 10
0x530ED0sub_530ED0IROperand_IsPredicate16Btag=9return type == 9
0x530EE0sub_530EE0IROperand_IsCondCode16Btag=5return type == 5
0x530EF0sub_530EF0IROperand_IsConstant16Btag=4return type == 4
0x530F00sub_530F00IROperand_IsSymbol16Btag=3return type == 3
0x530F50sub_530F50IROperand_IsBarrier16Btag=7return type == 7
0x530F90sub_530F90IRNode_SetFlagA16B--*(a1 + 14) = a2
0x530FA0sub_530FA0IRNode_SetFlagB16B--*(a1 + 15) = a2

Details: IR Nodes

NVInst Class Hierarchy (Instruction Representation)

AddressDecompiledProposed NameSizeConfidenceDescription
0xA49150sub_A49150NVInst_getOperandField60BVERY HIGH30,768 callers. Calls sub_A7DE70 (hasOperand), then sub_A709F0 (getValue). Returns -1 if field absent.
0xA49120sub_A49120NVInst_setOperandField16BHIGHThunk to sub_A5B6B0 (180KB switch dispatch).
0xA491D0sub_A491D0NVInst_setOperandImm16BHIGHThunk to sub_A62220 (65KB switch dispatch).
0xA491E0sub_A491E0NVInst_getOperandFieldSlot16BHIGHThunk to sub_A65900 (67KB switch dispatch).
0xA49130sub_A49130NVInst_getDefaultOperandValue16BHIGHThunk to sub_A67910 (141KB switch dispatch).
0xA49190sub_A49190NVInst_hasOperandField16BHIGHDirect wrapper for sub_A7DE70.
0xA491A0sub_A491A0NVInst_copyOperandField48BHIGHGets from src via sub_A709F0, sets on dst via sub_A5B6B0.
0xA49220sub_A49220NVInst_lookupOpcodeDesc96BHIGHFNV-1a hash lookup in opcode descriptor table.
0xA4AB10sub_A4AB10NVInst_constructor11KBHIGHInitializes NVInst object with operand vector, hash tables, scheduling info.

Operand Dispatch Mega-Functions

Four giant switch-case functions that implement the complete operand field encoding/decoding dispatch. Each switches on opcode class ID (370+ classes).

AddressDecompiledProposed NameSizeConfidenceDescription
0xA5B6B0sub_A5B6B0setOperandField_dispatch180KBHIGHSets operand field value on instruction. Switch on opcode class (0x00-0x171).
0xA62220sub_A62220setOperandImm_dispatch65KBHIGHSets immediate operand value. Same switch structure.
0xA65900sub_A65900getOperandField_dispatch67KBHIGHGets operand field value for specific slot.
0xA67910sub_A67910getDefaultOperandValue_dispatch141KBHIGHReturns default value for an operand field.
0xA709F0sub_A709F0InstrFieldOffset_Query~180KBHIGH6,491-line switch mapping (opcode_class, field_id) to bit-offset in instruction encoding. Returns -1 if absent.
0xA7DE70sub_A7DE70InstrFieldPresent_Query~170KBHIGHSame switch structure; returns (extract != 0). Companion to sub_A709F0.

ISel Pattern Matching

SM50-SM7x ISel Hub (Maxwell/Pascal/Volta)

AddressRangeCountConfidenceDescription
0x530FE00x530FE0--0x5B1AB01,293HIGHAuto-generated pattern matchers. Signature: (ctx, node, &opcode, &priority). Check attributes via sub_A49150, operand types/counts, output (target_opcode, priority). 152 distinct opcodes, 36 priority levels.

SM75 ISel Hub (Turing)

AddressRangeCountConfidenceDescription
0xF161500xF16150--0xFBB780276HIGHSM75 pattern matchers. Same signature. Calls sub_A49150 for attributes, sub_530FD0/sub_530FB0/sub_530FC0 for operand queries.
0xFBB810--280KBHIGHSM75 ISel mega-hub dispatch. Calls all 276 matchers, selects highest priority, dispatches to corresponding emitter. Too large to decompile.

SM80 ISel Hub (Ampere)

AddressRangeCountConfidenceDescription
0xCE20000xCE2000--0xD60000259HIGHSM80 pattern matchers. 19 distinct instruction opcodes (HMMA, IMAD, FFMA, LDG, S2R, etc.).

SM100+ ISel (Blackwell)

Blackwell ISel patterns are distributed across the encoding/decoding table regions. The dispatch tables at 0xE43C20 and 0xEFE6C0 use binary search on opcode fields to route to the correct encoder/decoder.

Instruction Encoding Infrastructure

AddressDecompiledProposed NameSizeConfidenceDescription
0x4C28B0sub_4C28B0setBitfieldsmallVERY HIGHsetBitfield(buf, bit_offset, width, value). Core bitfield insertion into 128-bit instruction word at buf+544.
0x4C2A60sub_4C2A60encoding_initsmallHIGHClears operand remapping table (offsets 468-531), resets operand counter at 532.
0x4C2A90sub_4C2A90encode_predicatesmallHIGHEncodes predicate guard register from IR node.
0x4C4D60sub_4C4D60encode_register_operandsmallHIGHEncodes register operand: 1-bit is_output, 4-bit type, 10-bit register number.
0x4C52F0sub_4C52F0encode_immediate_operandsmallHIGHEncodes constant/immediate operand: 5-bit type + register number.
0x4C5C30sub_4C5C30encode_special_operandsmallHIGHEncodes predicate/condcode/memory operands with remapping.
0x4C7D10sub_4C7D10encoding_engine_main18.6KBHIGHMain encoding engine. String: "ENCODING". Converts IR to binary.
0x4CB100sub_4CB100decoding_engine_entry3.4KBHIGHEntry point for instruction decoding. String: "DECODING".

Per-Architecture Encoding Tables

Address RangeTargetEncoder CountDescription
0x603F60--0x61FA60SM507964-bit instruction words (Maxwell). Format types 1/2/3.
0x620000--0x84DD70SM100+1,537128-bit Blackwell SASS. Major opcodes 1/2/8.
0xA87CE0--0xB25D50SM90164128-bit Hopper encoding.
0xB9FDE0--0xC9EE60SM7x-SM89~270Multi-arch encoders: SM70/75/80/86/89.
0xDA0310--0xE436D0SM100+438Blackwell encoders (second set).

Per-Architecture Decoding Tables

Address RangeTargetDecoder CountDescription
0x84DD70--0xA48290SM100+1,613Instruction descriptor init functions.
0xACECF0--0xB77B60SM90139Hopper decoders.
0xE43DC0--0xF15A50SM100+648Blackwell decoders.

Bitvector Operations (SSE-Optimized)

Used by register allocation and liveness analysis throughout the backend.

AddressDecompiledProposed NameSizeConfidence
0x5E4470sub_5E4470BitVector_AND3.2KBHIGH
0x5E4670sub_5E4670BitVector_OR2.9KBHIGH
0x5E4810sub_5E4810BitVector_ANDNOT4.4KBHIGH
0x5E4AE0sub_5E4AE0BitVector_XOR2.6KBMEDIUM
0x5E51C0sub_5E51C0BitVector_OR_Changed2.9KBMEDIUM
0x5E55E0sub_5E55E0BitVector_PopCount5.4KBMEDIUM
0x5E5940sub_5E5940BitVector_FindFirst3.0KBMEDIUM

Peephole Optimization

AddressDecompiledProposed NameSizeConfidenceDescription
0x406DC0sub_406DC0peephole_optimizer_main6.8KBMEDIUMMain driver -- orchestrates multiple optimization passes on instruction buffer.
0x407634sub_407634peephole_instruction_combine5.3KBMEDIUMCombines dependent instruction pairs. 372-byte records, limit 20479.
0x406377sub_406377peephole_pattern_match7.4KBMEDIUMMatches and transforms instruction patterns.
0x408594sub_408594peephole_scheduler6.5KBLOWInstruction scheduling within basic blocks.
0x407F94sub_407F94peephole_constant_fold3.7KBLOWConstant propagation in instructions.
0x407C0Asub_407C0Apeephole_strength_reduce3.2KBLOWStrength reduction (replace expensive ops with cheaper ones).
0x4083A5sub_4083A5peephole_dead_instruction_elim2.9KBLOWRemoves dead instructions using liveness.

Details: Peephole

PTX Assembler Frontend (Embedded ptxas)

Large PTX processing subsystem in the 0x1430000--0x15C0000 range.

AddressDecompiledProposed NameSizeConfidenceDescription
0x15B86A0sub_15B86A0cuda_builtin_prototype_gen345KBHIGHGiant switch (~608 cases) generating PTX prototype strings for CUDA builtins: div, rem, rcp, sqrt, wmma, shfl, vote, tcgen05, bulk_copy, etc.
0x147EF50sub_147EF50ptx_instr_semantic_analyzer288KBHIGHMaster instruction validator. SM version gates, texture modes, cache policies, state spaces, vector types, scoping.
0x1487650sub_1487650ptx_statement_processor240KBMEDIUMTop-level PTX statement handler. Processes .maxnctapersm, .reqntid, kernel parameter limits (4352 bytes), function prototypes.
0x146BEC0sub_146BEC0ptx_load_store_validator206KBHIGHMemory operation validator. Validates ld/st, atomics, reductions, fence, membar, cp.async, cache eviction, scope.

Details: PTX Parsing, Embedded ptxas Overview


Statistics

MetricValue
Binary file size~37 MB
.text section size25.2 MB
Total functions (IDA)40,532
Linker core functions (0x400000--0x4C0000)~1,900
Embedded ptxas backend functions~38,000
Functions documented on this page~160 key functions
ISel pattern matchers (all arches)~2,100+
Instruction encoders (all arches)~2,500+
Instruction descriptor inits~1,600+
Instruction decoders (all arches)~800+
Subsystems identified20
Largest functionsub_15B86A0 cuda_builtin_prototype_gen (345KB)
Most-called functionsub_530FB0 IRNode_GetOperand (31,399 callers)
Binary composition~5% linker, ~95% embedded compiler backend

Address Map (sorted)

Quick reference sorted by address for binary navigation. All addresses verified against decompiled files in nvlink/decompiled/.

0x404827  sub_404827  debug_line_info_builder
0x406377  sub_406377  peephole_pattern_match
0x406DC0  sub_406DC0  peephole_optimizer_main
0x407634  sub_407634  peephole_instruction_combine
0x407C0A  sub_407C0A  peephole_strength_reduce
0x407F94  sub_407F94  peephole_constant_fold
0x4083A5  sub_4083A5  peephole_dead_instruction_elim
0x408594  sub_408594  peephole_scheduler
0x409800  main        main
0x40C4F0  ctor_001    ctor_thread_infra
0x40C5C0  ctor_002    ctor_002
0x410830  ctor_003    ctor_003
0x410850  ctor_004    ctor_004
0x412750  ctor_005    ctor_knob_table
0x412790  ctor_006    ctor_006
0x426260  ctor_008    ctor_version_constants
0x426280  ctor_009    ctor_009
0x4262B0  ctor_010    ctor_010
0x426330  ctor_011    ctor_011
0x4264B0  sub_4264B0  split_compile_dispatch
0x426570  sub_426570  validate_arch_and_merge
0x426AE0  sub_426AE0  lto_mark_used_symbols
0x426CD0  sub_426CD0  lto_collect_ir_modules
0x4275C0  sub_4275C0  post_link_transform (FNLZR)
0x4279C0  sub_4279C0  trace_phase
0x427AE0  sub_427AE0  nvlink_parse_options
0x42A680  sub_42A680  register_module
0x42AF40  sub_42AF40  extract_fatbin_member
0x42DBC0  sub_42DBC0  option_validate_value
0x42DFE0  sub_42DFE0  option_parser_create
0x42E390  sub_42E390  option_get_value
0x42E5A0  sub_42E5A0  option_parse_argv
0x42F130  sub_42F130  option_register
0x42FCB0  sub_42FCB0  create_temp_file
0x4307C0  sub_4307C0  arena_alloc
0x431000  sub_431000  arena_free
0x431770  sub_431770  arena_dump_stats
0x431C70  sub_431C70  arena_destroy
0x432020  sub_432020  arena_create_named
0x432B10  sub_432B10  merge_overlapping_global
0x4339A0  sub_4339A0  optimize_constant_dedup
0x4343C0  sub_4343C0  merge_overlapping_constant
0x436BD0  sub_436BD0  shared_memory_optimizer
0x437E20  sub_437E20  merge_overlapping_local
0x438640  sub_438640  merge_constant_bank_data
0x438DD0  sub_438DD0  process_bindless_references
0x439830  sub_439830  shared_memory_layout
0x43FC80  start_routine  thread_worker_entry
0x43FDB0  sub_43FDB0  thread_pool_create
0x440740  sub_440740  elfw_add_section
0x440BE0  sub_440BE0  elfw_add_section_with_data
0x4411F0  sub_4411F0  elfw_copy_section
0x441AC0  sub_441AC0  elfw_add_reloc_section
0x4438F0  sub_4438F0  elfw_create
0x442820  sub_442820  elfw_merge_symbols
0x442CA0  sub_442CA0  elfw_add_symbol
0x445000  sub_445000  finalize_elf
0x4475B0  sub_4475B0  elfw_destroy
0x4478F0  sub_4478F0  elfw_dump_structure
0x448360  sub_448360  elfw_get_section_header
0x4489C0  sub_4489C0  hash_table_create
0x448E70  sub_448E70  elfw_section_table_build
0x44A5D0  sub_44A5D0  callgraph_detect_recursion
0x44AD40  sub_44AD40  dead_code_elimination
0x44C030  sub_44C030  callgraph_traverse
0x44CCF0  sub_44CCF0  callgraph_dump_dot
0x44D200  sub_44D200  build_callgraph_section
0x44EF80  func        atexit_cleanup
0x44F260  destr_function  tls_destructor
0x44F410  sub_44F410  arena_get_metadata
0x450ED0  sub_450ED0  propagate_register_counts
0x451D80  sub_451D80  compute_entry_properties
0x459640  sub_459640  reloc_vtable_create
0x45BAA0  sub_45BAA0  write_elf_section
0x45BF00  sub_45BF00  write_elf_to_buffer
0x45C920  sub_45C920  write_elf_to_file
0x45C950  sub_45C950  write_elf_to_memory
0x45CAC0  sub_45CAC0  oom_handler
0x45CAE0  sub_45CAE0  arena_assert
0x45CCD0  sub_45CCD0  timing_start
0x45D180  sub_45D180  merge_weak_function
0x45E7D0  sub_45E7D0  merge_elf
0x462620  sub_462620  path_split
0x4637B0  sub_4637B0  uft_reorder_entries
0x463F70  sub_463F70  uft_setup_sections
0x464460  sub_464460  linked_list_append
0x467460  sub_467460  error_emit
0x467A70  sub_467A70  diagnostic_report
0x469D60  sub_469D60  apply_relocations
0x46ADC0  sub_46ADC0  emit_resolved_relocations
0x46C690  sub_46C690  LZ4_decompress_safe
0x46EE00  sub_46EE00  LZ4_decompress_safe_extDict
0x46FD50  sub_46FD50  LZ4_compress
0x4707D0  sub_4707D0  debug_info_set_prefix_suffix
0x4709E0  sub_4709E0  can_finalize_arch_check
0x470DA0  sub_470DA0  can_finalize_capability
0x471700  sub_471700  nvlink_finalize_object
0x4748F0  sub_4748F0  nvlink_link_and_finalize
0x4783C0  sub_4783C0  debug_line_program_serialize
0x478A20  sub_478A20  debug_line_info_encode
0x47CBC0  sub_47CBC0  debug_line_decode_replay
0x480FB0  sub_480FB0  debug_line_merge
0x482850  sub_482850  debug_info_complex_merge
0x484F50  sub_484F50  arch_profile_database_init
0x486FF0  sub_486FF0  architecture_parse_name
0x487220  sub_487220  architecture_name_format
0x4876A0  sub_4876A0  archive_signature_check
0x487C20  sub_487C20  archive_open
0x487E10  sub_487E10  archive_iterate_members
0x4882A0  sub_4882A0  ocg_memspace_alloc
0x489140  sub_489140  memspace_statistics_print
0x491410  sub_491410  compilation_unit_initialize
0x498FE0  sub_498FE0  knob_inject_string
0x49A0C0  sub_49A0C0  knob_decode_and_apply
0x49B1A0  sub_49B1A0  knobs_file_read_parse
0x49D8A0  sub_49D8A0  parse_knob_value
0x4A4DC0  sub_4A4DC0  merc_war_process
0x4A8690  sub_4A8690  merc_opex_expand
0x4AF3C0  sub_4AF3C0  hrk_section_process
0x4B02A0  sub_4B02A0  hrc_hrd_section_process
0x4B9E70  sub_4B9E70  allocation_failure_handler
0x4BC290  sub_4BC290  elflink_error_handler
0x4BC4A0  sub_4BC4A0  nvvm_api_wrapper_init
0x4BC6F0  sub_4BC6F0  nvvm_compile_and_extract
0x4C28B0  sub_4C28B0  setBitfield
0x4C2A60  sub_4C2A60  encoding_init
0x4C2A90  sub_4C2A90  encode_predicate
0x4C4D60  sub_4C4D60  encode_register_operand
0x4C52F0  sub_4C52F0  encode_immediate_operand
0x4C5C30  sub_4C5C30  encode_special_operand
0x4C7D10  sub_4C7D10  encoding_engine_main
0x4CB100  sub_4CB100  decoding_engine_entry
0x4CE8C0  sub_4CE8C0  ptx_version_parse_validate
0x4CFBD0  sub_4CFBD0  ptx_obfuscation_transform
0x50C790  sub_50C790  getReuse
0x530E80  sub_530E80  IRNode_GetRegClass
0x530E90  sub_530E90  IROperand_IsRegister
0x530EA0  sub_530EA0  IROperand_IsImmediate
0x530EB0  sub_530EB0  IROperand_IsMemRef
0x530EC0  sub_530EC0  IROperand_IsAddress
0x530ED0  sub_530ED0  IROperand_IsPredicate
0x530EE0  sub_530EE0  IROperand_IsCondCode
0x530EF0  sub_530EF0  IROperand_IsConstant
0x530F00  sub_530F00  IROperand_IsSymbol
0x530F50  sub_530F50  IROperand_IsBarrier
0x530F80  sub_530F80  IRNode_GetDataType
0x530F90  sub_530F90  IRNode_SetFlagA
0x530FA0  sub_530FA0  IRNode_SetFlagB
0x530FB0  sub_530FB0  IRNode_GetOperand
0x530FC0  sub_530FC0  IRNode_GetNumSrcOperands
0x530FD0  sub_530FD0  IRNode_GetNumDstOperands
0x5E4470  sub_5E4470  BitVector_AND
0x5E4670  sub_5E4670  BitVector_OR
0x5E4810  sub_5E4810  BitVector_ANDNOT
0x5E4AE0  sub_5E4AE0  BitVector_XOR
0x5E51C0  sub_5E51C0  BitVector_OR_Changed
0x5E55E0  sub_5E55E0  BitVector_PopCount
0x5E5940  sub_5E5940  BitVector_FindFirst
0x5E7B90  sub_5E7B90  MercExpand_BuildNodeMaps
0x5E8710  sub_5E8710  MercExpand_BuildFullCFGMaps
0x5EA250  sub_5EA250  CFG_DumpDOTGraph
0x5EA4F0  sub_5EA4F0  MercExpand_InvalidateRegState
0x5F38E0  sub_5F38E0  MercExpand_HandleInstruction
0x5F60E0  sub_5F60E0  IRTree_Walk
0x5F8B60  sub_5F8B60  MercExpand_ApplyResConstraints
0x5FC6B0  sub_5FC6B0  MercExpand_ExpandMOV
0x5FCE20  sub_5FCE20  MercExpand_ExpandRETURN
0x5FDDB0  sub_5FDDB0  MercExpand_Dispatch
0xA49120  sub_A49120  NVInst_setOperandField
0xA49130  sub_A49130  NVInst_getDefaultOperandValue
0xA49150  sub_A49150  NVInst_getOperandField
0xA49190  sub_A49190  NVInst_hasOperandField
0xA491A0  sub_A491A0  NVInst_copyOperandField
0xA491D0  sub_A491D0  NVInst_setOperandImm
0xA491E0  sub_A491E0  NVInst_getOperandFieldSlot
0xA49220  sub_A49220  NVInst_lookupOpcodeDesc
0xA4AB10  sub_A4AB10  NVInst_constructor
0xA50D10  sub_A50D10  encode_GPR
0xA5B6B0  sub_A5B6B0  setOperandField_dispatch
0xA62220  sub_A62220  setOperandImm_dispatch
0xA65900  sub_A65900  getOperandField_dispatch
0xA67910  sub_A67910  getDefaultOperandValue_dispatch
0xA709F0  sub_A709F0  InstrFieldOffset_Query
0xA7DE70  sub_A7DE70  InstrFieldPresent_Query
0x146BEC0 sub_146BEC0 ptx_load_store_validator
0x147EF50 sub_147EF50 ptx_instr_semantic_analyzer
0x1487650 sub_1487650 ptx_statement_processor
0x15B86A0 sub_15B86A0 cuda_builtin_prototype_gen