nvcc-to-cicc Interface Contract

When nvcc compiles device code, it invokes cicc as an external process, passing the preprocessed CUDA source (or LLVM bitcode) along with a carefully translated set of flags. cicc never sees the raw -fmad=1 or -prec_sqrt=0 flags that the user typed on the nvcc command line -- those are rewritten through a flag translation table, a global std::map (red-black tree) populated by sub_8FE280. This page documents the complete interface contract: how nvcc invokes cicc, how flags are translated, how the mode cookie selects CUDA vs. OpenCL behavior, what input formats are accepted, and what output modes are available.

The flag translation is split into two stages. Stage 1 (sub_8FE280) translates nvcc-facing flags into cicc-facing flags, producing a dual-slot result with an EDG front-end flag and an internal cicc flag. Stage 2 (sub_95EB40) further expands each cicc-facing flag into a three-column architecture mapping, routing each flag to the EDG frontend, the NVVM optimizer, and the LLC backend. Composed, the two stages mean that a single nvcc flag like -fmad=1 first becomes -fma=1 and then fans out to nothing for EDG, nothing for OPT, and -nvptx-fma-level=1 for LLC, while --emit-llvm-bc is injected into the EDG arguments regardless of what the user passed.

| Component | Detail |
|---|---|
| Flag translation tree | sub_8FE280 -- global std::map at qword_4F6D2A0, 40+ entries |
| Tree guard | qword_4F6D2C8 (set to 1 after first initialization) |
| Tree node size | 72+ bytes: key at +32, length at +40, FlagPair* at +64 |
| CLI parser (Path A) | sub_900130 (39 KB, 12 parameters) |
| Flag catalog (Path A/B) | sub_9624D0 (75 KB, 2,626 lines, 4 output vectors) |
| 3-column arch table | sub_95EB40 (38 KB, 23 architectures, 3-column fan-out) |
| Mode cookies | 0xABBA = CUDA, 0xDEED = OpenCL |
| Default architecture | compute_75 / sm_75 (Turing) |
| Input extensions | .bc, .ci, .i, .cup, .optixir, .ii |
| Default opt level | -opt=3 (O3) |

Invocation Contract

nvcc invokes cicc as a subprocess with a single input file and a set of translated flags. The general invocation form is:

cicc [mode-flags] [translated-flags] [pass-through-flags] -o <output> <input>

For the standard CUDA compilation path (no explicit -lXXX mode flag), cicc enters sub_8F9C90 (real main, 10,066 bytes at 0x8F9C90), parses all arguments into ~12 local variables, resolves the Path A / Path B dispatch variable v253, and calls one of:

  • Path A (EDG pipeline): sub_902D10 -- invokes sub_900130 for CLI parsing, then the EDG frontend via sub_905880, then the LibNVVM pipeline via sub_905EE0.
  • Path B (standalone LLVM pipeline): sub_1262860 -- similar flow but through standalone LLVM infrastructure at 0x1262860.

Path selection is controlled by v253, which defaults to 2 (unresolved) and is resolved through the obfuscated environment variable NV_NVVM_VERSION. For SM >= 100 (Blackwell and later), the default is Path B unless the -nvc flag is present. For SM < 100, the default is Path A. See Entry Point for the full dispatch matrix.

When cicc is invoked in multi-stage mode (-lnk, -opt, -llc, -libnvvm), the entry point dispatches to sub_905EE0 (Path A, 43 KB) or sub_1265970 (Path B, 48 KB), which orchestrate the LNK, OPT, and LLC sub-pipelines internally.

Parameter Passing to sub_900130

The Path A CLI parser sub_900130 receives 12 parameters and performs a two-pass argument scan:

unsigned int sub_900130(
    const char *input_file,    // a1: input filename
    const char *opencl_src,    // a2: OpenCL source path (NULL for CUDA)
    const char *output_file,   // a3: output filename
    __int64    *arg_vector,    // a4: pointer to std::vector<std::string>
    char        mode_flag,     // a5: mode flag (0=normal, 1=special)
    __int64     job_desc,      // a6: output compilation job struct
    __int64     error_out,     // a7: error string output
    _BYTE      *m64_flag,     // a8: output - set to 1 if -m64 seen
    _BYTE      *discard_names, // a9: output - set to 1 if -discard-value-names
    __int64     trace_path,    // a10: device time trace path
    __int64     trace_pid,     // a11: trace PID
    __int64     trace_env      // a12: trace env value
);
// Returns: 0 = success, 1 = error

Pass 1: Scans for -arch flag via sub_8FD0D0, extracts architecture string.

Pass 2: Iterates all arguments, looking each up in the red-black tree at qword_4F6D2A0. For tree hits, the EDG slot is pushed to the EDG argument vector (v145) and the cicc slot is pushed to the backend argument vector (v148). For tree misses, sequential string comparisons handle extended flags (-maxreg=N, -split-compile=N, --Xlgenfe, --Xlibnvvm, etc.).

Before any user flags, sub_900130 unconditionally injects:

  • --emit-llvm-bc into the EDG argument vector
  • --emit-nvvm-latest into the backend argument vector

After all arguments are processed, architecture strings are appended:

  • --nv_arch + sm_XX to EDG arguments
  • -arch=compute_XX to backend arguments
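The injection, lookup, and architecture-append steps above can be sketched in a few lines of C++. This is a minimal model, not the decompiled code: the names FlagPair, TranslatedArgs, flag_table, and assemble are hypothetical, the table holds only four entries copied from the mapping on this page, an empty string stands in for <null>, and pushing --nv_arch and sm_XX as two separate elements is an assumption (the page only says "--nv_arch + sm_XX").

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the two-slot FlagPair; "" models <null>.
struct FlagPair {
    std::string edg;   // slot 0: EDG frontend passthrough
    std::string cicc;  // slot 1: internal cicc flag
};

// Models the two argument vectors built by the Path A parser.
struct TranslatedArgs {
    std::vector<std::string> edg;      // corresponds to v145
    std::vector<std::string> backend;  // corresponds to v148
};

// Four-entry excerpt of the translation table documented on this page.
inline const std::map<std::string, FlagPair>& flag_table() {
    static const std::map<std::string, FlagPair> table = {
        {"-fmad=1", {"", "-fma=1"}},
        {"-prec_sqrt=0", {"", "-prec-sqrt=0"}},
        {"-O3", {"--device-O=3", "-opt=3"}},
        {"-g", {"--device-debug", "-g"}},
    };
    return table;
}

// sm is the numeric architecture found by the first pass (e.g. 75).
inline TranslatedArgs assemble(const std::vector<std::string>& args, int sm) {
    TranslatedArgs out;
    // Unconditional injections come before any user flag.
    out.edg.push_back("--emit-llvm-bc");
    out.backend.push_back("--emit-nvvm-latest");

    for (const std::string& arg : args) {
        auto it = flag_table().find(arg);
        if (it == flag_table().end()) continue;  // extended flags are not modelled here
        if (!it->second.edg.empty()) out.edg.push_back(it->second.edg);
        if (!it->second.cicc.empty()) out.backend.push_back(it->second.cicc);
    }

    // Architecture strings are appended last (two-element form is an assumption).
    out.edg.push_back("--nv_arch");
    out.edg.push_back("sm_" + std::to_string(sm));
    out.backend.push_back("-arch=compute_" + std::to_string(sm));
    return out;
}
```

Under these assumptions, assemble({"-fmad=1", "-O3"}, 75) yields the EDG vector --emit-llvm-bc, --device-O=3, --nv_arch, sm_75 and the backend vector --emit-nvvm-latest, -fma=1, -opt=3, -arch=compute_75.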

Mode Cookies

The sub_9624D0 flag catalog function takes a fourth parameter a4 that selects the language mode. This is not a user-visible flag -- it is passed internally by the pipeline orchestrator.

| Cookie | Hex | Decimal | Language |
|---|---|---|---|
| 0xABBA | 0xABBA | 43,962 | CUDA compilation |
| 0xDEED | 0xDEED | 57,069 | OpenCL compilation |

The cookie affects multiple behaviors:

Precision division routing. In CUDA mode (0xABBA), -prec-div=0 maps to -nvptx-prec-divf32=1 (not 0) at LLC, while -prec-div=1 maps to -nvptx-prec-divf32=2. In OpenCL mode (0xDEED), the mapping is straightforward: -prec-div=0 maps to -nvptx-prec-divf32=0, -prec-div=1 to -nvptx-prec-divf32=1, and OpenCL additionally supports -prec-div=2 mapping to -nvptx-prec-divf32=3.

Fast-math routing. In CUDA mode, -fast-math maps to -R __CUDA_USE_FAST_MATH=1 for EDG and -opt-use-fast-math for OPT, with no LLC flag. In OpenCL mode, -fast-math maps to -R FAST_RELAXED_MATH=1 -R __CUDA_FTZ=1 for EDG and -opt-use-fast-math -nvptx-f32ftz for OPT.

Default precision. -prec-sqrt defaults to 1 (precise) in CUDA mode, 0 (imprecise) in OpenCL mode.
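The cookie-dependent precision rules above reduce to a small lookup. The sketch below (C++17, for std::optional) restates only the pairs listed on this page; the function names are hypothetical, and how the real code treats values or cookies not listed here is unknown, so those cases return an empty result or fall into the non-CUDA branch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

constexpr std::uint32_t kCookieCuda = 0xABBA;    // 43,962
constexpr std::uint32_t kCookieOpenCL = 0xDEED;  // 57,069

// Returns N for "-nvptx-prec-divf32=N", or nullopt when the page
// does not describe the (cookie, -prec-div) combination.
inline std::optional<int> prec_div_to_llc(std::uint32_t cookie, int prec_div) {
    if (cookie == kCookieCuda) {
        if (prec_div == 0) return 1;  // note: 1, not 0
        if (prec_div == 1) return 2;
        return std::nullopt;
    }
    if (cookie == kCookieOpenCL) {
        if (prec_div == 0) return 0;
        if (prec_div == 1) return 1;
        if (prec_div == 2) return 3;  // OpenCL-only level
        return std::nullopt;
    }
    return std::nullopt;
}

// Default for -prec-sqrt when the flag is absent: 1 in CUDA mode, 0 in OpenCL mode.
// Any cookie other than CUDA is treated like OpenCL here, which is an assumption.
inline int default_prec_sqrt(std::uint32_t cookie) {
    return cookie == kCookieCuda ? 1 : 0;
}
```

Keeping the mapping in one function makes the CUDA off-by-one visible: the same user-facing -prec-div=0 produces different LLC values depending only on the cookie.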

Discard value names. In CUDA mode (0xABBA), without explicit override, value names are discarded by default (a1+232 = 1), generating -lnk-discard-value-names=1, -opt-discard-value-names=1, and -lto-discard-value-names=1. In OpenCL mode (0xDEED), this only applies when (a13 & 0x20) is set (LTO generation active).

OptiX IR emission. The --emit-optix-ir flag is only valid when the cookie is 0xABBA or 0xDEED.

Internal compile call. The LibNVVM compile function nvvmCUCompile (dispatch ID 0xBEAD) is called with phase code 57,069 (0xDEED) regardless of the outer cookie -- this is the internal LibNVVM compile phase code, not a language selector.

Flag Translation Table

sub_8FE280 populates a global std::map<std::string, FlagPair*> in the red-black tree at qword_4F6D2A0. Each FlagPair is a 16-byte struct with two slots: slot 0 for the EDG frontend passthrough, slot 1 for the internal cicc flag. The function is called exactly once, guarded by qword_4F6D2C8.

Red-Black Tree Structure

qword_4F6D2A0  -- tree root pointer (std::_Rb_tree)
dword_4F6D2A8  -- sentinel node (tree.end())
qword_4F6D2B0  -- root node pointer
qword_4F6D2B8  -- begin iterator (leftmost node)
qword_4F6D2C8  -- initialization guard (1 = already built)

Each node is 72+ bytes:

| Offset | Field |
|---|---|
| +0 | Color (0=red, 1=black) |
| +8 | Parent pointer |
| +16 | Left child pointer |
| +24 | Right child pointer |
| +32 | Key data pointer (std::string internals) |
| +40 | Key length |
| +48 | Key capacity |
| +64 | Value pointer (FlagPair*) |

Lookup is via sub_8FE150 (lower_bound + insert-if-not-found). Insert is via sub_8FDFD0 (allocate node + rebalance). Comparison uses standard std::string::compare.
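The node offsets above can be checked against a plain struct. The sketch below is a layout model only: FlagTreeNode and FlagPairModel are hypothetical names, a 64-bit target is assumed, the 16 bytes at +48 are modelled as the capacity/small-string union used by libstdc++ (the page lists only "Key capacity" there), and representing the 16-byte FlagPair as two 8-byte pointers is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of the 16-byte FlagPair (two pointer-sized slots assumed).
struct FlagPairModel {
    const char* edg;   // slot 0: EDG frontend passthrough
    const char* cicc;  // slot 1: internal cicc flag
};

// Hypothetical model of one tree node, following the offset table.
struct FlagTreeNode {
    std::uint32_t color;       // +0  (0 = red, 1 = black), padded to 8 bytes
    FlagTreeNode* parent;      // +8
    FlagTreeNode* left;        // +16
    FlagTreeNode* right;       // +24
    char* key_data;            // +32 std::string data pointer
    std::size_t key_length;    // +40 std::string length
    union {                    // +48 capacity, or inline buffer for short keys
        std::size_t key_capacity;
        char key_local[16];
    };
    FlagPairModel* value;      // +64 mapped value
};

static_assert(sizeof(void*) == 8, "layout model assumes a 64-bit target");
static_assert(offsetof(FlagTreeNode, parent) == 8, "parent at +8");
static_assert(offsetof(FlagTreeNode, left) == 16, "left at +16");
static_assert(offsetof(FlagTreeNode, right) == 24, "right at +24");
static_assert(offsetof(FlagTreeNode, key_data) == 32, "key data at +32");
static_assert(offsetof(FlagTreeNode, key_length) == 40, "key length at +40");
static_assert(offsetof(FlagTreeNode, value) == 64, "value at +64");
static_assert(sizeof(FlagTreeNode) == 72, "node is 72 bytes");
static_assert(sizeof(FlagPairModel) == 16, "FlagPair is 16 bytes");
```

If the static assertions compile, the struct reproduces the documented offsets, which makes it convenient as a type definition when annotating the tree in a disassembler.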

Complete nvcc-to-cicc Mapping

The table below shows every entry in the sub_8FE280 red-black tree. Slot 0 is forwarded to the EDG frontend; slot 1 is forwarded to the cicc backend pipeline. <null> means no flag is generated for that slot.

| nvcc flag | EDG passthrough (slot 0) | cicc internal (slot 1) | Notes |
|---|---|---|---|
| -m32 | --m32 | <null> | |
| -m64 | --m64 | <null> | Also sets *a8 = 1 |
| -fast-math | <null> | -fast-math | |
| -ftz=1 | <null> | -ftz=1 | |
| -ftz=0 | <null> | -ftz=0 | |
| -prec_sqrt=1 | <null> | -prec-sqrt=1 | Underscore to hyphen |
| -prec_sqrt=0 | <null> | -prec-sqrt=0 | Underscore to hyphen |
| -prec_div=1 | <null> | -prec-div=1 | Underscore to hyphen |
| -prec_div=0 | <null> | -prec-div=0 | Underscore to hyphen |
| -fmad=1 | <null> | -fma=1 | fmad renamed to fma |
| -fmad=0 | <null> | -fma=0 | fmad renamed to fma |
| -O0 | --device-O=0 | -opt=0 | Dual-mapped |
| -O1 | --device-O=1 | -opt=1 | Dual-mapped |
| -O2 | --device-O=2 | -opt=2 | Dual-mapped |
| -O3 | --device-O=3 | -opt=3 | Dual-mapped |
| -Osize | <null> | -Osize | |
| -Om | <null> | -Om | |
| -Ofast-compile=max | <null> | -Ofast-compile=max | |
| -Ofc=max | <null> | -Ofast-compile=max | Alias |
| -Ofast-compile=mid | <null> | -Ofast-compile=mid | |
| -Ofc=mid | <null> | -Ofast-compile=mid | Alias |
| -Ofast-compile=min | <null> | -Ofast-compile=min | |
| -Ofc=min | <null> | -Ofast-compile=min | Alias |
| -Ofast-compile=0 | <null> | <null> | No-op |
| -Ofc=0 | <null> | <null> | No-op alias |
| -g | --device-debug | -g | Dual-mapped |
| -show-src | <null> | -show-src | |
| -disable-allopts | <null> | -disable-allopts | |
| -disable-llc-opts | <null> | disable-llc-opts | |
| -w | -w | -w | Dual-mapped |
| -Wno-memory-space | <null> | -Wno-memory-space | |
| -disable-inlining | <null> | -disable-inlining | |
| -aggressive-inline | <null> | -aggressive-inline | |
| --kernel-params-are-restrict | --kernel-params-are-restrict | -restrict | Dual-mapped, renamed |
| -allow-restrict-in-struct | <null> | -allow-restrict-in-struct | |
| --device-c | --device-c | --device-c | Dual-mapped |
| --generate-line-info | --generate-line-info | -generate-line-info | Dual-mapped |
| --enable-opt-byval | --enable-opt-byval | -enable-opt-byval | Dual-mapped |
| --no-lineinfo-inlined-at | <null> | -no-lineinfo-inlined-at | |
| --keep-device-functions | --keep-device-functions | <null> | EDG only |
| --emit-optix-ir | --emit-lifetime-intrinsics | --emit-optix-ir | Triggers lifetime intrinsics in EDG |
| -opt-fdiv=0 | <null> | -opt-fdiv=0 | |
| -opt-fdiv=1 | <null> | -opt-fdiv=1 | |
| -new-nvvm-remat | <null> | -new-nvvm-remat | |
| -disable-new-nvvm-remat | <null> | -disable-new-nvvm-remat | |
| -disable-nvvm-remat | <null> | -disable-nvvm-remat | |
| -discard-value-names | --discard_value_names=1 | -discard-value-names=1 | Also sets *a9 = 1 |
| -gen-opt-lto | <null> | -gen-opt-lto | |

Key translation patterns:

  • Underscore to hyphen: nvcc uses underscores (-prec_sqrt), cicc uses hyphens (-prec-sqrt).
  • Rename: -fmad becomes -fma internally.
  • Dual-mapping: -O0 through -O3 emit both an EDG flag (--device-O=N) and a cicc flag (-opt=N).
  • Alias expansion: -Ofc=X is silently rewritten to -Ofast-compile=X.
  • Implicit dependency: --emit-optix-ir adds --emit-lifetime-intrinsics to the EDG frontend, enabling lifetime intrinsic generation that the OptiX IR output path requires.

Extended Flags (Not in Tree)

The following flags are handled by sequential string comparison in sub_900130 when a tree lookup misses:

| nvcc flag | Expansion | Notes |
|---|---|---|
| -maxreg=N | -maxreg=<N> to backend | |
| -split-compile=N | -split-compile=<N> to OPT | Error if specified twice |
| -split-compile-extended=N | -split-compile-extended=<N> to OPT | Mutually exclusive with -split-compile |
| --Xlgenfe <arg> | <arg> to EDG | |
| --Xlibnvvm <arg> | <arg> to backend | |
| --Xlnk <arg> / -Xlnk <arg> | -Xlnk + <arg> to backend | |
| --Xopt <arg> / -Xopt <arg> | -Xopt + <arg> to backend | |
| --Xllc <arg> / -Xllc <arg> | -Xllc + <arg> to backend | |
| -Xlto <arg> | <arg> to LTO vector | |
| -covinfo <file> | -Xopt -coverage=true -Xopt -covinfofile=<file> | |
| -profinfo <file> | -Xopt -profgen=true -Xopt -profinfofile=<file> | |
| -profile-instr-use <file> | -Xopt -profuse=true -Xopt -proffile=<file> | |
| -lto | -gen-lto to backend; enables LTO | |
| -olto <file> | -gen-lto-and-llc + flag + next arg | |
| --promote_warnings | -Werror to backend; flag to EDG | |
| -inline-info | -Xopt -pass-remarks=inline + missed + analysis | |
| -jump-table-density=N | -jump-table-density=<N> to backend | |
| -opt-passes=<val> | -opt-passes=<val> to backend | |
| --orig_src_file_name <val> | --orig_src_file_name + <val> to EDG | |
| --force-llp64 | Pass to EDG; sets byte_4F6D2DC = 1 | |
| --partial-link | Complex: may add -memdep-cache-byval-loads=false to OPT and LLC | Sets byte_4F6D2D0 = 1 |
| --tile-only | Pass to EDG + --tile_bc_file_name + output path | |
| --device-time-trace | Pass to EDG; next arg becomes trace path | |
| -jobserver | -jobserver to backend or pass to EDG | |

Input Extensions

Input files are identified by extension during the argument loop in sub_8F9C90. The last matching file wins (the input variable s is overwritten each time). Extension matching proceeds by checking trailing characters: last 3 for .bc/.ci, last 2 for .i, last 3 for .ii, last 4 for .cup, last 8 for .optixir.

| Extension | Format | Condition | Address |
|---|---|---|---|
| .bc | LLVM bitcode | Always accepted | 0x8FAA0A |
| .ci | CUDA intermediate (preprocessed) | Always accepted | 0x8FAA29 |
| .i | Preprocessed C/C++ | Always accepted | 0x8FA9xx |
| .ii | Preprocessed C++ | Always accepted | 0x8FBF7E |
| .cup | CUDA source | Only after --orig_src_path_name or --orig_src_file_name | 0x8FBFC4 |
| .optixir | OptiX IR | Always accepted | 0x8FC001 |
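The suffix checks and the last-match-wins rule can be modelled as a short loop. This is a sketch under assumptions: has_suffix and pick_input are hypothetical helpers, the comparison is written with std::string rather than the raw trailing-character checks in the binary, and option operands (for example the file name after -o) are not filtered out the way the real argument loop presumably does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True when s ends with suffix.
inline bool has_suffix(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Returns the selected input file, or an empty string when none matches.
inline std::string pick_input(const std::vector<std::string>& argv) {
    static const char* const always_accepted[] = {".bc", ".ci", ".i", ".ii", ".optixir"};
    std::string input;
    for (std::size_t i = 0; i < argv.size(); ++i) {
        const std::string& arg = argv[i];
        bool accept = false;
        for (const char* ext : always_accepted) {
            accept = accept || has_suffix(arg, ext);
        }
        // .cup is gated on the preceding metadata flag.
        if (!accept && has_suffix(arg, ".cup") && i > 0) {
            accept = argv[i - 1] == "--orig_src_path_name" ||
                     argv[i - 1] == "--orig_src_file_name";
        }
        if (accept) input = arg;  // a later match overwrites an earlier one
    }
    return input;
}
```

With this model, a command line that names both kernel.bc and kernel.ii selects kernel.ii, and a bare kernel.cup with no preceding metadata flag selects nothing.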

Unrecognized arguments (those failing both tree lookup and sequential matching, and lacking a recognized extension) are silently appended to the v266 pass-through vector, which is forwarded to sub-pipelines.

If no input file is found after parsing all arguments:

Missing input file
Recognized input file extensions are: .bc .ci .i .cup .optixir

Note that .ii is not mentioned in the error message despite being accepted -- this appears to be a minor oversight in the error string.

Output Modes

cicc can produce several output formats, controlled by the combination of flags in the a13 compilation mode bitmask. The bitmask is accumulated during flag parsing in sub_9624D0:

| a13 Value | Mode | Output Format |
|---|---|---|
| 0x07 | Default (all phases) | PTX text assembly |
| 0x10 | Debug/line-info | PTX with debug metadata |
| 0x21 | -gen-lto | LTO bitcode (.lto.bc) |
| 0x23 | -lto (full LTO) | LTO bitcode + link |
| 0x26 | -link-lto | Linked LTO output |
| 0x43 | --emit-optix-ir | OptiX IR (.optixir) |
| 0x80 | -gen-opt-lto | Optimized LTO bitcode |
| 0x100 | --nvvm-64 | 64-bit NVVM mode modifier |
| 0x200 | --nvvm-32 | 32-bit NVVM mode modifier |

The default output is PTX text, written through the LLC backend's PTX printer. The output file path is specified by -o <file> (fatal if missing in multi-stage modes). When no output path is provided in simple mode, sub_900130 constructs a .ptx filename from the input.

PTX Text Output (Default)

The standard path runs all four internal phases: LNK (IR linking), OPT (NVVM optimizer), optionally OptiX IR emission, then LLC (code generation). The LLC backend writes PTX assembly text to the output file. In sub_905EE0, the output writing (Phase 4) checks the first bytes of the result for a binary magic signature (0x7F, 0xED) to detect accidentally binary output; if the mode is text mode (0) and the signature is present, it indicates an internal error. Note that a standard ELF header begins 0x7F 0x45 ('E'), so the 0xED comparison is not the plain ELF magic.

LTO Bitcode Output

When -lto or -gen-lto is active, cicc produces LLVM bitcode instead of PTX. The -gen-lto flag sets a13 = (a13 & 0x300) | 0x21 and adds -gen-lto to the LTO argument vector. The -gen-lto-and-llc variant additionally runs LLC after producing the LTO bitcode, generating both outputs. The -olto flag takes a next argument (the LTO optimization level) and combines LTO bitcode generation with LLC execution.

OptiX IR Output

The --emit-optix-ir flag sets a13 = (a13 & 0x300) | 0x43. In the flag translation tree, it also injects --emit-lifetime-intrinsics into the EDG frontend, enabling lifetime intrinsic emission that is required for the OptiX IR format. In the flag catalog (sub_9624D0), it additionally routes -do-ip-msp=0 and -do-licm=0 to the optimizer, disabling interprocedural memory space promotion and LICM for OptiX compatibility.
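The two bitmask updates quoted above share one shape: keep the 32/64-bit width bits (mask 0x300) and overwrite everything else with the new mode value. A minimal sketch, with hypothetical constant and function names:

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kWidthMask = 0x300;    // --nvvm-64 (0x100) and --nvvm-32 (0x200) bits
constexpr std::uint32_t kGenLto = 0x21;        // -gen-lto
constexpr std::uint32_t kEmitOptixIr = 0x43;   // --emit-optix-ir
constexpr std::uint32_t kLtoGenBit = 0x20;     // "LTO generation active"

// Replaces the mode portion of a13 while preserving the width bits:
// a13 = (a13 & 0x300) | mode
inline std::uint32_t set_mode(std::uint32_t a13, std::uint32_t mode) {
    return (a13 & kWidthMask) | mode;
}

// Mirrors the (a13 & 0x20) test used elsewhere on this page.
inline bool lto_generation_active(std::uint32_t a13) {
    return (a13 & kLtoGenBit) != 0;
}
```

For example, starting from the default 0x07 with --nvvm-64 set (0x107), -gen-lto gives 0x121, and lto_generation_active reports true for it; the OptiX value 0x43 does not carry the 0x20 bit.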

Split Compile

The -split-compile=N flag (or -split-compile-extended=N) routes to the optimizer as -split-compile=<N> (or -split-compile-extended=<N>). These are mutually exclusive and error if specified more than once ("split compilation defined more than once"). When -split-compile-extended is used, it also sets the flag at a1+1644 to 1. The split compile mechanism divides the compilation unit into N partitions for parallel processing.

Exit Codes

The process exit code is the return value of sub_8F9C90 (real main), stored in v8:

| Code | Meaning | Source |
|---|---|---|
| 0 | Success | Normal compilation; -irversion query |
| 1 | Argument error | Missing input file, missing output file, CLI parse failure |
| v264 | Pipeline error | Return code from sub_905EE0 / sub_1265970 / sub_905880 |

Within the pipeline, error codes from sub_905EE0 are set via *a8:

| *a8 Value | Meaning |
|---|---|
| 0 | Success (NVVM_SUCCESS) |
| -1 | File open/read error |
| 1 | NVVM_ERROR_OUT_OF_MEMORY |
| 4 | NVVM_ERROR_INVALID_INPUT |
| 5 | NVVM_ERROR_INVALID_CU (null compilation unit) |

Error messages are written to qword_4FD4BE0 (stderr stream) via sub_223E0D0. LibNVVM-originated errors carry a "libnvvm : error: " prefix; some of the strings below spell it without the space before the colon ("libnvvm: error: "). Representative errors:

  • "Error processing command line: <cmd>" (from sub_900130 failure)
  • "Missing input file" / "Missing output file"
  • "<src>: error in open <file>" (file I/O)
  • "libnvvm: error: failed to create the libnvvm compilation unit"
  • "libnvvm: error: failed to add the module to the libnvvm compilation unit"
  • "libnvvm: error: failed to get the PTX output"
  • "Invalid NVVM IR Container" (error code 259, from sub_C63EB0)
  • "Error opening '<file>': file exists!" / "Use -f command line argument to force output"
  • "Error: Failed to write time profiler data."
  • "Unparseable architecture: <val>"
  • "libnvvm : error: <flag> is an unsupported option"
  • "libnvvm : error: <flag> defined more than once" (duplicate -maxreg, etc.)

Special Behaviors

.cup Extension Gate

The .cup extension (CUDA preprocessed source) is only accepted as an input file when the preceding argument is --orig_src_path_name or --orig_src_file_name. These are metadata flags inserted by nvcc to track the original source file path for diagnostic messages. The check is:

// At 0x8FBFC4 and 0x8FBFDE:
if (strcmp(argv[i-1], "--orig_src_path_name") == 0 ||
    strcmp(argv[i-1], "--orig_src_file_name") == 0) {
    s = argv[i];  // accept .cup as input
}

This means cicc will silently ignore a .cup file that appears without a preceding metadata flag. When accepted, the .cup extension triggers --orig_src_path_name / --orig_src_file_name handling in sub_900130, which forwards the original source path to the EDG frontend for accurate error location reporting.

-Ofc Alias Handling

The -Ofc=X form is a shorthand alias for -Ofast-compile=X, handled entirely within the sub_8FE280 flag translation tree. The tree contains eight entries for fast-compile control:

| Tree Key | cicc Internal | Effect |
|---|---|---|
| -Ofast-compile=max | -Ofast-compile=max | Identity |
| -Ofc=max | -Ofast-compile=max | Alias |
| -Ofast-compile=mid | -Ofast-compile=mid | Identity |
| -Ofc=mid | -Ofast-compile=mid | Alias |
| -Ofast-compile=min | -Ofast-compile=min | Identity |
| -Ofc=min | -Ofast-compile=min | Alias |
| -Ofast-compile=0 | <null> | No-op |
| -Ofc=0 | <null> | No-op alias |

The aliasing happens at the tree level, before sub_9624D0 ever sees the flag. By the time the flag catalog processes the argument, -Ofc=max and -Ofast-compile=max are indistinguishable. See Optimization Levels for what each fast-compile tier actually does.

In sub_9624D0, -Ofast-compile is stored at offset a1+1640 as an integer:

| Level string | Integer value | Behavior |
|---|---|---|
| "0" | 1 | Disabled (then reset to 0) |
| "max" | 2 | Most optimizations skipped; forces -lsa-opt=0, -memory-space-opt=0 |
| "mid" | 3 | Medium pipeline |
| "min" | 4 | Close to full optimization |

Any other value produces: "libnvvm : error: -Ofast-compile called with unsupported level, only supports 0, min, mid, or max".

Only one -Ofast-compile is permitted per invocation. A second occurrence triggers: "libnvvm : error: -Ofast-compile specified more than once".
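The level decoding and the two error conditions can be sketched as a small state machine. The names below are hypothetical, the separate seen flag is an assumption (the page does not say how the catalog detects a repeat), and so is the order of the duplicate and unsupported-level checks.

```cpp
#include <cassert>
#include <string>

enum class FastCompileResult {
    Ok,
    UnsupportedLevel,  // "... only supports 0, min, mid, or max"
    Duplicate          // "... specified more than once"
};

struct FastCompileState {
    int level = 0;      // models the integer stored at a1+1640
    bool seen = false;  // assumption: a separate "already specified" marker
};

// Applies one -Ofast-compile=<level> occurrence to the state.
inline FastCompileResult apply_fast_compile(FastCompileState& st, const std::string& level) {
    if (st.seen) return FastCompileResult::Duplicate;

    int value = 0;
    if (level == "0") value = 1;
    else if (level == "max") value = 2;
    else if (level == "mid") value = 3;
    else if (level == "min") value = 4;
    else return FastCompileResult::UnsupportedLevel;

    st.seen = true;
    // "0" is recorded as 1 and then reset to 0, i.e. fast-compile disabled.
    st.level = (value == 1) ? 0 : value;
    return FastCompileResult::Ok;
}
```

Because the -Ofc aliases are rewritten in the tree beforehand, a helper like this only ever needs to understand the -Ofast-compile spelling.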

Discard Value Names

The -discard-value-names flag has complex interaction semantics. In the tree, it dual-maps to --discard_value_names=1 (EDG, note underscores) and -discard-value-names=1 (cicc, note hyphens). Additionally, per-phase overrides are possible via -Xopt -opt-discard-value-names=0, -Xlnk -lnk-discard-value-names=0, or -Xlto -lto-discard-value-names=0.

In CUDA mode, without explicit flags, value names are discarded by default. In OpenCL mode, the default only applies when LTO generation is active (a13 & 0x20). This reflects the fact that value names are useful for debugging but waste memory in production builds.

Wizard Mode Interaction

The -v (verbose), -keep (keep intermediates), and -dryrun flags are parsed in sub_8F9C90 but are only effective when wizard mode is active. Wizard mode is enabled when the NVVMCCWIZ environment variable is set to the value 553282, which sets byte_4F6D280 = 1. Without wizard mode, these flags are silently accepted but have no effect -- v259 (verbose) and v262 (keep) remain 0. This appears to be a deliberate anti-reverse-engineering measure.
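An environment gate of this kind is straightforward to model. The sketch below is an assumption about the shape of the check, not the decompiled logic: the function names are hypothetical, and parsing with std::strtol in base 10 (which, like atoi, tolerates trailing characters) is a guess at how the string is converted.

```cpp
#include <cassert>
#include <cstdlib>

// True when the given string parses to the wizard value 553282.
// Takes the raw string so the check can be exercised without touching the environment.
inline bool wizard_value_matches(const char* value) {
    if (value == nullptr) return false;  // variable not set
    char* end = nullptr;
    const long parsed = std::strtol(value, &end, 10);
    return end != value && parsed == 553282L;
}

// Reads NVVMCCWIZ from the process environment and applies the check above.
inline bool wizard_mode_from_env() {
    return wizard_value_matches(std::getenv("NVVMCCWIZ"));
}
```

A driver modelled this way would set its verbose and keep variables only when wizard_mode_from_env() returns true, and otherwise consume -v, -keep, and -dryrun without acting on them.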

Default Values When Flags Are Absent

When a flag is not explicitly provided, sub_9624D0 applies these defaults (checking stored-value sentinels):

| Flag | Default Value | Sentinel Offset |
|---|---|---|
| -opt= | -opt=3 (O3) | a1+400 |
| -arch=compute_ | -arch=compute_75 (Turing) | a1+560 |
| -ftz= | -ftz=0 (no flush-to-zero) | a1+592 |
| -prec-sqrt= | -prec-sqrt=1 (CUDA) / -prec-sqrt=0 (OpenCL) | a1+624 |
| -prec-div= | -prec-div=1 (precise) | a1+656 |
| -fma= | -fma=1 (enabled) | a1+688 |
| -opt-fdiv= | -opt-fdiv=0 | a1+464 |

Configuration

Four Output Vectors

sub_9624D0 builds four independent std::vector<std::string> that are serialized into char** arrays at function exit:

| Vector | Seed | Output | Pipeline Phase |
|---|---|---|---|
| v324 (LNK) | "lnk" | a5/a6 | Phase 1: IR linker |
| v327 (OPT) | "opt" | a7/a8 | Phase 2: NVVM optimizer |
| v330 (LTO) | (none) | a9/a10 | Phase 3: LTO passes |
| v333 (LLC) | "llc" | a11/a12 | Phase 4: Code generation |

Each vector element is a 32-byte std::string with SSO. At exit, elements are serialized via malloc(8 * count) for the pointer array and malloc(len+1) + memcpy for each string.
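The exit-time serialization described above (one malloc for the pointer array, then malloc plus memcpy per string) looks roughly like the following. serialize_args and free_args are hypothetical names, the cleanup path on allocation failure is an addition for safety, and sizeof(char*) is used in place of the literal 8 so the sketch is not tied to a 64-bit build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Releases an array produced by serialize_args.
inline void free_args(char** argv, std::size_t count) {
    if (argv == nullptr) return;
    for (std::size_t i = 0; i < count; ++i) std::free(argv[i]);
    std::free(argv);
}

// Copies a vector of strings into a malloc-owned char* array.
// Returns nullptr on allocation failure; for an empty vector the result
// is whatever malloc(0) returns, which may also be nullptr.
inline char** serialize_args(const std::vector<std::string>& args) {
    char** out = static_cast<char**>(std::malloc(sizeof(char*) * args.size()));
    if (out == nullptr) return nullptr;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::size_t len = args[i].size();
        char* copy = static_cast<char*>(std::malloc(len + 1));
        if (copy == nullptr) {
            free_args(out, i);  // release the strings copied so far
            return nullptr;
        }
        std::memcpy(copy, args[i].c_str(), len + 1);  // includes the terminating NUL
        out[i] = copy;
    }
    return out;
}
```

The caller needs the element count separately, which matches the paired outputs in the table above (for example a7/a8 for the OPT vector), assuming one of each pair carries the count.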

Architecture Bitmask Validation

Architecture validation in sub_9624D0 uses a 64-bit bitmask 0x60081200F821:

static const __int64 valid_archs = 0x60081200F821;  // bit i set: SM (75 + i) is accepted
offset = arch_number - 75;
if (offset > 0x2E || !_bittest64(&valid_archs, offset))
    // error: "is an unsupported option"

Valid architectures (bit positions): SM 75, 80, 86, 87, 88, 89, 90, 100, 103, 110, 120, 121. The a/f sub-variants share the base SM number for bitmask validation but receive distinct routing in sub_95EB40.
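The architecture list can be re-derived from the constant by walking its set bits, which is a quick way to confirm the decoding. The sketch below uses hypothetical function names and assumes the range comparison is unsigned, so that SM numbers below 75 wrap around and fail the offset > 0x2E check.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kArchMask = 0x60081200F821ULL;
constexpr unsigned kBaseSm = 75;
constexpr unsigned kMaxOffset = 0x2E;  // highest bit position examined (46)

// True when the SM number passes the bitmask validation.
inline bool is_supported_sm(unsigned arch_number) {
    const unsigned offset = arch_number - kBaseSm;  // wraps for arch_number < 75
    if (offset > kMaxOffset) return false;
    return ((kArchMask >> offset) & 1u) != 0;
}

// Enumerates every SM number whose bit is set in the mask.
inline std::vector<unsigned> supported_sms() {
    std::vector<unsigned> result;
    for (unsigned sm = kBaseSm; sm <= kBaseSm + kMaxOffset; ++sm) {
        if (is_supported_sm(sm)) result.push_back(sm);
    }
    return result;
}
```

Running supported_sms() reproduces the twelve base SM numbers listed above, from 75 (bit 0) through 121 (bit 46).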

Compilation Mode Flags Bitmask (a13)

The a13 parameter in sub_9624D0 is an IN/OUT bitmask tracking compilation mode:

| Bit/Mask | Source Flag | Meaning |
|---|---|---|
| 0x07 | (default) | Phase control: all phases active |
| 0x10 | -g, --generate-line-info | Debug/line-info enabled |
| 0x20 | -gen-lto, -gen-lto-and-llc | LTO generation enabled |
| 0x21 | -gen-lto | Gen-LTO mode |
| 0x23 | -lto | Full LTO mode |
| 0x26 | -link-lto | Link-LTO mode |
| 0x43 | --emit-optix-ir | OptiX IR emission mode |
| 0x80 | -gen-opt-lto | Optimized LTO lowering |
| 0x100 | --nvvm-64 | 64-bit NVVM mode |
| 0x200 | --nvvm-32 | 32-bit NVVM mode |
| 0x300 | (mask) | 64/32-bit mode bits mask |

Function Map

| Function | Address | Size | Role |
|---|---|---|---|
| sub_8F9C90 | 0x8F9C90 | 10,066 B | Real main entry point |
| sub_8FE280 | 0x8FE280 | ~35 KB | Flag translation tree builder (nvcc -> cicc) |
| sub_8FE150 | 0x8FE150 | -- | Tree lookup (lower_bound + insert) |
| sub_8FDFD0 | 0x8FDFD0 | -- | Tree insert + rebalance |
| sub_8FD0D0 | 0x8FD0D0 | -- | Architecture flag scanner (first pass) |
| sub_900130 | 0x900130 | 39 KB | CLI processing Path A (12 params) |
| sub_902D10 | 0x902D10 | ~9 KB | Path A orchestrator |
| sub_904450 | 0x904450 | -- | Push flag to argument vector |
| sub_905880 | 0x905880 | ~6 KB | EDG frontend stage |
| sub_905EE0 | 0x905EE0 | 43 KB | Path A multi-stage pipeline driver |
| sub_908220 | 0x908220 | -- | LLC output callback (ID 56993) |
| sub_908850 | 0x908850 | -- | Triple construction (nvptx64-nvidia-cuda) |
| sub_9085A0 | 0x9085A0 | -- | OPT output callback (ID 64222) |
| sub_95EB40 | 0x95EB40 | 38 KB | 3-column architecture mapping table builder |
| sub_9624D0 | 0x9624D0 | 75 KB | Flag catalog (4 output vectors, ~111 flags) |
| sub_1262860 | 0x1262860 | -- | Path B simple dispatch |
| sub_1265970 | 0x1265970 | 48 KB | Path B multi-stage pipeline driver |

Global Variables

| Address | Variable | Purpose |
|---|---|---|
| qword_4F6D2A0 | Flag tree root | std::map root for sub_8FE280 |
| dword_4F6D2A8 | Flag tree sentinel | tree.end() |
| qword_4F6D2B0 | Flag tree root node | Root node pointer |
| qword_4F6D2B8 | Flag tree begin | Leftmost node (begin iterator) |
| qword_4F6D2C8 | Init guard | Set to 1 after sub_8FE280 first call |
| byte_4F6D2D0 | Partial-link flag | Set by --partial-link |
| byte_4F6D2DC | LLP64 flag | Set by --force-llp64 |
| unk_4F06A68 | Data model width | 8 = 64-bit, 4 = 32-bit |
| unk_4D0461C | Address space 3 flag | Enables p3:32:32:32 in datalayout |
| byte_4F6D280 | Wizard mode | Set by NVVMCCWIZ=553282 |

Cross-References