Function Map

Every function in the cudafe++ binary that contains an EDG assertion site encodes three pieces of data in its assertion string: the source file path, the line number, and the enclosing function name. These strings survive in .rodata and cross-reference back to the compiled functions, providing a ground-truth mapping from binary address to EDG source file. This page catalogs that mapping for all 52 .c source files and 13 .h header files identified in the CUDA 13.0 build of cudafe++ (EDG 6.6).

The mapping was produced by extracting all string literals matching /dvs/p4/build/sw/rel/gpgpu/toolkit/r13.0/compiler/drivers/compiler/edg/EDG_6.6/src/*.c and *.h from the binary's .rodata section, then tracing their cross-references to determine which functions load each path. A function that references attribute.c in an assertion string was compiled from attribute.c. Functions that reference no source path at all (the "unmapped" pool) are either too small to contain assertions, are inlined from headers, or belong to the statically-linked C++ runtime.
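
The inversion step described above (path string to referencing functions, then function to source file) can be sketched in Python. This is a minimal illustration, not the tooling actually used: it assumes the string dump is a list of records with a `value` field and an `xrefs` list whose entries carry a `func` name, which is the shape the jq command under Data Source reads. The function name `map_functions_to_files` and the sample records are hypothetical.

```python
import re
from collections import defaultdict

# Matches the tail of the build path quoted in the text and captures the basename.
SRC_PATH = re.compile(r"/EDG_6\.6/src/([A-Za-z0-9_]+\.[ch])$")

def map_functions_to_files(strings):
    """Invert string -> xrefs into function -> set of source basenames."""
    mapping = defaultdict(set)
    for rec in strings:
        m = SRC_PATH.search(rec["value"])
        if not m:
            continue  # not an EDG source-path string
        for xref in rec.get("xrefs", []):
            mapping[xref["func"]].add(m.group(1))
    return dict(mapping)

# Hypothetical sample records, for illustration only.
PREFIX = ("/dvs/p4/build/sw/rel/gpgpu/toolkit/r13.0/compiler/"
          "drivers/compiler/edg/EDG_6.6/src/")
sample = [
    {"value": PREFIX + "attribute.c", "xrefs": [{"func": "sub_409350"}]},
    {"value": PREFIX + "types.h",
     "xrefs": [{"func": "sub_409350"}, {"func": "sub_469260"}]},
    {"value": "unrelated string", "xrefs": [{"func": "sub_500000"}]},
]
result = map_functions_to_files(sample)
```

A function that ends up with no entry in the result belongs to the "unmapped" pool.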

Coverage Summary

| Category | Functions | Percentage |
|---|---|---|
| Mapped via .c file paths | 2,129 | 32.8% |
| Mapped via .h file paths only | 80 | 1.2% |
| Total mapped | 2,209 | 34.1% |
| Unmapped in EDG region (0x403300--0x7E0000) | 2,906 | 44.8% |
| C++ runtime / demangler (0x7E0000--0x829722) | 1,085 | 16.7% |
| PLT stubs + init (0x402A18--0x403300) | 283 | 4.4% |
| Total functions in binary | 6,483 | 100% |
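
As a quick arithmetic check, the percentages above follow directly from the function counts. The short Python sketch below recomputes them; the dictionary keys are labels chosen here for illustration.

```python
# Function counts copied from the coverage table.
counts = {
    "mapped_c": 2129,
    "mapped_h_only": 80,
    "unmapped_edg": 2906,
    "runtime": 1085,
    "plt_init": 283,
}
total = sum(counts.values())

def pct(n, total):
    """Share of the total as a percentage, rounded to one decimal place."""
    return round(100.0 * n / total, 1)

shares = {name: pct(n, total) for name, n in counts.items()}
mapped_total = counts["mapped_c"] + counts["mapped_h_only"]
mapped_share = pct(mapped_total, total)
```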

The 2,906 unmapped functions in the EDG region include inlined header expansions (e.g., util.h vector/hash helpers, types.h type queries), small leaf functions below the assertion threshold, switch-table dispatch fragments, and functions from translation units compiled without assertions enabled (notably il_to_str.c display routines and parts of floating.c).

Binary Layout

The EDG .text region (0x403300--0x7E0000) has a three-part structure:

  1. Assert stub region (0x403300--0x408B40): 235 small __noreturn functions, one per assertion site. Each encodes a source file path, line number, and function name, then calls sub_4F2930 (the internal error handler). These stubs are sorted by source file name -- the linker grouped them from all 52 .c files into one contiguous block. 200 stubs map to .c files; the remaining 35 are from .h files inlined into .c compilation units.

  2. Constructor region (0x408B40--0x409350): 15 C++ static constructor functions (ctor_001 through ctor_015) that initialize global tables at program startup.

  3. Main body region (0x409350--0x7DFFF0): The bulk of the compiler. Source files are laid out roughly in alphabetical order by filename, a consequence of the linker processing object files in directory-listing order. The ordering holds from attribute.c at 0x409350 and class_decl.c at 0x419280 through types.c at 0x7A4940. The two exceptions, modules.c at 0x7C0C60 and floating.c at 0x7D0EB0, are placed after types.c.
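
The region boundaries above lend themselves to a simple address classifier. The Python sketch below is illustrative only: it treats each range as half-open, which is an assumption, and the function name `classify` is hypothetical. The boundary values come from this page.

```python
# (start, end, label); each range is treated as half-open [start, end).
REGIONS = [
    (0x402A18, 0x403300, "PLT stubs + init"),
    (0x403300, 0x408B40, "assert stubs"),
    (0x408B40, 0x409350, "static constructors"),
    (0x409350, 0x7DFFF0, "EDG main body"),
    (0x7E0000, 0x829722, "C++ runtime / demangler"),
]

def classify(addr):
    """Return the label of the region containing addr."""
    for start, end, label in REGIONS:
        if start <= addr < end:
            return label
    return "outside known regions"
```

For example, the internal error handler sub_4F2930 at 0x4F2930 falls in the main body, while every stub that calls it falls in the assert stub region.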

Source File Address Table

The table below lists all 52 .c source files sorted by their main body start address. "Total Funcs" counts all functions referencing the file (stubs + main body). "Stubs" counts assert stubs in 0x403300--0x408B40. "Main Funcs" counts functions in the main body region.

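| # | Source File | Origin | Total Funcs | Stubs | Main Funcs | Main Body Start | Main Body End | Sweep |
|---|---|---|---|---|---|---|---|---|
| 1 | attribute.c | EDG | 177 | 7 | 170 | 0x409350 | 0x418F80 | P1.01 |
| 2 | class_decl.c | EDG | 273 | 9 | 264 | 0x419280 | 0x447930 | P1.01--02 |
| 3 | cmd_line.c | EDG | 44 | 1 | 43 | 0x44B250 | 0x459630 | P1.02--03 |
| 4 | const_ints.c | EDG | 4 | 1 | 3 | 0x461C20 | 0x4659A0 | P1.03 |
| 5 | cp_gen_be.c | EDG | 226 | 25 | 201 | 0x466F90 | 0x489000 | P1.03--04 |
| 6 | debug.c | EDG | 2 | 0 | 2 | 0x48A1B0 | 0x48A1B0 | P1.04 |
| 7 | decl_inits.c | EDG | 196 | 4 | 192 | 0x48B3F0 | 0x4A1540 | P1.04--05 |
| 8 | decl_spec.c | EDG | 88 | 3 | 85 | 0x4A1BF0 | 0x4B37F0 | P1.05 |
| 9 | declarator.c | EDG | 64 | 0 | 64 | 0x4B3970 | 0x4C00A0 | P1.05 |
| 10 | decls.c | EDG | 207 | 5 | 202 | 0x4C0910 | 0x4E8C40 | P1.05--06 |
| 11 | disambig.c | EDG | 5 | 1 | 4 | 0x4E9E70 | 0x4EC690 | P1.06 |
| 12 | error.c | EDG | 51 | 1 | 50 | 0x4EDCD0 | 0x4F8F80 | P1.06 |
| 13 | expr.c | EDG | 538 | 10 | 528 | 0x4F9870 | 0x5565E0 | P1.07--08 |
| 14 | exprutil.c | EDG | 299 | 13 | 286 | 0x558720 | 0x583540 | P1.08--09 |
| 15 | extasm.c | EDG | 7 | 0 | 7 | 0x584CA0 | 0x585850 | P1.09 |
| 16 | fe_init.c | EDG | 6 | 1 | 5 | 0x585B10 | 0x5863A0 | P1.09 |
| 17 | fe_wrapup.c | EDG | 2 | 0 | 2 | 0x588D40 | 0x588F90 | P1.09 |
| 18 | float_pt.c | EDG | 79 | 0 | 79 | 0x589550 | 0x594150 | P1.09--10 |
| 19 | folding.c | EDG | 139 | 9 | 130 | 0x594B30 | 0x5A4FD0 | P1.10 |
| 20 | func_def.c | EDG | 56 | 1 | 55 | 0x5A51B0 | 0x5AAB80 | P1.10 |
| 21 | host_envir.c | EDG | 19 | 2 | 17 | 0x5AD540 | 0x5B1E70 | P1.10 |
| 22 | il.c | EDG | 358 | 16 | 342 | 0x5B28F0 | 0x5DFAD0 | P1.10--11d |
| 23 | il_alloc.c | EDG | 38 | 1 | 37 | 0x5E0600 | 0x5E8300 | P1.11a--11e |
| 24 | il_to_str.c | EDG | 83 | 1 | 82 | 0x5F7FD0 | 0x6039E0 | P1.11f--12 |
| 25 | il_walk.c | EDG | 27 | 1 | 26 | 0x603FE0 | 0x620190 | P1.12 |
| 26 | interpret.c | EDG | 216 | 5 | 211 | 0x620CE0 | 0x65DE10 | P1.12--13 |
| 27 | layout.c | EDG | 21 | 2 | 19 | 0x65EA50 | 0x665A60 | P1.13 |
| 28 | lexical.c | EDG | 140 | 5 | 135 | 0x666720 | 0x689130 | P1.13--14 |
| 29 | literals.c | EDG | 21 | 0 | 21 | 0x68ACC0 | 0x68F2B0 | P1.14 |
| 30 | lookup.c | EDG | 71 | 2 | 69 | 0x68FAB0 | 0x69BE80 | P1.14 |
| 31 | lower_name.c | EDG | 179 | 11 | 168 | 0x69C980 | 0x6AB280 | P1.14--15 |
| 32 | macro.c | EDG | 43 | 1 | 42 | 0x6AB6E0 | 0x6B5C10 | P1.15 |
| 33 | mem_manage.c | EDG | 9 | 2 | 7 | 0x6B6DD0 | 0x6BA230 | P1.15 |
| 34 | nv_transforms.c | NVIDIA | 1 | 0 | 1 | 0x6BE300 | 0x6BE300 | P1.15 |
| 35 | overload.c | EDG | 284 | 3 | 281 | 0x6BE4A0 | 0x6EF7A0 | P1.15--16 |
| 36 | pch.c | EDG | 23 | 3 | 20 | 0x6F2790 | 0x6F5DA0 | P1.16 |
| 37 | pragma.c | EDG | 28 | 0 | 28 | 0x6F61B0 | 0x6F8320 | P1.16 |
| 38 | preproc.c | EDG | 10 | 0 | 10 | 0x6F9B00 | 0x6FC940 | P1.16 |
| 39 | scope_stk.c | EDG | 186 | 6 | 180 | 0x6FE160 | 0x7106B0 | P1.16--17 |
| 40 | src_seq.c | EDG | 57 | 1 | 56 | 0x710F10 | 0x718720 | P1.17 |
| 41 | statements.c | EDG | 83 | 1 | 82 | 0x719300 | 0x726A50 | P1.17 |
| 42 | symbol_ref.c | EDG | 42 | 2 | 40 | 0x726F20 | 0x72CEA0 | P1.17 |
| 43 | symbol_tbl.c | EDG | 175 | 8 | 167 | 0x72D950 | 0x74B8D0 | P1.17--18 |
| 44 | sys_predef.c | EDG | 35 | 1 | 34 | 0x74C690 | 0x751470 | P1.18 |
| 45 | target.c | EDG | 11 | 0 | 11 | 0x7525F0 | 0x752DF0 | P1.18 |
| 46 | templates.c | EDG | 455 | 12 | 443 | 0x7530C0 | 0x794D30 | P1.18 |
| 47 | trans_copy.c | EDG | 2 | 0 | 2 | 0x796BA0 | 0x796BA0 | P1.18 |
| 48 | trans_corresp.c | EDG | 88 | 6 | 82 | 0x796E60 | 0x7A3420 | P1.18--19 |
| 49 | trans_unit.c | EDG | 10 | 0 | 10 | 0x7A3BB0 | 0x7A4690 | P1.19 |
| 50 | types.c | EDG | 88 | 5 | 83 | 0x7A4940 | 0x7C02A0 | P1.19 |
| 51 | modules.c | EDG | 22 | 3 | 19 | 0x7C0C60 | 0x7C2560 | P1.19 |
| 52 | floating.c | EDG | 50 | 9 | 41 | 0x7D0EB0 | 0x7D59B0 | P1.19 |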

Totals: 5,338 cross-references across 52 .c files, resolving to 2,129 unique functions. With .h file references added, 2,209 unique functions are mapped.
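
Because the table is sorted by main body start address, a binary search is enough to resolve an arbitrary address to its source file. The Python sketch below uses four rows copied from the table. The function name `source_file_for` is hypothetical, and treating "Main Body End" as an inclusive bound is an assumption made here, since the table does not say whether the end address is the last function's start or the first byte past it.

```python
from bisect import bisect_right

# (file, main body start, main body end): a four-row excerpt of the table.
ROWS = [
    ("error.c",    0x4EDCD0, 0x4F8F80),
    ("expr.c",     0x4F9870, 0x5565E0),
    ("exprutil.c", 0x558720, 0x583540),
    ("extasm.c",   0x584CA0, 0x585850),
]
STARTS = [start for _, start, _ in ROWS]

def source_file_for(addr):
    """Return the file whose main body range contains addr, or None for a gap."""
    i = bisect_right(STARTS, addr) - 1   # last row starting at or before addr
    if i < 0:
        return None
    name, _start, end = ROWS[i]
    return name if addr <= end else None
```

With the full 52-row table loaded, the same lookup returns None for addresses in the unmapped gaps between files.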

Largest Source Files by Function Count

| Source File | Main Body Funcs | Approximate Code Size |
|---|---|---|
| expr.c | 528 | ~373 KB (0x4F9870--0x5565E0) |
| templates.c | 443 | ~282 KB (0x7530C0--0x794D30) |
| il.c | 342 | ~185 KB (0x5B28F0--0x5DFAD0) |
| exprutil.c | 286 | ~175 KB (0x558720--0x583540) |
| overload.c | 281 | ~200 KB (0x6BE4A0--0x6EF7A0) |
| class_decl.c | 264 | ~187 KB (0x419280--0x447930) |
| interpret.c | 211 | ~241 KB (0x620CE0--0x65DE10) |
| decls.c | 202 | ~165 KB (0x4C0910--0x4E8C40) |
| cp_gen_be.c | 201 | ~141 KB (0x466F90--0x489000) |
| decl_inits.c | 192 | ~91 KB (0x48B3F0--0x4A1540) |

Header File Cross-References

Thirteen .h header files appear in assertion strings. These are headers that contain non-trivial inline functions or macros that expand to assertion-bearing code. When a function compiled from decls.c triggers an assertion whose __FILE__ is types.h, that assertion was inlined from types.h into the decls.c compilation unit.

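| # | Header File | Xrefs | Stubs | Main Funcs | Address Range | Inlined Into |
|---|---|---|---|---|---|---|
| 1 | decls.h | 1 | 0 | 1 | 0x4E08F0 | decls.c |
| 2 | float_type.h | 63 | 0 | 63 | 0x7D1C90--0x7DEB90 | floating.c |
| 3 | il.h | 5 | 2 | 3 | 0x52ABC0--0x6011F0 | expr.c, il.c, il_to_str.c |
| 4 | lexical.h | 1 | 0 | 1 | 0x68F2B0 | lexical.c / literals.c boundary |
| 5 | mem_manage.h | 4 | 0 | 4 | 0x4EDCD0 | error.c |
| 6 | modules.h | 5 | 0 | 5 | 0x7C1100--0x7C2560 | modules.c |
| 7 | nv_transforms.h | 3 | 0 | 3 | 0x432280--0x719D20 | class_decl.c, cp_gen_be.c, src_seq.c |
| 8 | overload.h | 1 | 0 | 1 | 0x6C9E40 | overload.c |
| 9 | scope_stk.h | 4 | 0 | 4 | 0x503D90--0x574DD0 | expr.c, exprutil.c |
| 10 | symbol_tbl.h | 2 | 1 | 1 | 0x7377D0 | symbol_tbl.c |
| 11 | types.h | 17 | 4 | 13 | 0x469260--0x7B05E0 | Many files (scattered type queries) |
| 12 | util.h | 124 | 10 | 114 | 0x430E10--0x7C2B10 | All major .c files |
| 13 | walk_entry.h | 51 | 0 | 51 | 0x604170--0x618660 | il_walk.c |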
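
The attribution rule behind this table and the coverage summary can be stated compactly: a function that references any .c path belongs to that .c file, even if it also references header paths, while a function that references only header paths counts as "mapped via .h file paths only". The Python sketch below is one way to express that rule. The function name `attribute` and the choice to pick the alphabetically first name when several candidates exist are assumptions made for illustration.

```python
def attribute(files):
    """Classify one function from the set of source basenames it references.

    Returns (category, owner), where category is "c", "h-only" or "unmapped".
    """
    c_files = sorted(f for f in files if f.endswith(".c"))
    h_files = sorted(f for f in files if f.endswith(".h"))
    if c_files:
        # Header paths seen alongside a .c path come from inlined code.
        return ("c", c_files[0])
    if h_files:
        return ("h-only", h_files[0])
    return ("unmapped", None)
```

Under this rule, a function that references both decls.c and types.h is attributed to decls.c, and a float_type.h-only instantiation is reported as header-only.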

Notable Header Patterns

util.h is the most widely referenced header, with 124 cross-references (114 in the main body) spanning nearly the entire EDG .text region from 0x430E10 to 0x7C2B10. It provides generic container templates (dynamic arrays, hash tables, sorted sets) used by every major subsystem. The compiler expanded these templates in each compilation unit that includes the header, creating many small util.h-attributed functions scattered across the binary.

float_type.h is concentrated in a single 52 KB block at 0x7D1C90--0x7DEB90. The block begins inside the floating.c range (0x7D0EB0--0x7D59B0) and extends well past the end of that range. It contains 63 template instantiations for IEEE 754 floating-point type operations (comparison, conversion, arithmetic) for each target floating-point width. These templates were instantiated in the floating.c compilation unit.

walk_entry.h contributes 51 functions in the tight range 0x604170--0x618660, all within the il_walk.c region. These are the per-entry-kind callback dispatch functions generated by preprocessor macros in the IL walker header.

nv_transforms.h is NVIDIA-specific. Its 3 cross-references appear in class_decl.c (sub_432280 at 0x432280), cp_gen_be.c (sub_47ECC0 at 0x47ECC0), and src_seq.c (sub_719D20 at 0x719D20). These are the integration points where NVIDIA's CUDA transform hooks are called from standard EDG code paths -- class definition processing, backend code generation, and source sequence ordering.

NVIDIA-Specific Files

nv_transforms.c

The only NVIDIA-authored .c file in the EDG source tree. Despite having only 1 mapped function via __FILE__ (sub_6BE300 at 0x6BE300), the sweep analysis of the 0x6BAE70--0x6BE4A0 range identified approximately 40 functions compiled from this file. The discrepancy exists because nv_transforms.c uses NVIDIA's own assertion macros (not EDG's standard internal_error path), so most functions do not reference the EDG-style __FILE__ string.

Functions confirmed in the nv_transforms.c region:

| Address | Identity | Purpose |
|---|---|---|
| 0x6BAE70 | nv_init_transforms | Zero all NVIDIA transform state at startup |
| 0x6BAF70 | alloc_mem_block | 64 KB memory block allocator for NV region pools |
| 0x6BB290 | reset_mem_state | Emergency OOM recovery -- clear memory tracking |
| 0x6BB350 | init_memory_regions | Bootstrap region 0 and region 1 with initial blocks |
| 0x6BB790 | emit_device_lambda_wrapper | Generate __nv_dl_wrapper_t<> specialization |
| 0x6BCC20 | emit_lambda_preamble | Inject lambda wrapper preamble declarations |
| 0x6BD490 | emit_host_device_lambda_wrapper | Generate __nv_hdl_wrapper_t<> specialization |
| 0x6BE300 | (mapped function) | Single function with EDG-style __FILE__ reference |

Key infrastructure in this file:

  • __nv_dl_wrapper_t<> / __nv_hdl_wrapper_t<> struct template generation
  • Host reference array emission (.nvHRKE, .nvHRKI, .nvHRDE, .nvHRDI, .nvHRCE, .nvHRCI)
  • Capture count bitmask tables: unk_1286980 (device) and unk_1286900 (host-device), 128 bytes each
  • Lambda-to-closure entity mapping via hash table at qword_12868F0

nv_transforms.h

NVIDIA's hook header, #include-d from three EDG source files. It declares the functions that bridge standard EDG processing to NVIDIA's CUDA transform layer. The three inclusion sites represent the three points where EDG's standard C++ frontend cedes control to NVIDIA-specific logic:

  1. class_decl.c (sub_432280 at 0x432280): Called during class definition processing to apply CUDA execution-space attributes to closure types and validate lambda capture constraints.

  2. cp_gen_be.c (sub_47ECC0 at 0x47ECC0): Called during backend code generation to emit CUDA-specific output constructs (device stubs, host reference arrays, registration calls).

  3. src_seq.c (sub_719D20 at 0x719D20): Called during source sequence processing to inject NVIDIA preamble declarations and wrapper type definitions into the correct position in the declaration order.

Unmapped Regions (Gap Analysis)

Several address ranges within the EDG .text region contain functions that could not be mapped to any source file via __FILE__ strings. The major gaps and their probable contents:

| Gap Range | Size | Probable Content | Evidence |
|---|---|---|---|
| 0x408B40--0x409350 | ~2 KB | Static constructors (ctor_001--ctor_015) | No source path; global table initializers |
| 0x447930--0x44B250 | ~13 KB | class_decl.c / cmd_line.c boundary helpers | Between confirmed ranges |
| 0x459630--0x461C20 | ~34 KB | cmd_line.c tail + const_ints.c preamble | Unmapped option handlers |
| 0x5E8300--0x5F7FD0 | ~65 KB | IL display routines (il_to_str.c early body) | No assertions (display-only code) |
| 0x665A60--0x666720 | ~3 KB | layout.c / lexical.c boundary | Small gap between confirmed ranges |
| 0x689130--0x68ACC0 | ~7 KB | lexical.c tail + literals.c preamble | Token/literal conversion helpers |
| 0x6AB280--0x6AB6E0 | ~1 KB | lower_name.c / macro.c boundary | Mangling helpers |
| 0x6BA230--0x6BAE70 | ~3 KB | mem_manage.c / nv_transforms.c boundary | Memory infrastructure |
| 0x6EF7A0--0x6F2790 | ~12 KB | overload.c / pch.c boundary | Overload resolution helpers |
| 0x6FC940--0x6FE160 | ~6 KB | preproc.c / scope_stk.c boundary | Preprocessor tail |
| 0x751470--0x7525F0 | ~4 KB | sys_predef.c / target.c boundary | Predefined macro infrastructure |
| 0x7A4690--0x7A4940 | ~1 KB | trans_unit.c / types.c boundary | Translation unit helpers |
| 0x7C2560--0x7D0EB0 | ~59 KB | Type-name mangling / encoding for output | Between modules.c and floating.c |
| 0x7D1C90--0x7DEB90 | ~52 KB | float_type.h template instantiations | Confirmed via .h path strings |
| 0x7DFFF0--0x82A000 | ~304 KB | C++ runtime, demangler, soft-float, EH | Statically-linked libstdc++/libgcc |

The largest unmapped gap within EDG code is the IL display region at 0x5E8300--0x5F7FD0 (0xFCD0 bytes, about 65 KB). These functions were compiled from il_to_str.c but contain no assertions because the display/dump subsystem was built without assertion macros -- it is purely diagnostic code that formats IL trees to stdout.

The float_type.h block at 0x7D1C90--0x7DEB90 (52 KB) is technically mapped via .h cross-references but has no .c file attribution because the template instantiations carry only the header's __FILE__ path.
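
Gap boundaries of this kind can be derived mechanically from the Source File Address Table by comparing each file's main body end with the next file's start. The Python sketch below shows the idea on three rows copied from that table. The function name `gaps` is hypothetical, and the sizes it reports are plain address differences.

```python
# (file, main body start, main body end): three consecutive table rows.
ROWS = [
    ("layout.c",   0x65EA50, 0x665A60),
    ("lexical.c",  0x666720, 0x689130),
    ("literals.c", 0x68ACC0, 0x68F2B0),
]

def gaps(rows):
    """Return (prev_file, next_file, gap_start, gap_end, size_bytes) tuples."""
    ordered = sorted(rows, key=lambda r: r[1])
    found = []
    for (prev, _, prev_end), (nxt, nxt_start, _) in zip(ordered, ordered[1:]):
        if nxt_start > prev_end:
            found.append((prev, nxt, prev_end, nxt_start, nxt_start - prev_end))
    return found

layout_gaps = gaps(ROWS)
```

The two gaps found here correspond to the layout.c / lexical.c and lexical.c / literals.c rows of the gap table.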

Alphabetical Ordering Observation

The files are laid out in the binary in rough alphabetical order, consistent with a build system that compiles source files in directory-listing order and a linker that processes the resulting object files sequentially:

0x409350  attribute.c      (a)
0x419280  class_decl.c     (c)
0x44B250  cmd_line.c       (c)
0x461C20  const_ints.c     (c)
0x466F90  cp_gen_be.c      (c)
0x48A1B0  debug.c          (d)
0x48B3F0  decl_inits.c     (d)
0x4A1BF0  decl_spec.c      (d)
0x4B3970  declarator.c     (d)
0x4C0910  decls.c          (d)
0x4E9E70  disambig.c       (d)
0x4EDCD0  error.c          (e)
0x4F9870  expr.c           (e)
0x558720  exprutil.c       (e)
0x584CA0  extasm.c         (e)
0x585B10  fe_init.c        (f)
0x588D40  fe_wrapup.c      (f)
0x589550  float_pt.c       (f)
0x594B30  folding.c        (f)
0x5A51B0  func_def.c       (f)
0x5AD540  host_envir.c     (h)
0x5B28F0  il.c             (i)
0x5E0600  il_alloc.c       (i)
0x5F7FD0  il_to_str.c      (i)
0x603FE0  il_walk.c        (i)
0x620CE0  interpret.c      (i)
0x65EA50  layout.c         (l)
0x666720  lexical.c        (l)
0x68ACC0  literals.c       (l)
0x68FAB0  lookup.c         (l)
0x69C980  lower_name.c     (l)
0x6AB6E0  macro.c          (m)
0x6B6DD0  mem_manage.c     (m)
0x6BAE70  nv_transforms.c  (n)  [region start; mapped func at 0x6BE300]
0x6BE4A0  overload.c       (o)
0x6F2790  pch.c            (p)
0x6F61B0  pragma.c         (p)
0x6F9B00  preproc.c        (p)
0x6FE160  scope_stk.c      (s)
0x710F10  src_seq.c        (s)
0x719300  statements.c     (s)
0x726F20  symbol_ref.c     (s)
0x72D950  symbol_tbl.c     (s)
0x74C690  sys_predef.c     (s)
0x7525F0  target.c         (t)
0x7530C0  templates.c      (t)
0x796BA0  trans_copy.c     (t)
0x796E60  trans_corresp.c  (t)
0x7A3BB0  trans_unit.c     (t)
0x7A4940  types.c          (t)
0x7C0C60  modules.c        (m)  [breaks alphabetical order]
0x7D0EB0  floating.c       (f)  [breaks alphabetical order]

Two files break the alphabetical pattern: modules.c at 0x7C0C60 and floating.c at 0x7D0EB0. Both appear after types.c instead of in their expected positions (modules.c between mem_manage.c and nv_transforms.c, floating.c between float_pt.c and folding.c). This suggests that these two objects are built by a separate rule, or are added to the link line after the alphabetically sorted EDG objects.
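
The ordering claim is easy to check mechanically. The Python sketch below lists the 52 files in address order (with the .c suffix dropped for brevity) and reports every name that sorts before a name already placed ahead of it. The function name `out_of_order` is hypothetical, and plain code-point string comparison is assumed to approximate the directory-listing order.

```python
# Source files in main-body address order, as listed above.
ADDRESS_ORDER = """
attribute class_decl cmd_line const_ints cp_gen_be debug decl_inits decl_spec
declarator decls disambig error expr exprutil extasm fe_init fe_wrapup float_pt
folding func_def host_envir il il_alloc il_to_str il_walk interpret layout
lexical literals lookup lower_name macro mem_manage nv_transforms overload pch
pragma preproc scope_stk src_seq statements symbol_ref symbol_tbl sys_predef
target templates trans_copy trans_corresp trans_unit types modules floating
""".split()

def out_of_order(names):
    """Return names that sort before some name placed ahead of them."""
    breaks, high = [], ""
    for name in names:
        if name < high:
            breaks.append(name)   # placed later than its sorted position
        else:
            high = name           # new running maximum
    return breaks

breaks = out_of_order(ADDRESS_ORDER)
```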

Data Source

All mappings were extracted from the binary's .rodata string table. The extraction command:

jq '[.[] | select(.value | test("/dvs/p4/.*\\.c$")) |
  {file: (.value | split("/") | last),
   xrefs: [.xrefs[].func] | length}
] | sort_by(.file)' cudafe++_strings.json

The full build path for every source file is:

/dvs/p4/build/sw/rel/gpgpu/toolkit/r13.0/compiler/drivers/compiler/edg/EDG_6.6/src/<filename>

Address ranges were verified against the 20 sweep reports (P1.01 through P1.20) produced during the binary analysis phase.