Token Kind Table

Every token produced by cudafe++'s lexer carries a 16-bit token kind stored in the global word_126DD58. There are exactly 357 token kinds, numbered 0 through 356, with names indexed from a read-only string pointer table at off_E6D240 in the .rodata segment. A parallel 357-entry byte array at byte_E6C0E0 maps each token kind to an operator-name index, used by the initialize_opname_kinds routine (sub_588BB0) to populate the operator name display table at qword_126DE00. A boolean stop-token table at qword_126DB48 + 8 (357 entries) marks which token kinds are valid synchronization points for error recovery in skip_to_token (sub_6887C0).
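To make the role of the stop-token table concrete, here is a minimal sketch of panic-mode error recovery driven by a per-kind boolean table. It is not the decompiled body of sub_6887C0: the names stop_table and skip_to_stop_token and the array-based token stream are hypothetical, and only the table size (357) and the end-of-file kind (0) are taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOKEN_KIND_COUNT 357   /* kinds 0..356, per the text above */
#define TK_EOF 0               /* kind 0 is the end-of-file sentinel */

/* Hypothetical stand-in for the 357-entry stop-token table. */
static bool stop_table[TOKEN_KIND_COUNT];

/*
 * Advance through `tokens` starting at `pos` until a token whose kind is
 * flagged in stop_table, an end-of-file token, or the end of the array
 * is reached. Returns the index at which parsing could resume.
 */
static size_t skip_to_stop_token(const uint16_t *tokens, size_t count, size_t pos)
{
    while (pos < count) {
        uint16_t kind = tokens[pos];
        assert(kind < TOKEN_KIND_COUNT);
        if (kind == TK_EOF || stop_table[kind])
            break;              /* synchronization point found */
        pos++;                  /* discard this token and keep scanning */
    }
    return pos;
}
```

With kind 6 (the semicolon) flagged as a stop token, a call starting inside a malformed expression would return the index of the next semicolon, which is the usual resynchronization point for statement-level recovery.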

Token kind assignment follows a block scheme established by the EDG 6.6 frontend: operators and punctuation occupy the lowest range, followed by alternative tokens (C++ digraphs and named operators), C89 keywords, C99/C11 extensions, MSVC keywords, core C++ keywords, compiler internals, type-trait intrinsics, and finally the newest C++23/26 and extended-type additions at the top. CUDA-specific additions from NVIDIA occupy three dedicated slots (328--330) directly after the type-trait block, plus additional entries in the extended range. This ordering reflects the historical accretion of the C and C++ standards: new keywords were generally appended at the end of the numbering rather than used to fill gaps.

Key Facts

| Property | Value |
|---|---|
| Total token kinds | 357 (indices 0--356) |
| Name table | off_E6D240 (357 string pointers in .rodata) |
| Operator-to-name map | byte_E6C0E0 (357-byte index array) |
| Operator name display table | qword_126DE00 (48 string pointers, populated by sub_588BB0) |
| Stop-token table | qword_126DB48 + 8 (357 boolean entries) |
| Current token global | word_126DD58 (WORD) |
| Keyword registration function | sub_5863A0 (keyword_init, 1,113 lines, fe_init.c) |
| Keyword entry function | sub_7463B0 (enter_keyword) |
| GNU variant registration | sub_585B10 (enter_gnu_keyword) |
| Alternative token entry | sub_749600 (registers named operator alternative) |

Token Kind Ranges

| Range | Count | Category | Description |
|---|---|---|---|
| 0 | 1 | Special | End-of-file / no-token sentinel |
| 1--31 | 31 | Operators and punctuation | Identifier and literal tokens (1--5), then core operators (`+`, `-`, `*`, etc.) and delimiters (`(`, `)`, `{`, `}`, `;`) |
| 32--51 | 20 | Operators (continued) | Compound and remaining operators (`--`, `==`, `<<`, `>>`, `+=`, `&&`, `::`) |
| 52--76 | 25 | Alternative tokens / digraphs | C++ named operators (`and`, `or`, `and_eq`), digraphs (`<%`, `%>`, `<:`, `:>`), plus `->*`, `.*`, `...`, `<=>`, `#`, and `##` |
| 77--108 | 32 | C89 keywords | All keywords from ANSI C89/ISO C90 |
| 109--131 | 23 | C99/C11 keywords | `restrict`, `_Bool`, `_Complex`, `_Imaginary`, character types |
| 132--136 | 5 | MSVC keywords | `__declspec`, `__int8`--`__int64` |
| 137--199 | 63 | C++ keywords | Core C++ keywords plus C++11/14/17/20/23 additions |
| 200--206 | 7 | Compiler internal | Preprocessor and internal token kinds |
| 207--327 | 121 | Type trait intrinsics | `__is_xxx` / `__has_xxx` compiler intrinsic keywords |
| 328--330 | 3 | NVIDIA CUDA type traits | NVIDIA-specific lambda type-trait intrinsics |
| 331--356 | 26 | Extended types / recent additions | `_Float32`--`_Float128`, C++23/26 features, scalable vector types |
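The range table can be expressed directly as data. The following sketch classifies a token kind by linear search over the ranges listed above; the struct and function names are illustrative and do not correspond to anything recovered from the binary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct token_range {
    uint16_t first, last;      /* inclusive bounds */
    const char *category;
};

/* Ranges copied from the table above. */
static const struct token_range token_ranges[] = {
    {0,   0,   "Special"},
    {1,   31,  "Operators and punctuation"},
    {32,  51,  "Operators (continued)"},
    {52,  76,  "Alternative tokens / digraphs"},
    {77,  108, "C89 keywords"},
    {109, 131, "C99/C11 keywords"},
    {132, 136, "MSVC keywords"},
    {137, 199, "C++ keywords"},
    {200, 206, "Compiler internal"},
    {207, 327, "Type trait intrinsics"},
    {328, 330, "NVIDIA CUDA type traits"},
    {331, 356, "Extended types / recent additions"},
};

#define TOKEN_RANGE_COUNT (sizeof token_ranges / sizeof token_ranges[0])

/* Returns the category name for `kind`, or NULL if kind > 356. */
static const char *token_kind_category(uint16_t kind)
{
    for (size_t i = 0; i < TOKEN_RANGE_COUNT; i++) {
        if (kind >= token_ranges[i].first && kind <= token_ranges[i].last)
            return token_ranges[i].category;
    }
    return NULL;
}

/* Sums the range sizes and checks that the ranges are contiguous. */
static unsigned token_range_total(void)
{
    unsigned total = 0;
    for (size_t i = 0; i < TOKEN_RANGE_COUNT; i++) {
        assert(i == 0 || token_ranges[i].first == token_ranges[i - 1].last + 1);
        total += (unsigned)(token_ranges[i].last - token_ranges[i].first + 1);
    }
    return total;
}
```

The total computed by token_range_total is a quick consistency check on the table: the twelve ranges tile 0 through 356 with no gaps, which matches the stated count of 357 token kinds.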

Complete Token Table

Operators and Punctuation (0--51)

These tokens are produced directly by the character-level scanner sub_679800 (scan_token). Multi-character operators are resolved by dedicated scanning functions in the 0x67ABB0--0x67BAB0 range.

| Kind | Name | C/C++ Construct | Notes |
|---|---|---|---|
| 0 | `<eof>` | End of file | Sentinel / no-token marker |
| 1 | `<identifier>` | Identifier | Any non-keyword identifier |
| 2 | `<integer literal>` | Integer constant | Decimal, hex, octal, or binary |
| 3 | `<floating literal>` | Floating-point constant | Float, double, or long double |
| 4 | `<character literal>` | Character constant | 'x', includes wide/u8/u16/u32 |
| 5 | `<string literal>` | String literal | "...", includes wide/u8/u16/u32/raw |
| 6 | `;` | Semicolon | Statement terminator |
| 7 | `(` | Left parenthesis | Grouping, function call |
| 8 | `)` | Right parenthesis | |
| 9 | `,` | Comma | Separator, comma operator |
| 10 | `=` | Assignment | a = b |
| 11 | `{` | Left brace | Block/initializer open |
| 12 | `}` | Right brace | Block/initializer close |
| 13 | `+` | Plus | Addition, unary plus |
| 14 | `-` | Minus | Subtraction, unary minus |
| 15 | `*` | Star | Multiplication, pointer dereference, pointer declarator |
| 16 | `/` | Slash | Division |
| 17 | `<` | Less-than | Comparison, template open bracket |
| 18 | `>` | Greater-than | Comparison, template close bracket |
| 19 | `&` | Ampersand | Bitwise AND, address-of, reference declarator |
| 20 | `?` | Question mark | Ternary conditional |
| 21 | `:` | Colon | Label, ternary, bit-field width |
| 22 | `~` | Tilde | Bitwise complement, destructor |
| 23 | `%` | Percent | Modulo |
| 24 | `^` | Caret | Bitwise XOR |
| 25 | `[` | Left bracket | Array subscript, attributes `[[` |
| 26 | `.` | Dot | Member access |
| 27 | `]` | Right bracket | |
| 28 | `!` | Exclamation | Logical NOT |
| 29 | `\|` | Pipe | Bitwise OR |
| 30 | `->` | Arrow | Pointer member access |
| 31 | `++` | Increment | Pre/post increment |
| 32 | `--` | Decrement | Pre/post decrement |
| 33 | `==` | Equal | Equality comparison; also bitand alt-token for `&` |
| 34 | `!=` | Not-equal | Inequality comparison |
| 35 | `<=` | Less-or-equal | Comparison |
| 36 | `>=` | Greater-or-equal | Comparison |
| 37 | `<<` | Left shift | Also compl alt-token for `~` |
| 38 | `>>` | Right shift | Also not alt-token for `!` |
| 39 | `+=` | Add-assign | Compound assignment |
| 40 | `-=` | Subtract-assign | |
| 41 | `*=` | Multiply-assign | |
| 42 | `/=` | Divide-assign | |
| 43 | `%=` | Modulo-assign | |
| 44 | `<<=` | Left-shift-assign | |
| 45 | `>>=` | Right-shift-assign | |
| 46 | `&&` | Logical AND | Also the rvalue-reference declarator |
| 47 | `\|\|` | Logical OR | |
| 48 | `^=` | XOR-assign | Also not_eq alt-token for `!=` |
| 49 | `&=` | AND-assign | |
| 50 | `\|=` | OR-assign | Also xor alt-token for `^` |
| 51 | `::` | Scope resolution | Also bitor alt-token for `\|` |

Alternative Tokens and Digraphs (52--76)

C++ alternative tokens (ISO 14882 clause 5.5) and C/C++ digraphs. These are registered during keyword_init (sub_5863A0) via sub_749600 when in C++ mode (dword_126EFB4 == 2).

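| Kind | Name | Equivalent | Notes |
|---|---|---|---|
| 52 | `and` | `&&` | Logical AND |
| 53 | `or` | `\|\|` | Logical OR |
| 54 | `->*` | `->*` | Pointer-to-member via pointer |
| 55 | `.*` | `.*` | Pointer-to-member via object |
| 56 | `...` | `...` | Ellipsis (variadic) |
| 57 | `<=>` | `<=>` | Three-way comparison (C++20) |
| 58 | `#` | `#` | Preprocessor stringification |
| 59 | `##` | `##` | Preprocessor token paste |
| 60 | `<%` | `{` | Digraph for left brace |
| 61 | `%>` | `}` | Digraph for right brace |
| 62 | `<:` | `[` | Digraph for left bracket |
| 63 | `:>` | `]` | Digraph for right bracket |
| 64 | `and_eq` | `&=` | Bitwise AND-assign |
| 65 | `xor_eq` | `^=` | Bitwise XOR-assign |
| 66 | `or_eq` | `\|=` | Bitwise OR-assign |
| 67 | `%:` | `#` | Digraph for hash |
| 68 | `%:%:` | `##` | Digraph for token paste |
| 69--76 | (reserved) | -- | Reserved for future alternative tokens |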

C89 Keywords (77--108)

Always registered unconditionally. These form the base keyword set present in every compilation mode.

| Kind | Name | C/C++ Construct |
|---|---|---|
| 77 | `auto` | Storage class (C89); type deduction (C++11) |
| 78 | `break` | Loop/switch exit |
| 79 | `case` | Switch case label |
| 80 | `char` | Character type |
| 81 | `const` | Const qualifier |
| 82 | `continue` | Loop continuation |
| 83 | `default` | Switch default label; defaulted function (C++11) |
| 84 | `do` | Do-while loop |
| 85 | `double` | Double-precision float |
| 86 | `else` | If-else branch |
| 87 | `enum` | Enumeration |
| 88 | `extern` | External linkage |
| 89 | `float` | Single-precision float |
| 90 | `for` | For loop |
| 91 | `goto` | Unconditional jump |
| 92 | `if` | Conditional |
| 93 | `int` | Integer type |
| 94 | `long` | Long integer modifier |
| 95 | `register` | Register storage hint (deprecated in C++17) |
| 96 | `return` | Function return |
| 97 | `short` | Short integer modifier |
| 98 | `signed` | Signed integer modifier |
| 99 | `sizeof` | Size query operator |
| 100 | `static` | Static storage / internal linkage |
| 101 | `struct` | Structure |
| 102 | `switch` | Multi-way branch |
| 103 | `typedef` | Type alias (C-style) |
| 104 | `union` | Union type |
| 105 | `unsigned` | Unsigned integer modifier |
| 106 | `void` | Void type |
| 107 | `volatile` | Volatile qualifier |
| 108 | `while` | While loop |

C99/C11/C23 Keywords (109--131)

Gated on the C standard version at dword_126EF68 (values: 199901 = C99, 201112 = C11, 202311 = C23).

| Kind | Name | Standard | C/C++ Construct |
|---|---|---|---|
| 109 | `inline` | C99 | Inline function hint (already C++ keyword at 154) |
| 110--118 | (reserved) | -- | -- |
| 119 | `restrict` | C99 | Pointer restrict qualifier |
| 120 | `_Bool` | C99 | Boolean type (C-style) |
| 121 | `_Complex` | C99 | Complex number type |
| 122 | `_Imaginary` | C99 | Imaginary number type |
| 123--125 | (reserved) | -- | -- |
| 126 | `char16_t` | C++11/C23 | 16-bit character type |
| 127 | `char32_t` | C++11/C23 | 32-bit character type |
| 128 | `char8_t` | C++20/C23 | UTF-8 character type |
| 129--131 | (reserved) | -- | -- |

MSVC Keywords (132--136)

Gated on dword_126EFB0 (Microsoft extensions enabled, language mode 2/MSVC).

| Kind | Name | MSVC Construct |
|---|---|---|
| 132 | `__declspec` | MSVC declaration specifier |
| 133 | `__int8` | 8-bit integer type |
| 134 | `__int16` | 16-bit integer type |
| 135 | `__int32` | 32-bit integer type |
| 136 | `__int64` | 64-bit integer type |

C++ Core Keywords (137--199)

Gated on C++ mode (dword_126EFB4 == 2). Some keywords within this range were added in C++11 through C++23 and have additional standard-version gates.

| Kind | Name | Standard | C/C++ Construct |
|---|---|---|---|
| 137 | `bool` | C++98 | Boolean type |
| 138 | `true` | C++98 | Boolean literal |
| 139 | `false` | C++98 | Boolean literal |
| 140 | `wchar_t` | C++98 | Wide character type |
| 141 | (reserved) | -- | -- |
| 142 | `__attribute` | GNU | GCC attribute syntax |
| 143 | `__builtin_types_compatible_p` | GNU | GCC type compatibility test |
| 144--149 | (reserved) | -- | -- |
| 150 | `catch` | C++98 | Exception handler |
| 151 | `class` | C++98 | Class definition |
| 152 | `delete` | C++98 | Deallocation; deleted function (C++11) |
| 153 | `friend` | C++98 | Friend declaration |
| 154 | `inline` | C++98 | Inline function/variable |
| 155 | `new` | C++98 | Allocation expression |
| 156 | `operator` | C++98 | Operator overload |
| 157 | `private` | C++98 | Access specifier |
| 158 | `protected` | C++98 | Access specifier |
| 159 | `public` | C++98 | Access specifier |
| 160 | `template` | C++98 | Template declaration |
| 161 | `this` | C++98 | Current object pointer |
| 162 | `throw` | C++98 | Throw expression |
| 163 | `try` | C++98 | Try block |
| 164 | `virtual` | C++98 | Virtual function/base |
| 165 | (reserved) | -- | -- |
| 166 | `const_cast` | C++98 | Const cast expression |
| 167 | `dynamic_cast` | C++98 | Dynamic cast expression |
| 168 | (reserved) | -- | -- |
| 169 | `export` | C++98/20 | Export declaration (original C++98, revived for modules in C++20) |
| 170 | `export` | C++20 | Module export (alternate registration slot) |
| 171--173 | (reserved) | -- | -- |
| 174 | `mutable` | C++98 | Mutable data member |
| 175 | `namespace` | C++98 | Namespace declaration |
| 176 | `reinterpret_cast` | C++98 | Reinterpret cast expression |
| 177 | `static_cast` | C++98 | Static cast expression |
| 178 | `typeid` | C++98 | Runtime type identification |
| 179 | `using` | C++98 | Using declaration/directive |
| 180--182 | (reserved) | -- | -- |
| 183 | `typename` | C++98 | Dependent type name |
| 184 | `static_assert` | C++11 | Static assertion; also `_Static_assert` in C11 |
| 185 | `decltype` | C++11 | Decltype specifier |
| 186 | `__auto_type` | GNU | GCC auto type extension |
| 187 | `__extension__` | GNU | GCC extension marker (suppress warnings) |
| 188 | (reserved) | -- | -- |
| 189 | `typeof` | C23/GNU | Type-of expression |
| 190 | `typeof_unqual` | C23 | Unqualified type-of expression |
| 191--193 | (reserved) | -- | -- |
| 194 | `thread_local` | C++11 | Thread-local storage; also `_Thread_local` in C11 |
| 195--199 | (reserved) | -- | -- |

Compiler Internal Tokens (200--206)

These tokens are used internally by the preprocessor and the token cache. They never appear in user-visible diagnostics.

| Kind | Name | Purpose |
|---|---|---|
| 200 | `<pp-number>` | Preprocessing number (not yet classified as integer or float) |
| 201 | `<header-name>` | Include file name (`<file>` or `"file"`) |
| 202 | `<newline>` | Logical newline token (preprocessor directive boundary) |
| 203 | `<whitespace>` | Whitespace token (preprocessing mode only) |
| 204 | `<placemarker>` | Token-paste placeholder (empty argument in `##`) |
| 205 | `<pragma>` | Pragma token (deferred for later processing) |
| 206 | `<end-of-directive>` | End of preprocessor directive |

Type Trait Intrinsics (207--327)

These are compiler intrinsic keywords that implement the C++ type traits (from <type_traits>) without requiring template instantiation. They are registered during keyword_init with C++ standard version gating -- earlier traits (C++11) are always available in C++ mode, while newer traits (C++20, C++23, C++26) require the corresponding standard version at dword_126EF68. Some traits are MSVC-specific (gated on dword_126EFB0) or Clang-specific (gated on qword_126EF90).

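The tables below list the entries of this range in token-kind order. Several kinds in the range are not type traits (for example nullptr, constexpr, the C11 underscore keywords, and the coroutine keywords); their descriptions say so or name the construct they implement.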

Unary Type Predicates

| Kind | Name | Standard | Tests Whether... |
|---|---|---|---|
| 207 | `__is_class` | C++11 | Type is a class (not union) |
| 208 | `__is_enum` | C++11 | Type is an enumeration |
| 209 | `__is_union` | C++11 | Type is a union |
| 210 | `__is_pod` | C++11 | Type is POD (plain old data) |
| 211 | `__is_empty` | C++11 | Type has no non-static data members |
| 212 | `__is_polymorphic` | C++11 | Type has at least one virtual function |
| 213 | `__is_abstract` | C++11 | Type has at least one pure virtual function |
| 214 | `__is_literal_type` | C++11 | Type is a literal type (deprecated C++17) |
| 215 | `__is_standard_layout` | C++11 | Type is standard-layout |
| 216 | `__is_trivial` | C++11 | Type is trivially copyable and has trivial default constructor |
| 217 | `__is_trivially_copyable` | C++11 | Type is trivially copyable |
| 218 | `__is_final` | C++14 | Class is marked final |
| 219 | `__is_aggregate` | C++17 | Type is an aggregate |
| 220 | `__has_virtual_destructor` | C++11 | Type has a virtual destructor |
| 221 | `__has_trivial_constructor` | C++11 | Type has a trivial default constructor |
| 222 | `__has_trivial_copy` | C++11 | Type has a trivial copy constructor |
| 223 | `__has_trivial_assign` | C++11 | Type has a trivial copy assignment |
| 224 | `__has_trivial_destructor` | C++11 | Type has a trivial destructor |
| 225 | `__has_nothrow_constructor` | C++11 | Default constructor is noexcept |
| 226 | `__has_nothrow_copy` | C++11 | Copy constructor is noexcept |
| 227 | `__has_nothrow_assign` | C++11 | Copy assignment is noexcept |
| 228 | `__has_trivial_move_constructor` | C++11 | Type has a trivial move constructor |
| 229 | `__has_trivial_move_assign` | C++11 | Type has a trivial move assignment |
| 230 | `__has_nothrow_move_assign` | C++11 | Move assignment is noexcept |
| 231 | `__has_unique_object_representations` | C++17 | Type has unique object representations |
| 232 | `__is_signed` | C++11 | Type is a signed arithmetic type |
| 233 | `__is_unsigned` | C++11 | Type is an unsigned arithmetic type |
| 234 | `__is_integral` | C++11 | Type is an integral type |
| 235 | `__is_floating_point` | C++11 | Type is a floating-point type |
| 236 | `__is_arithmetic` | C++11 | Type is an arithmetic type |
| 237 | `nullptr` | C++11 | Null pointer literal (not a trait; shares range) |
| 238 | `__is_fundamental` | C++11 | Type is a fundamental type |
| 239 | `__int128` | GNU | 128-bit integer type (not a trait; shares range) |
| 240 | `__is_scalar` | C++11 | Type is a scalar type |
| 241 | `__is_object` | C++11 | Type is an object type |
| 242 | `__is_compound` | C++11 | Type is a compound type |
| 243 | `__is_reference` | C++11 | Type is an lvalue or rvalue reference |
| 244 | `constexpr` | C++11 | Constexpr specifier (not a trait; shares range) |
| 245 | `consteval` | C++20 | Consteval specifier (not a trait; shares range) |
| 246 | `constinit` | C++20 | Constinit specifier (not a trait; shares range) |
| 247 | `_Alignof` | C11 | Alignment query (C11 spelling) |
| 248 | `_Alignas` | C11 | Alignment specifier (C11 spelling) |
| 249 | `__bases` | GCC | All base classes, direct and indirect (GCC extension) |
| 250 | `__direct_bases` | GCC | Direct base classes (GCC extension) |
| 251 | `__builtin_arm_ldrex` | Clang | ARM load-exclusive intrinsic |
| 252 | `__builtin_arm_ldaex` | Clang | ARM load-acquire-exclusive intrinsic |
| 253 | `__builtin_arm_addg` | Clang | ARM MTE add-tag intrinsic |
| 254 | `__builtin_arm_irg` | Clang | ARM MTE insert-random-tag intrinsic |
| 255 | `__builtin_arm_ldg` | Clang | ARM MTE load-tag intrinsic |
| 256 | `__is_member_pointer` | C++11 | Type is a pointer to member |
| 257 | `__is_member_function_pointer` | C++11 | Type is a pointer to member function |
| 258 | `__builtin_shufflevector` | Clang | Clang vector shuffle intrinsic |
| 259 | `__builtin_convertvector` | Clang | Clang vector conversion intrinsic |
| 260 | `_Noreturn` | C11 | No-return function specifier |
| 261 | `__builtin_complex` | GNU | GCC complex number construction |
| 262 | `_Generic` | C11 | Generic selection expression |
| 263 | `_Atomic` | C11 | Atomic type qualifier/specifier |
| 264 | `_Nullable` | Clang | Nullable pointer qualifier |
| 265 | `_Nonnull` | Clang | Non-null pointer qualifier |
| 266 | `_Null_unspecified` | Clang | Null-unspecified pointer qualifier |
| 267 | `co_yield` | C++20 | Coroutine yield expression |
| 268 | `co_return` | C++20 | Coroutine return statement |
| 269 | `co_await` | C++20 | Coroutine await expression |
| 270 | `__is_member_object_pointer` | C++11 | Type is a pointer to data member |
| 271 | `__builtin_addressof` | GNU | Address-of without operator overload |

EDG Internal Keywords (272--283)

These are not user-facing keywords. They are injected by the EDG frontend into synthesized declarations for built-in types, throw specifications, and vector types.

| Kind | Name | Purpose |
|---|---|---|
| 272 | `__edg_type__` | EDG internal type placeholder |
| 273 | `__edg_vector_type__` | SIMD vector type (GCC `__attribute__((vector_size))` lowering) |
| 274 | `__edg_neon_vector_type__` | ARM NEON vector type |
| 275 | `__edg_scalable_vector_type__` | ARM SVE scalable vector type |
| 276 | `__edg_neon_polyvector_type__` | ARM NEON polynomial vector type |
| 277 | `__edg_size_type__` | Placeholder for size_t before it is typedef'd |
| 278 | `__edg_ptrdiff_type__` | Placeholder for ptrdiff_t before it is typedef'd |
| 279 | `__edg_bool_type__` | Placeholder for bool / _Bool |
| 280 | `__edg_wchar_type__` | Placeholder for wchar_t |
| 281 | `__edg_throw__` | Throw specification in synthesized declarations |
| 282 | `__edg_opnd__` | Operand reference in synthesized expressions |
| 283 | (reserved) | -- |

More Type Predicates and Binary Traits (284--327)

| Kind | Name | Standard | Tests Whether... |
|---|---|---|---|
| 284 | `__is_const` | C++11 | Type is const-qualified |
| 285 | `__is_volatile` | C++11 | Type is volatile-qualified |
| 286 | `__is_void` | C++11 | Type is void |
| 287 | `__is_array` | C++11 | Type is an array |
| 288 | `__is_pointer` | C++11 | Type is a pointer |
| 289 | `__is_lvalue_reference` | C++11 | Type is an lvalue reference |
| 290 | `__is_rvalue_reference` | C++11 | Type is an rvalue reference |
| 291 | `__is_function` | C++11 | Type is a function type |
| 292 | `__is_constructible` | C++11 | Type is constructible from given args |
| 293 | `__is_nothrow_constructible` | C++11 | Construction is noexcept |
| 294 | `requires` | C++20 | Requires expression/clause |
| 295 | `concept` | C++20 | Concept definition |
| 296 | `__builtin_has_attribute` | GNU | Tests if declaration has given attribute |
| 297 | `__builtin_bit_cast` | C++20 | Bit cast intrinsic (std::bit_cast implementation) |
| 298 | `__is_assignable` | C++11 | Type is assignable from given type |
| 299 | `__is_nothrow_assignable` | C++11 | Assignment is noexcept |
| 300 | `__is_trivially_constructible` | C++11 | Construction is trivial |
| 301 | `__is_trivially_assignable` | C++11 | Assignment is trivial |
| 302 | `__is_destructible` | C++11 | Type is destructible |
| 303 | `__is_nothrow_destructible` | C++11 | Destruction is noexcept |
| 304 | `__edg_is_deducible` | EDG | EDG internal: template argument is deducible |
| 305 | `__is_trivially_destructible` | C++11 | Destruction is trivial |
| 306 | `__is_base_of` | C++11 | First type is base of second (binary trait) |
| 307 | `__is_convertible` | C++11 | First type is convertible to second (binary trait) |
| 308 | `__is_same` | C++11 | Two types are the same (binary trait) |
| 309 | `__is_trivially_copy_assignable` | C++11 | Copy assignment is trivial |
| 310 | `__is_assignable_no_precondition_check` | EDG | Assignable without precondition validation |
| 311 | `__is_same_as` | Clang | Alias for `__is_same` (Clang compatibility) |
| 312 | `__is_referenceable` | C++11 | Type can be referenced |
| 313 | `__is_bounded_array` | C++20 | Type is a bounded array |
| 314 | `__is_unbounded_array` | C++20 | Type is an unbounded array |
| 315 | `__is_scoped_enum` | C++23 | Type is a scoped enumeration |
| 316 | `__is_literal` | C++11 | Alias for `__is_literal_type` |
| 317 | `__is_complete_type` | EDG | Type is complete (not forward-declared) |
| 318 | `__is_nothrow_convertible` | C++20 | Conversion is noexcept (binary trait) |
| 319 | `__is_convertible_to` | MSVC | MSVC alias for `__is_convertible` |
| 320 | `__is_invocable` | C++17 | Callable with given arguments |
| 321 | `__is_nothrow_invocable` | C++17 | Call is noexcept |
| 322 | `__is_trivially_equality_comparable` | Clang | Bitwise equality is equivalent |
| 323 | `__is_layout_compatible` | C++20 | Types have compatible layouts |
| 324 | `__is_pointer_interconvertible_base_of` | C++20 | Pointer-interconvertible base (binary trait) |
| 325 | `__is_corresponding_member` | C++20 | Corresponding members in layout-compatible types |
| 326 | `__is_pointer_interconvertible_with_class` | C++20 | Member pointer is interconvertible with class pointer |
| 327 | `__is_trivially_relocatable` | C++26 | Type can be trivially relocated |

NVIDIA CUDA Type Traits (328--330)

Three NVIDIA-specific type-trait intrinsics occupy dedicated token kinds. These are registered during keyword_init when GPU mode is active (dword_106C2C0 != 0) and participate in the same token classification pipeline as all other type traits. They are used internally by the CUDA frontend to detect extended lambda closure types during device/host separation.

| Kind | Name | Purpose |
|---|---|---|
| 328 | `__nv_is_extended_device_lambda_closure_type` | Tests whether a type is the closure type of an extended device lambda. Used during device code generation to identify lambda closures that require special treatment (wrapper function generation, address-space conversion). |
| 329 | `__nv_is_extended_host_device_lambda_closure_type` | Tests whether a type is the closure type of an extended host-device lambda (`__host__ __device__`). These lambdas require dual code generation paths and wrapper functions for both host and device. |
| 330 | `__nv_is_extended_device_lambda_with_preserved_return_type` | Tests whether a device lambda has an explicitly specified (preserved) return type rather than a deduced one. Affects how the compiler generates the wrapper function return type. |

When extended lambdas are disabled, these traits are predefined as macros expanding to false:

// Fallback definitions in preprocessor preamble:
#define __nv_is_extended_device_lambda_closure_type(X) false
#define __nv_is_extended_host_device_lambda_closure_type(X) false
#define __nv_is_extended_device_lambda_with_preserved_return_type(X) false

Extended Types and Recent Additions (331--356)

These are the newest token kinds, added for extended floating-point types (ISO/IEC TS 18661-3) and recent C++23/26 features.

| Kind | Name | Standard | C/C++ Construct |
|---|---|---|---|
| 331 | `_Float32` | TS 18661-3 | 32-bit IEEE 754 float |
| 332 | `_Float32x` | TS 18661-3 | Extended 32-bit float |
| 333 | `_Float64` | TS 18661-3 | 64-bit IEEE 754 float |
| 334 | `_Float64x` | TS 18661-3 | Extended 64-bit float |
| 335 | `_Float128` | TS 18661-3 | 128-bit IEEE 754 float |
| 336--340 | (reserved) | -- | -- |
| 341--356 | (recent additions) | C++23/26 | Reserved for MSVC C++/CLI traits (`__is_ref_class`, `__is_value_class`, `__is_interface_class`, `__is_delegate`, `__is_sealed`, `__has_finalizer`, `__has_copy`, `__has_assign`, `__is_simple_value_class`, `__is_ref_array`, `__is_valid_winrt_type`, `__is_win_class`, `__is_win_interface`) and additional future extensions |

Token Cache

The token cache provides lookahead, backtracking, and macro-expansion replay for C++ parsing. Tokens are stored in a linked list of cache entries, each 80--112 bytes depending on payload.

Cache Entry Layout

| Offset | Size | Field | Description |
|---|---|---|---|
| +0 | 8 | next | Next entry in linked list |
| +8 | 8 | source_position | Encoded file/line/column |
| +16 | 2 | token_code | Token kind (0--356) |
| +18 | 1 | cache_entry_kind | Payload discriminator (see table below) |
| +20 | 4 | flags | Token classification flags |
| +24 | 4 | extra_flags | Additional flags |
| +32 | 8 | extra_data | Context-dependent data |
| +40.. | varies | payload | Kind-specific data (40--72 bytes) |
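The layout can be written down as a C struct and checked at compile time. This is a sketch under assumptions: the bytes at +19 and +28..+31 are not listed in the table and are modeled here as padding, the payload is sized for the largest case (72 bytes), and the next pointer is stored as a uint64_t so that the offsets do not depend on the host's pointer size. The type and field names beyond those in the table are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_PAYLOAD_MAX 72   /* largest payload size given in the table */

struct token_cache_entry {
    uint64_t next;              /* +0:  next-entry pointer (8 bytes in the 64-bit binary) */
    uint64_t source_position;   /* +8:  encoded file/line/column */
    uint16_t token_code;        /* +16: token kind, 0..356 */
    uint8_t  cache_entry_kind;  /* +18: payload discriminator */
    uint8_t  pad_19;            /* +19: not listed in the table; assumed padding */
    uint32_t flags;             /* +20 */
    uint32_t extra_flags;       /* +24 */
    uint32_t pad_28;            /* +28: not listed in the table; assumed padding */
    uint64_t extra_data;        /* +32 */
    uint8_t  payload[CACHE_PAYLOAD_MAX]; /* +40: kind-specific data */
};

/* Compile-time checks that the sketch matches the documented offsets. */
static_assert(offsetof(struct token_cache_entry, source_position) == 8,  "source_position");
static_assert(offsetof(struct token_cache_entry, token_code)       == 16, "token_code");
static_assert(offsetof(struct token_cache_entry, cache_entry_kind) == 18, "cache_entry_kind");
static_assert(offsetof(struct token_cache_entry, flags)            == 20, "flags");
static_assert(offsetof(struct token_cache_entry, extra_flags)      == 24, "extra_flags");
static_assert(offsetof(struct token_cache_entry, extra_data)       == 32, "extra_data");
static_assert(offsetof(struct token_cache_entry, payload)          == 40, "payload");
static_assert(sizeof(struct token_cache_entry) == 112, "maximum entry size");
```

The 112-byte total agrees with the upper bound of the 80--112 byte range quoted above: a 40-byte header followed by the 72-byte maximum payload.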

Cache Entry Kinds

Discriminator values 1 through 8 select the payload interpretation at offset +40. Six of them are observed in use; values 5 and 7 are reserved:

| Kind | Value | Payload Content | Size | Description |
|---|---|---|---|---|
| identifier | 1 | Name pointer + 64-byte lookup result | 72 | Identifier with pre-resolved scope/symbol lookup. The 64-byte lookup result mirrors xmmword_106C380--106C3B0. |
| macro_def | 2 | Macro definition pointer | 8 | Reference to a macro definition for re-expansion. Dispatched to sub_5BA500. |
| pragma | 3 | Pragma data | varies | Preprocessor pragma deferred for later processing |
| pp_number | 4 | Number text pointer | 8 | Preprocessing number not yet classified as integer or float |
| (reserved) | 5 | -- | -- | Not observed in use |
| string | 6 | String data + encoding byte | varies | String literal with encoding prefix information |
| (reserved) | 7 | -- | -- | Not observed in use |
| concatenated_string | 8 | Concatenated string data | varies | Wide or multi-piece concatenated string literal |
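One way to picture the discriminator is as a tagged union. The sketch below defines an enum for the observed values and a union for the fixed-size payloads only; the variable-size kinds (pragma, string, concatenated string) are left out because their layout is not given here. All identifiers are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum cache_entry_kind {
    CEK_IDENTIFIER          = 1,
    CEK_MACRO_DEF           = 2,
    CEK_PRAGMA              = 3,
    CEK_PP_NUMBER           = 4,
    /* 5 is reserved (not observed in use) */
    CEK_STRING              = 6,
    /* 7 is reserved (not observed in use) */
    CEK_CONCATENATED_STRING = 8
};

/* Payload shapes for the fixed-size kinds; field names are hypothetical. */
union cache_payload {
    struct {
        const char   *name;        /* name pointer */
        unsigned char lookup[64];  /* pre-resolved 64-byte lookup result */
    } identifier;                  /* kind 1 */
    const void *macro_def;         /* kind 2: macro definition pointer */
    const char *pp_number_text;    /* kind 4: number text pointer */
};

/* With 8-byte pointers this is the 72 bytes listed for identifier entries. */
static_assert(sizeof(((union cache_payload *)0)->identifier) == sizeof(char *) + 64,
              "identifier payload is a pointer plus 64 bytes");

/* Returns a printable name, or NULL for reserved or out-of-range values. */
static const char *cache_entry_kind_name(int kind)
{
    switch (kind) {
    case CEK_IDENTIFIER:          return "identifier";
    case CEK_MACRO_DEF:           return "macro_def";
    case CEK_PRAGMA:              return "pragma";
    case CEK_PP_NUMBER:           return "pp_number";
    case CEK_STRING:              return "string";
    case CEK_CONCATENATED_STRING: return "concatenated_string";
    default:                      return NULL;
    }
}
```

Returning NULL for 5 and 7 keeps the reserved values distinguishable from the six kinds that are actually dispatched.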

Cache Management Globals

| Address | Name | Description |
|---|---|---|
| qword_1270150 | cached_token_rescan_list | Head of list of tokens to re-scan (pushed back for lookahead) |
| qword_1270128 | reusable_cache_stack | Stack of reusable cache entry blocks |
| qword_1270148 | free_token_list | Free list for recycling cache entries |
| qword_1270140 | macro_definition_chain | Active macro definition chain |
| qword_1270118 | cache_entry_free_list | Free list for allocate_token_cache_entry |
| dword_126DB74 | has_cached_tokens | Boolean: nonzero when cache is non-empty |

Cache Operations

| Address | Identity | Lines | Description |
|---|---|---|---|
| sub_669650 | copy_tokens_from_cache | 385 | Copies cached preprocessor tokens for macro re-expansion (assert at lexical.c:3417) |
| sub_669D00 | allocate_token_cache_entry | 119 | Allocates from free list at qword_1270118, initializes fields |
| sub_669EB0 | create_cached_token_node | 83 | Creates and initializes cache node with source position |
| sub_66A000 | append_to_token_cache | 88 | Appends token to cache list, maintains tail pointer |
| sub_66A140 | push_token_to_rescan_list | 46 | Pushes token onto rescan stack at qword_1270150 |
| sub_66A2C0 | free_single_cache_entry | 18 | Returns cache entry to free list |
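The allocate, free, and rescan operations follow a familiar free-list pattern. The sketch below is a simplified illustration of that pattern, not a reconstruction of the listed functions: the entry struct is reduced to two fields, the fallback to malloc is an assumption about what happens when the free list is empty, and all names are hypothetical stand-ins for the globals in the tables above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct cache_entry {
    struct cache_entry *next;
    unsigned short token_code;
};

static struct cache_entry *entry_free_list;  /* stand-in for qword_1270118 */
static struct cache_entry *rescan_list;      /* stand-in for qword_1270150 */

/* Take an entry from the free list if possible, otherwise allocate one. */
static struct cache_entry *alloc_entry(void)
{
    struct cache_entry *e = entry_free_list;
    if (e != NULL) {
        entry_free_list = e->next;           /* reuse a recycled entry */
    } else {
        e = malloc(sizeof *e);               /* assumed fallback */
        if (e == NULL)
            abort();
    }
    memset(e, 0, sizeof *e);                 /* initialize all fields */
    return e;
}

/* Return an entry to the free list for later reuse. */
static void free_entry(struct cache_entry *e)
{
    assert(e != NULL);
    e->next = entry_free_list;
    entry_free_list = e;
}

/* Push a token back so that it is the next one re-scanned (LIFO). */
static void push_rescan(struct cache_entry *e)
{
    assert(e != NULL);
    e->next = rescan_list;
    rescan_list = e;
}

/* Pop the most recently pushed-back token, or NULL if none is pending. */
static struct cache_entry *pop_rescan(void)
{
    struct cache_entry *e = rescan_list;
    if (e != NULL) {
        rescan_list = e->next;
        e->next = NULL;
    }
    return e;
}
```

Recycling entries through a free list avoids a heap allocation per token during backtracking-heavy parses, which is presumably why the frontend keeps several such lists.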

Keyword Registration

All keywords are registered during frontend initialization by sub_5863A0 (keyword_init / fe_translation_unit_init, 1,113 lines, in fe_init.c). The function calls sub_7463B0 (enter_keyword) for each keyword, passing the numeric token kind and the keyword string. GNU double-underscore variants (e.g., __asm and __asm__ for asm) are registered via sub_585B10 (enter_gnu_keyword), which automatically generates both __name and __name__ forms from a single root. Alternative tokens are registered via sub_749600.

Version Gating Architecture

Registration is conditional on a set of global configuration flags established during CLI processing:

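| Address | Name | Controls | Values |
|---|---|---|---|
| dword_126EFB4 | language_mode | C vs C++ dialect | 1 = C (GNU default), 2 = C++ |
| dword_126EF68 | cpp_standard_version | Standard version level | 199711 (C++98), 201103 (C++11), 201402 (C++14), 201703 (C++17), 202002 (C++20), 202302 (C++23); the C keyword gates read the same global as 199901 (C99), 201112 (C11), 202311 (C23) |
| dword_126EFAC | c_language_mode | C mode flag | Boolean |
| dword_126EFB0 | microsoft_extensions | MSVC keywords | Boolean |
| dword_126EFA8 | gnu_extensions | GCC keywords | Boolean |
| dword_126EFA4 | clang_extensions | Clang keywords | Boolean |
| qword_126EF98 | gnu_version | GCC version threshold | Encoded: e.g., 0x9FC3 (40899 decimal, which reads as 4.8.99 if the encoding is `major*10000 + minor*100 + patch`) |
| qword_126EF90 | clang_version | Clang version threshold | Encoded: e.g., 0x15F8F (89999), 0x1D4BF (119999) |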
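The hexadecimal thresholds are easier to read in decimal. The sketch below decodes them under the assumption that versions are packed as major*10000 + minor*100 + patch; that scheme is inferred from the constants themselves (0x9FC3 is 40899, 0x15F8F is 89999, 0x1D4BF is 119999) and is not confirmed by the binary. The helper names and the strict greater-than comparison are likewise assumptions.

```c
#include <assert.h>

struct compiler_version {
    unsigned major, minor, patch;
};

/* The constants quoted above, checked in decimal at compile time. */
static_assert(0x9FC3  == 40899,  "GCC threshold");
static_assert(0x15F8F == 89999,  "first Clang threshold");
static_assert(0x1D4BF == 119999, "second Clang threshold");

/* Split an encoded version, assuming major*10000 + minor*100 + patch. */
static struct compiler_version decode_version(unsigned long encoded)
{
    struct compiler_version v;
    v.major = (unsigned)(encoded / 10000);
    v.minor = (unsigned)(encoded / 100 % 100);
    v.patch = (unsigned)(encoded % 100);
    return v;
}

/* Hypothetical gate: true when the emulated compiler is newer than the threshold. */
static int version_exceeds(unsigned long configured, unsigned long threshold)
{
    return configured > threshold;
}
```

Under this reading, a threshold of 40899 combined with a strict comparison would enable a feature for an emulated GCC 4.9.0 (40900) and later, and the two Clang constants would mark the boundaries before Clang 9 and Clang 12.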

Registration Pattern

The pseudocode below shows the version-gated registration pattern reconstructed from sub_5863A0:

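// Declarations this sketch relies on; prototypes are inferred from the call sites below.
void enter_keyword(int token_kind, const char *name);                // sub_7463B0
void enter_gnu_keyword(int token_kind, const char *name);            // sub_585B10
void enter_alt_token(int token_kind, const char *name, int length);  // sub_749600
extern int language_mode;         // dword_126EFB4
extern int standard_version;      // dword_126EF68 (C or C++ level, depending on mode)
extern int gnu_extensions;        // dword_126EFA8
extern int microsoft_extensions;  // dword_126EFB0
extern int gpu_mode;              // dword_106C2C0
#define c_standard_version   standard_version
#define cpp_standard_version standard_version

void keyword_init(void) {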
    // C89 keywords -- always registered
    enter_keyword(77, "auto");
    enter_keyword(78, "break");
    enter_keyword(79, "case");
    // ... all C89 keywords ...
    enter_keyword(108, "while");

    // C99 keywords -- gated on C99+ standard
    if (c_standard_version >= 199901) {
        enter_keyword(119, "restrict");
        enter_keyword(120, "_Bool");
        enter_keyword(121, "_Complex");
        enter_keyword(122, "_Imaginary");
    }

    // C11 keywords
    if (c_standard_version >= 201112) {
        enter_keyword(184, "_Static_assert");
        enter_keyword(247, "_Alignof");
        enter_keyword(248, "_Alignas");
        enter_keyword(260, "_Noreturn");
        enter_keyword(262, "_Generic");
        enter_keyword(263, "_Atomic");
        enter_keyword(194, "_Thread_local");
    }

    // C++ mode keywords
    if (language_mode == 2) {  // C++ mode
        enter_keyword(137, "bool");
        enter_keyword(138, "true");
        enter_keyword(139, "false");
        enter_keyword(140, "wchar_t");
        enter_keyword(150, "catch");
        enter_keyword(151, "class");
        // ... all C++ core keywords ...
        enter_keyword(183, "typename");

        // Alternative tokens (C++ only)
        enter_alt_token(52, "and", /*len*/3);
        enter_alt_token(53, "or", 2);
        enter_alt_token(64, "and_eq", 6);
        // ... all alternative tokens ...

        // C++11 keywords
        if (cpp_standard_version >= 201103) {
            enter_keyword(244, "constexpr");
            enter_keyword(185, "decltype");
            enter_keyword(237, "nullptr");
            enter_keyword(126, "char16_t");
            enter_keyword(127, "char32_t");
            enter_keyword(184, "static_assert");
            enter_keyword(194, "thread_local");
        }

        // C++20 keywords
        if (cpp_standard_version >= 202002) {
            enter_keyword(245, "consteval");
            enter_keyword(246, "constinit");
            enter_keyword(267, "co_yield");
            enter_keyword(268, "co_return");
            enter_keyword(269, "co_await");
            enter_keyword(294, "requires");
            enter_keyword(295, "concept");
        }
    }

    // GNU extensions -- gated on gnu_extensions flag
    if (gnu_extensions) {
        enter_gnu_keyword(187, "__extension__");
        enter_gnu_keyword(186, "__auto_type");
        enter_gnu_keyword(142, "__attribute");
        enter_keyword(117, "__builtin_offsetof");
        enter_keyword(143, "__builtin_types_compatible_p");
        enter_keyword(239, "__int128");
        // ... all GNU extensions ...
    }

    // MSVC extensions
    if (microsoft_extensions) {
        enter_keyword(132, "__declspec");
        enter_keyword(133, "__int8");
        enter_keyword(134, "__int16");
        enter_keyword(135, "__int32");
        enter_keyword(136, "__int64");
    }

    // Type traits (C++11+, ~60 traits)
    if (language_mode == 2) {
        enter_keyword(207, "__is_class");
        enter_keyword(208, "__is_enum");
        // ... all type traits through 327 ...
    }

    // CUDA type traits (GPU mode)
    if (gpu_mode) {
        enter_keyword(328, "__nv_is_extended_device_lambda_closure_type");
        enter_keyword(329, "__nv_is_extended_host_device_lambda_closure_type");
        enter_keyword(330, "__nv_is_extended_device_lambda_with_preserved_return_type");
    }

    // Extended float types (GNU)
    if (gnu_extensions) {
        enter_keyword(331, "_Float32");
        enter_keyword(332, "_Float32x");
        enter_keyword(333, "_Float64");
        enter_keyword(334, "_Float64x");
        enter_keyword(335, "_Float128");
    }

    // Post-keyword init: scope setup, builtin registration
    // ...
}

GNU Double-Underscore Registration

sub_585B10 (enter_gnu_keyword, assert at fe_init.c:698) implements the pattern where a single keyword name is registered in two or three forms:

  • If name starts with _: registers name as-is and __name__ (e.g., _Bool stays, plus ___Bool__ if applicable)
  • Otherwise: registers __name and __name__ (e.g., asm produces __asm and __asm__)

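The function builds each spelling in a stack buffer of at most 49 bytes; the check is name length + 5 <= 0x31, which covers the two leading underscores, the two trailing underscores, and the null terminator. It writes the leading __ as the 16-bit constant 0x5F5F, copies the name, and appends __ with a null terminator. Both variants are passed to sub_7463B0 (enter_keyword) with the same token kind.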
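The buffer arithmetic can be illustrated with a short sketch. It covers only the second case from the list above (a root that does not start with an underscore), and it replaces enter_keyword with a hypothetical recording stub so the generated spellings can be inspected. The function and variable names are illustrative, not recovered symbols.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define GNU_KW_BUF 49   /* 0x31: the stack buffer limit described above */
#define MAX_RECORDED 4

/* Hypothetical stand-in for enter_keyword: records what was registered. */
static char recorded[MAX_RECORDED][GNU_KW_BUF];
static int  recorded_kind[MAX_RECORDED];
static int  recorded_count;

static void record_keyword(int kind, const char *spelling)
{
    assert(recorded_count < MAX_RECORDED);
    snprintf(recorded[recorded_count], GNU_KW_BUF, "%s", spelling);
    recorded_kind[recorded_count] = kind;
    recorded_count++;
}

/* Register __name and __name__ for a root that has no leading underscore. */
static void enter_gnu_variants(int kind, const char *name)
{
    char buf[GNU_KW_BUF];
    size_t len = strlen(name);

    /* 2 leading + len + 2 trailing + 1 terminator must fit in the buffer. */
    assert(len + 5 <= GNU_KW_BUF);

    buf[0] = '_';
    buf[1] = '_';
    memcpy(buf + 2, name, len);
    buf[2 + len] = '\0';
    record_keyword(kind, buf);          /* __name */

    buf[2 + len] = '_';
    buf[3 + len] = '_';
    buf[4 + len] = '\0';
    record_keyword(kind, buf);          /* __name__ */
}
```

Calling enter_gnu_variants with the root asm would record __asm followed by __asm__, both under the same token kind, which matches the example given for this function.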

Operator Name Table

The operator name display table at qword_126DE00 maps operator kinds to printable names for diagnostics and error messages. It is populated by sub_588BB0 (initialize_opname_kinds) during fe_wrapup.c initialization.

The initialization loop iterates all 357 entries of byte_E6C0E0 (operator-to-name index), mapping each non-zero entry to the corresponding string from off_E6D240 (the token name table). Two special cases are hardcoded:

| Operator Kind | Display Name | Special Case |
|---|---|---|
| 42 | `()` | Function call operator (overridden from default) |
| 43 | `[]` | Array subscript operator (overridden from default) |

Additionally, the array positions for new[] and delete[] are hardcoded separately, since these operator names do not correspond to single tokens.

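The routine validates that every entry from qword_126DE08 up to qword_126DF80 is non-null, and panics with "initialize_opname_kinds: bad init of opname_names" if any gap is found. With 8-byte pointers, the 48-slot table based at qword_126DE00 ends just before qword_126DF80 (which the Token State Globals table lists as token_extra_data), so the check appears to cover slots 1 through 47 and to leave slot 0 unchecked.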
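The populate-then-validate sequence described in this section can be sketched as follows. This is an illustration under assumptions rather than the decompiled sub_588BB0: the tables are passed as parameters, a failed check returns false instead of panicking, the new[] and delete[] slots are omitted because their indices are not given here, and the validation range (slots 1 through 47) is an interpretation of the address range quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOKEN_KIND_COUNT 357   /* entries in the kind-to-operator map */
#define OPNAME_COUNT      48   /* slots in the operator name table */

/*
 * Fill `opnames` from the token name table using the kind-to-operator
 * index map, apply the two hardcoded overrides, then check for gaps.
 * Returns true if every checked slot ended up non-null.
 */
static bool init_opnames(const unsigned char kind_to_op[TOKEN_KIND_COUNT],
                         const char *const token_names[TOKEN_KIND_COUNT],
                         const char *opnames[OPNAME_COUNT])
{
    for (int kind = 0; kind < TOKEN_KIND_COUNT; kind++) {
        unsigned op = kind_to_op[kind];
        if (op != 0) {                  /* zero means "not an operator" */
            assert(op < OPNAME_COUNT);
            opnames[op] = token_names[kind];
        }
    }

    /* Hardcoded special cases from the table above. */
    opnames[42] = "()";
    opnames[43] = "[]";

    /* Validation pass: slot 0 is assumed to be exempt. */
    for (int op = 1; op < OPNAME_COUNT; op++) {
        if (opnames[op] == NULL)
            return false;               /* the real routine panics here */
    }
    return true;
}
```

Because the overrides are applied after the mapping loop, slots 42 and 43 never fail validation even if no token kind maps to them.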

Token State Globals

When a token is produced by the lexer, the following globals are populated:

| Address | Name | Type | Description |
|---|---|---|---|
| word_126DD58 | current_token_code | WORD | 16-bit token kind (0--356) |
| qword_126DD38 | current_source_position | QWORD | Encoded file/line/column |
| qword_126DD48 | token_text_ptr | QWORD | Pointer to identifier/literal text |
| src | token_start_position | char* | Start of token in input buffer |
| n | token_text_length | size_t | Length of token text |
| dword_126DF90 | token_flags_1 | DWORD | Classification flags |
| dword_126DF8C | token_flags_2 | DWORD | Additional flags |
| qword_126DF80 | token_extra_data | QWORD | Context-dependent payload |
| xmmword_106C380--106C3B0 | identifier_lookup_result | 4 x 128-bit | SSE-packed lookup result (64 bytes, 4 XMM registers) |

Cross-References