five/compiler/gengo at 4ce927a8de2a7c624acb108c2b95a1fedbb83017 - five

CharlesKWON d5e15272d2 feat(charset): UTF-8 default string semantics with selectable charset

Five strings now operate in Unicode rune units by default. Core string
functions (LEN/CHR/ASC/SUBSTR/LEFT/RIGHT/AT/PADR/PADL) are charset-aware:
UTF-8 rune semantics by default, byte/charset semantics when a legacy
charset (CP949, CP1252, ...) is selected. Initial charset is settable via
FIVE_CHARSET / HB_CODEPAGE env vars; default UTF8.

- hbrtl/charset.go: charset state + Str* helpers + DecodeToUTF8/EncodeFromUTF8
  + RTL HB_GETCHARSET/HB_SETCHARSET/HB_CDPSELECT/HB_TRANSLATE (x/text htmlindex)
- compiler/gengo: inlined string intrinsics now call charset-aware hbrtl.Str*
  helpers instead of byte-based Go (they previously bypassed the RTL registry)
- compiler/analyzer: register HB_GETCHARSET/HB_SETCHARSET/HB_TRANSLATE as known
- hbrtl/regex.go: add HB_REGEX (array-of-submatches)
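
As a rough illustration of the "array-of-submatches" shape mentioned for HB_REGEX, the sketch below uses Go's standard `regexp` package. `hbRegex` is a hypothetical name, not the function in hbrtl/regex.go, and returning `nil` for an invalid pattern is an assumption made for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// hbRegex is a hypothetical sketch, not the actual hbrtl implementation.
// It returns the whole match followed by each capture group,
// or nil when the pattern is invalid or nothing matches.
func hbRegex(pattern, s string) []string {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil // assumption: invalid patterns yield an empty result
	}
	return re.FindStringSubmatch(s)
}

func main() {
	m := hbRegex(`(\d+)-(\d+)`, "tel 555-1234")
	fmt.Println(len(m), m)
}
```

Element 0 holds the full match and the remaining elements hold the capture groups in order, so callers can index submatches directly.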

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-15 12:42:33 +09:00

emit_block.go

fix(pp,parser,gengo): pre-release blocker round (Wave 1)

2026-05-01 07:45:20 +09:00

emit_stmt.go

docs(gengo): explain why _v needs block scope in array compound-assign

2026-05-27 09:16:28 +09:00

folding.go

fix(pp,parser,gengo): pre-release blocker round (Wave 1)

2026-05-01 07:45:20 +09:00

gen_class.go

fix(compiler,hbrt,hbrdd,cli): pre-1.0 audit — 13 critical fixes