Convert invalid chars to numbers, not letters
Created by: dvkt
The way fix_identifier encodes invalid characters is causing some collisions, I think because it's only using the alphabet.
For example, - and _ both become r_U.
This patch changes it to use each invalid character's raw numeric value instead. The mangled names are nastier, but there won't be collisions.
The reason I didn't just "fix" this in master is that I wanted to ask first: do you have any other ideas for how to avoid the collisions?
Currently:
one-two => VAR_ONEr_UTWO
one_two => VAR_ONEr_UTWO
one😎two => VAR_ONEr_Fr_Yr_Rr_HTWO
With this patch:
one-two => VAR_ONEc45_TWO
one_two => VAR_ONEc95_TWO
one😎two => VAR_ONEc4294967280_c4294967199_c4294967192_c4294967182_TWO
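For illustration, here is a minimal C++ sketch of the numeric scheme that reproduces the three mappings above. It is not the code from this patch: mangle_identifier and its list of valid characters are hypothetical stand-ins for fix_identifier. The cast through signed char is an assumption made only so the emoji bytes come out as the large numbers shown above on any platform with a 32-bit unsigned int; the real patch may simply convert a plain char.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-in for fix_identifier; not the actual patch.
std::string mangle_identifier(const std::string& ident) {
    // Assumed set of characters that are safe to emit unchanged.
    const std::string valid = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    std::string out = "VAR_";
    for (char ch : ident) {
        // Uppercase first, so "one" becomes "ONE".
        char up = static_cast<char>(
            std::toupper(static_cast<unsigned char>(ch)));
        if (valid.find(up) != std::string::npos) {
            out += up;
        } else {
            // Encode the byte as "c<number>_". Going through signed char
            // makes bytes >= 0x80 negative, and converting a negative value
            // to unsigned int wraps around, which is where numbers like
            // 4294967280 come from for UTF-8 bytes.
            unsigned int code =
                static_cast<unsigned int>(static_cast<signed char>(ch));
            out += "c" + std::to_string(code) + "_";
        }
    }
    return out;
}
```

Since - is 45 and _ is 95, the two names now mangle to different identifiers, and each byte of a multi-byte UTF-8 character gets its own c<number>_ chunk.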
Here's a working example:
$ cat bug.lsc
DATA:
one-two is text
one_two is text
PROCEDURE:
store "okay" in one-two
store "not okay" in one_two
display "- " one-two crlf
display "_ " one-two crlf⏎
$ ldpl bug.lsc
LDPL: Compiling...
ldpl-temp.cpp:232:8: error: redefinition of 'VAR_ONEr_UTWO'
string VAR_ONEr_UTWO = "";
       ^
ldpl-temp.cpp:231:8: note: previous definition is here
string VAR_ONEr_UTWO = "";
       ^
1 error generated.
LDPL Error: compilation failed.
Then, with this patch:
$ ldpl bug.lsc
LDPL: Compiling...
* File(s) compiled successfully.
* Saved as bug-bin
$ ./bug-bin
- okay
_ okay