String strip
Calling this function on strings that contain the "ê" character breaks other string functions like string.len, so they no longer return anything.
Lua Code:
Anyone know a hack to make this work properly? |
You could use a slightly less restrictive pattern that matches everything after the first dash:
lua Code:
You could also use the WoW-specific strsplit function: lua Code:
Note the extra set of parentheses to discard all return values except the first. |
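For anyone reading without the WoW client at hand, here is a minimal sketch of the first suggestion in plain Lua. The pattern "%-.*" and the variable names are illustrative assumptions, not necessarily the exact pattern posted above. It also shows the same parentheses trick applied to gsub, which returns the new string plus a replacement count.

```lua
-- ê is written as explicit bytes so the file encoding does not matter.
local full = "PLAYERNAME-Aggra(Portugu\195\170s)"

-- "%-.*" matches the first dash and every byte after it,
-- so multi-byte characters in the realm part are never inspected.
-- The outer parentheses drop gsub's second return value (the match count).
local name = (full:gsub("%-.*", ""))

print(name, #name)  -- prints PLAYERNAME and 10, separated by a tab
```

Because the pattern never looks at individual letters after the dash, it avoids the question of which bytes count as letters.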
@Choonster: There's nothing wrong with the gsub part. The problem is that Lua's string.len is not UTF8-aware. Blizzard does provide a strlenutf8 function that is UTF8-aware, however.
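To show the difference between byte length and character length outside the game client, here is a rough plain-Lua sketch. The helper utf8_len is hypothetical (it is not Blizzard's strlenutf8) and assumes the input is valid UTF-8; it simply counts every byte that is not a continuation byte.

```lua
-- Hypothetical helper: count UTF-8 characters by counting lead bytes.
-- Continuation bytes are always in the range \128-\191.
local function utf8_len(s)
  local _, count = s:gsub("[^\128-\191]", "")
  return count
end

local name = "Portugu\195\170s"  -- "Português", with ê as bytes 195 170

print(#name)           -- 10 (bytes)
print(utf8_len(name))  -- 9  (characters)
```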
But the given example code should not work anyway, unless you've defined x somewhere else: Code:
print(string.gsub("PLAYERNAME-Aggra(Português)", "%-[%a+()]+", ""), string.len(x)) |
Quote:
Code:
PLAYERNAMEês)
%a does not account for ê, only e or E (and other letters of course). |
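A quick way to check this in plain Lua, assuming the default "C" locale (where %a only matches ASCII letters) and writing ê as its two UTF-8 bytes:

```lua
-- %a is tested against one byte at a time.
print(("e"):match("%a"))         -- e
print(("\195\170"):match("%a"))  -- nil: neither byte of ê counts as a letter

local full = "PLAYERNAME-Aggra(Portugu\195\170s)"
-- The match stops at the first byte of ê, so "ês)" is left behind.
local stripped = full:gsub("%-[%a+()]+", "")
print(stripped == "PLAYERNAME\195\170s)")  -- true
```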
Quote:
Lua Code:
I think the string library should always be UTF8-ready. The Blizzard strxxx ones are not, though. |
Quote:
"PetName-ServerName <OwnerName-ServerName>"
I haven't tried the built-in strsplit, but I don't think it would make a difference; the default Blizzard functions usually handle UTF-8 even worse. |
I managed to make it work with a pattern like this:
Lua Code:
Returns: "PLAYERNAME", 10
The weird part is that this function strips properly too, but it breaks string.len:
Lua Code:
Returns: "PLAYERNAME"
I only used "é" in the pattern to properly handle some French server names like "Chants éternels". |
Quote:
é and ê are both two-byte characters that share their first byte. When you use é in the pattern, you're actually using \195\169. Since Lua's string functions operate on bytes rather than UTF-8 characters, the first byte (195) matches the first byte of ê (\195\170) and leaves behind the second byte (170), which is invalid by itself. When WoW's print function encounters this invalid byte, it simply ignores anything after it. This snippet escapes any bytes > 127 (the end of the ASCII-compatible section of UTF-8): lua Code:
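The byte-level behaviour described above can be reproduced in plain Lua. This is only a sketch: escape_high_bytes is a hypothetical helper that shows each byte above 127 as a decimal escape, and the pattern is an assumed reconstruction of a character class containing "é", not the exact code from the earlier post.

```lua
local e_acute = "\195\169"  -- é
local e_circ  = "\195\170"  -- ê

-- Hypothetical helper: render every byte above 127 as \ddd.
local function escape_high_bytes(s)
  return (s:gsub("[\128-\255]", function(c)
    return "\\" .. c:byte()
  end))
end

local name = "PLAYERNAME-Aggra(Portugu" .. e_circ .. "s)"

-- A class containing é is really a class containing bytes 195 and 169,
-- so it consumes the first byte of ê and stops at the second.
local stripped = name:gsub("%-[%a()" .. e_acute .. "]+", "")

print(escape_high_bytes(e_acute))   -- \195\169
print(escape_high_bytes(e_circ))    -- \195\170
print(escape_high_bytes(stripped))  -- PLAYERNAME\170s)
```

The stray \170 in the last line is the orphaned second byte of ê.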
|
Why not "%-[^ ]+"?
|
Quote:
It's just a bad method to escape similar characters with the same starting byte, at least for this case. |
Quote:
Code:
PlayerName-ServerName |
Lua Code:
|
Quote:
Code:
local name = "PêtName-Aggra'mar(Português) <Ownêrname-Aggra'mar(Português)>" |
Quote:
PetName <OwnerName>
There are 4 special server names that are the most likely to interfere:
-Azjol-Nerub
-Aggra(Português)
-Blade's Edge
-Marécage de Zangar |
Quote:
Code:
gsub(name, "%-[^ >]+", "") |
Code:
gsub(name,"%-[^ >]+","") |
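Here is a small plain-Lua sketch of how that pattern behaves on the example names from this thread. The wrapper strip_realms is a hypothetical name, ê is written as bytes 195 170, and the second case assumes the realm name is displayed with its space intact, which may not hold for every API.

```lua
-- Hypothetical wrapper around the pattern from the post above:
-- remove a dash plus everything up to the next space or ">".
local function strip_realms(name)
  return (name:gsub("%-[^ >]+", ""))
end

local pet = "P\195\170tName-Aggra'mar(Portugu\195\170s) <Own\195\170rname-Aggra'mar(Portugu\195\170s)>"
print(strip_realms(pet))
-- PêtName <Ownêrname>

-- A hyphen inside the realm name is consumed along with the rest.
print(strip_realms("PetName-Azjol-Nerub <OwnerName-Azjol-Nerub>"))
-- PetName <OwnerName>

-- A space inside the realm name ends the match early.
print(strip_realms("PetName-Blade's Edge <OwnerName-Blade's Edge>"))
-- PetName Edge <OwnerName Edge>
```

The negated class only excludes ASCII bytes (space and ">"), so multi-byte characters are swallowed whole instead of being split.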