Convert LATIN LETTER L WITH STROKE to LATIN L

Because the Polish letter "ł" is a separate character rather than an accented letter, it has no canonical decomposition and `normalize("NFD")` leaves it unchanged, so we need another approach: replace it explicitly.
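
To illustrate the difference, here is a minimal sketch comparing how NFD treats "ż" and "ł". The `codePoints` helper is hypothetical and only used for printing; it is not part of this commit.

```javascript
// "ż" (U+017C) decomposes canonically into "z" (U+007A) + COMBINING DOT ABOVE (U+0307).
const decomposedZ = "\u017c".normalize("NFD");
// "ł" (U+0142) has no canonical decomposition, so NFD returns it unchanged.
const decomposedL = "\u0142".normalize("NFD");

// Hypothetical helper: list the code points of a string as 4-digit hex values.
const codePoints = (s) =>
    Array.from(s, (c) => c.codePointAt(0).toString(16).padStart(4, "0"));

console.log(codePoints(decomposedZ)); // [ '007a', '0307' ]
console.log(codePoints(decomposedL)); // [ '0142' ]
```

Stripping the combining range U+0300 to U+036F therefore removes the dot from "ż" but cannot touch "ł", which is why the explicit replacement is needed.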

See also:
 - https://www.fileformat.info/info/unicode/char/0142/index.htm
 - https://www.fileformat.info/info/unicode/char/0141/index.htm
wszostak 2021-10-05 14:05:28 +02:00 committed by Wojciech Szostak
parent ae1b12c120
commit 825dd69100
2 changed files with 14 additions and 1 deletions


@@ -33,7 +33,9 @@ class RemoveDiacritics extends Operation {
      */
     run(input, args) {
         // reference: https://stackoverflow.com/questions/990904/remove-accents-diacritics-in-a-string-in-javascript/37511463
-        return input.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
+        return input.normalize("NFD")
+            .replace(/\u0142/g, "l").replace(/\u0141/g, "L")
+            .replace(/[\u0300-\u036f]/g, "");
     }
 }


@@ -80,4 +80,15 @@ TestRegister.addTests([
             },
         ],
     },
+    {
+        name: "Remove Diacritics: polish letter ł",
+        input: "zażółć gęślą jaźń ZAŻÓŁĆ GĘŚLĄ JAŹŃ",
+        expectedOutput: "zazolc gesla jazn ZAZOLC GESLA JAZN",
+        recipeConfig: [
+            {
+                "op": "Remove Diacritics",
+                "args": []
+            },
+        ],
+    },
]);
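
For trying the change outside the test harness, the following is a rough standalone sketch. The `removeDiacritics` function name is hypothetical; its body mirrors the updated `run` method, and the input string is taken from the test case above.

```javascript
// Hypothetical standalone wrapper around the logic of the updated run() method.
function removeDiacritics(input) {
    return input.normalize("NFD")
        // "ł"/"Ł" have no canonical decomposition, so map them explicitly.
        .replace(/\u0142/g, "l").replace(/\u0141/g, "L")
        // Strip the combining marks produced by NFD (U+0300 to U+036F).
        .replace(/[\u0300-\u036f]/g, "");
}

const pangram = "zażółć gęślą jaźń ZAŻÓŁĆ GĘŚLĄ JAŹŃ";
console.log(removeDiacritics(pangram)); // zazolc gesla jazn ZAZOLC GESLA JAZN
```

Because U+0142 and U+0141 fall outside the combining range, the order of the three `replace` calls does not affect the result.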