Awk/Sed/Tr Replace non alpha-numeric with hex value
I've got a byte string and need to search for all characters not in the range A-Za-z0-9 and replace each with its hexadecimal equivalent, prefixed with &#x and followed by ; (preferably with the hexadecimal in upper case). I can use Awk, Sed, and Tr. Any Awk/Sed/Tr vets out there? :/ Greatly appreciate it.
“...No rest, no peace...” ― Odin Vex
- Combuster
Re: Awk/Sed/Tr Replace non alpha-numeric with hex value
od is the appropriate tool for converting byte data into hex strings, and a simple pattern substitution on top of its output can produce the corresponding HTML encoding. This does mean that your alphanumerics will be encoded as well by default, which is not necessarily a problem to start with. od is part of coreutils on Linux and as such is just as readily available as tr.
The problem is that substitutions on their own can't call other functions, and awk has no built-in function that returns a character's ordinal value, which leaves you spelling out every case by hand.
Of course, the task is simple enough that just writing a small C program is quicker, and performs better, than having awk invoke a numeric conversion program for each character.
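A minimal sketch of the od-plus-substitution idea described above, assuming a POSIX-style od, tr and sed. The function name hex_encode_all is made up for illustration. As noted, it encodes every byte, alphanumerics included.

```shell
#!/bin/sh
# hex_encode_all (hypothetical name): read bytes on stdin and write each one
# as an HTML hexadecimal character reference, e.g. "&#x20;" for a space.
hex_encode_all() {
    # -An: no offset column, -v: do not collapse repeated lines,
    # -tx1: one two-digit hex value per input byte
    od -An -v -tx1 |
    # drop the spaces and newlines od uses as separators
    tr -d ' \n' |
    # od prints lowercase hex digits; upper-case them
    tr 'a-f' 'A-F' |
    # wrap every pair of hex digits; "\&" is a literal ampersand,
    # the bare "&" stands for the matched pair
    sed 's/../\&#x&;/g'
}

printf '%s' 'a b' | hex_encode_all   # prints &#x61;&#x20;&#x62;
```

Because od does the byte-to-hex conversion, the sed step never has to know anything about character codes.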
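One way to avoid both the hand-written case list and a per-character helper process is to let od produce the hex values and have awk decide which ones to keep. The sketch below is an illustration under assumptions, not a tested production script: the name hex_encode_nonalnum is hypothetical, and it relies on awk's sprintf("%c", n), which covers the ordinal-to-character direction and is enough to build the small table of alphanumerics in a loop.

```shell
#!/bin/sh
# hex_encode_nonalnum (hypothetical name): read bytes on stdin, pass
# A-Z, a-z and 0-9 through unchanged, and write every other byte as an
# upper-case HTML hexadecimal character reference.
hex_encode_nonalnum() {
    od -An -v -tx1 |
    awk '
    # Map the hex code of one ASCII range (as od prints it, lowercase)
    # back to the character itself.
    function add_range(lo, hi,    i) {
        for (i = lo; i <= hi; i++)
            keep[sprintf("%02x", i)] = sprintf("%c", i)
    }
    BEGIN {
        add_range(48, 57)    # 0-9
        add_range(65, 90)    # A-Z
        add_range(97, 122)   # a-z
    }
    {
        # Each field is the two-digit hex value of one input byte.
        for (f = 1; f <= NF; f++) {
            h = $f ""        # force string form so "09" stays "09"
            if (h in keep)
                printf "%s", keep[h]
            else
                printf "&#x%s;", toupper(h)
        }
    }
    END { printf "\n" }'
}

printf '%s' 'a b!Z9' | hex_encode_nonalnum   # prints a&#x20;b&#x21;Z9
```

Since awk only ever sees od's ASCII output, newlines, NUL bytes and bytes above 0x7F in the input are encoded like any other non-alphanumeric byte, and only two processes are started regardless of input length.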