Awk/Sed/Tr Replace non alpha-numeric with hex value
Posted: Sun Oct 21, 2012 10:10 am
by OdinVex
I've got a byte string and need to search for all characters not in the range A-Za-z0-9 and replace each with its hexadecimal equivalent, prefixed with &#x and suffixed with ; and preferably with the hexadecimal in upper-case. I can use Awk, Sed, and Tr. Any Awk/Sed/Tr vets out there? :/ Greatly appreciate it.
Re: Awk/Sed/Tr Replace non alpha-numeric with hex value
Posted: Sun Oct 21, 2012 1:06 pm
by JamesM
Can you not use perl -e ?
Re: Awk/Sed/Tr Replace non alpha-numeric with hex value
Posted: Mon Oct 22, 2012 4:26 am
by Combuster
od is the appropriate tool for converting byte data into hex strings, and a simple pattern substitution on top of that can produce the corresponding HTML encoding. This does mean that even your alphanumerics will be encoded by default, which might not be a problem for starters. It's part of coreutils on Linux and as such is just as readily available as tr.
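A rough sketch of that od-plus-substitution idea, not a polished solution. The function name encode_all is made up for illustration, and it encodes every byte, alphanumerics included:

```shell
# Hypothetical helper: encode every byte of stdin as &#xHH; (upper-case hex).
encode_all() {
    od -An -v -tx1 |            # dump each byte as two hex digits, no offsets
        tr -d '[:space:]' |     # join everything into one run of hex digits
        tr 'abcdef' 'ABCDEF' |  # upper-case the hex digits
        sed 's/../\&#x&;/g'     # wrap each digit pair as &#xHH;
    echo                        # finish with a newline
}

printf 'a b' | encode_all
```

The -v flag matters: without it od collapses repeated lines of identical bytes into a single *.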
The problem is that substitutions on their own can't compute anything, and awk has no built-in function to convert a character to its ordinal (sprintf with %c only goes the other way, from ordinal to character), which leaves you with the mindless expand-all-cases scenario of a lookup table.
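For what it's worth, the lookup table need not be typed out by hand: awk's sprintf("%c", n) turns an ordinal into a character, so the reverse mapping can be generated in a loop. A minimal sketch, assuming a byte-oriented locale (hence LC_ALL=C) and line-by-line input with the newlines left alone; the name encode_nonalnum is hypothetical:

```shell
# Hypothetical helper: replace every byte outside A-Za-z0-9 with &#xHH;.
# Works per line; the newline record separators are passed through as-is.
encode_nonalnum() {
    LC_ALL=C awk '
    BEGIN {
        # Build the character -> ordinal table once (NUL is skipped).
        for (i = 1; i < 256; i++)
            ord[sprintf("%c", i)] = i
    }
    {
        out = ""
        n = length($0)
        for (j = 1; j <= n; j++) {
            c = substr($0, j, 1)
            if (c ~ /^[A-Za-z0-9]$/)
                out = out c                          # keep alphanumerics
            else
                out = out sprintf("&#x%02X;", ord[c]) # encode the rest
        }
        print out
    }'
}

printf 'x<y>&z\n' | encode_nonalnum
```

This stays within awk alone, at the cost of a character-at-a-time loop that will be slow on large inputs.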
Of course, the task is simple enough that just building a C app is quicker, and it will perform better than having awk invoke a numerical conversion app for each character.