I was watching a tutorial about using Unix utilities. The presenter was using them on a Mac; I have a Windows laptop, so I downloaded the GnuWin32 package.
Then came a part where I wanted to replace every non-letter character in a file with a newline ("\n").
The command line in the tutorial was:
tr -sc 'A-Za-z' '\n' < filename.txt |less
It worked for him, but when I tried it, it put a single-quote character (') after each character. So I tried double quotes instead:
tr -sc "A-Za-z" "\n" < filename.txt |less
It added a newline after each character.
I then tried removing the complement option and adding ^ in the regex instead:
tr "[^A-Za-z]" "\n" < filename.txt |less
The result was that every letter was replaced with a newline.
My question is: do the command-line options of the Unix utilities in GnuWin32 differ from those elsewhere? And does putting the regex between single quotes, like 'A-Z', differ from "A-Z"?
If so, what would be the best way to replace every non-letter character with a newline, other than the failed attempts above?
the source of the text i was trying on
I tested your examples with my tr (tr --version reports GNU coreutils 8.5), and:
1) using single or double quotes makes no difference
2) it looks like there is no way to negate characters by using ^
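For point 1, a quick check in a POSIX-style shell (sh, bash) suggests that both quoting styles hand tr exactly the same arguments. This is only a sketch, and show_args is a made-up helper name. It holds for a Unix shell only: cmd.exe on Windows does not treat single quotes as quoting characters, so under GnuWin32 the quotes themselves may reach tr as part of the set.

```shell
# show_args is a hypothetical helper: it prints each argument it
# receives between angle brackets, one per line.
show_args() {
  for arg in "$@"; do
    printf '<%s>\n' "$arg"
  done
}

# Single quotes: nothing inside is interpreted by the shell.
show_args 'A-Za-z' '\n'

# Double quotes: a backslash before n is kept as-is, so tr still
# receives the two characters \ and n and interprets them itself.
show_args "A-Za-z" "\n"
```

Both calls print the same two lines, which is why the choice of quotes does not matter here when the shell is sh or bash.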
When you write [^A-Za-z], all of these characters are treated literally:
echo "abc abd [hh] d^o 1976" | tr '[^A-Za-z]' '.'
or with double quotes
echo "abc abd [hh] d^o 1976" | tr "[^A-Za-z]" '.'
produces the following output
... ... .... ... 1976
This shows that all alphabetic characters, the caret, and the square brackets were treated literally and replaced.
This leads us to the conclusion that to split by non-alphabetic characters you have to use -c with a range 'A-Za-z', exactly as you did in the first example.
$ tr -sc '[A-Za-z]' "\n" < getCokeInfo_viaFinger_cmu.awk
bin
gawk
f
BEGIN
wisc
edu
finger
....
Note that I used the bracketed form ([A-Za-z]). Maybe your tr requires that too.
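As a small check of what the brackets actually do, here is a sketch assuming GNU tr (as shipped in coreutils); other tr implementations may behave differently. With GNU tr the brackets in '[A-Za-z]' are ordinary members of the set, so under -c they are kept rather than replaced:

```shell
# Plain range: every run of non-letters becomes a single newline,
# so this prints a, b and c on separate lines.
printf 'a[b]c 1\n' | tr -sc 'A-Za-z' '\n'

# Bracketed form: '[' and ']' belong to the set, so they survive
# the complement and this prints a[b]c on one line.
printf 'a[b]c 1\n' | tr -sc '[A-Za-z]' '\n'
```

If the input contains square brackets and they should count as separators, the plain 'A-Za-z' form is probably the safer choice.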
I hope this helps.
cat file.txt | sed -re 's/[^a-zA-Z]/\n/g'
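One difference from tr -s is that the sed command above does not squeeze: each non-letter becomes its own newline, so a run of punctuation or spaces leaves blank lines. A possible variant, assuming GNU sed (where \n in the replacement stands for a newline and -E enables extended regular expressions), replaces each whole run at once:

```shell
# Assumes GNU sed: \n in the replacement text is a GNU extension.
# The + collapses each run of non-letters into a single newline,
# so this prints ab and cd on separate lines.
printf 'ab, cd\n' | sed -E 's/[^a-zA-Z]+/\n/g'
```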