I am confused regarding the way -s, -t, and -c options work in the tr command. When I do
echo I am a good boy | tr good bad
I get the output:
I am a bddd bdy
This is quite understandable, since o is repeated in good. The last possible change in place of o is d, and hence the output.
Now when I do
echo I am a good boy | tr -s good bad
the output is
I am a bd bdy
The -s option is supposed to squeeze every repeated occurrence of each character in set 1 into a single occurrence and then change each character in set 1 into the character at the same position in set 2.
So it should have been
I am a bad bay.
Why the change?
Moreover, when I do
echo I am a good boy | tr -c good bad
I get
dddddddgoodddodd
How does the -c option work for tr, referring to this example?
And finally: how to change myself from a good boy to a bad boy.... :) :P That is,
echo I am a good boy | tr <something> gives me the output as: I am a bad boy.
4 Answers
-s Switch: Squeeze (remove repeated characters)
echo i am a good boy | tr -s good bad
output:
i am a bd bdy
There are two things happening behind the scenes that make this happen. First, if the second argument to tr is shorter than the first, the last character of the second argument is repeated until both are the same length. So the equivalent command is:
echo i am a good boy | tr -s good badd
The other thing that is happening is that when a character is repeated in the first argument, its last mapping overwrites any earlier one (I'm referring to the two o's in good). This makes the command equivalent to:
echo i am a good boy | tr -s god bdd
(the second replacement, o to d, overwrites the earlier o to a replacement, making it redundant)
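To check the equivalences above on your own machine, here is a minimal sketch. It assumes a tr that pads a shorter second set with its last character, as GNU tr does (POSIX leaves that case unspecified, so other implementations may differ). The variable names are only for illustration.

```shell
#!/bin/sh
# All three commands should print the same line if tr pads the second
# set with its last character and the last mapping for a repeated
# character in the first set wins.
input='i am a good boy'

out_original=$(printf '%s\n' "$input" | tr -s good bad)
out_padded=$(printf '%s\n' "$input" | tr -s good badd)
out_reduced=$(printf '%s\n' "$input" | tr -s god bdd)

printf '%s\n' "$out_original" "$out_padded" "$out_reduced"
```

If the three printed lines differ, your tr handles a short second set differently from what this answer describes.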
Without the -s switch the output would be
i am a bddd bdy
With the -s switch, tr 'squeezes' any repeated characters that are listed in the last argument, leaving the final output:
i am a bd bdy
-c Switch: Complement
The -c switch is used to match the complement of the first argument (i.e. all characters not listed in arg 1). As a result, arg 1 will contain many characters (256 - 3 = 253 for single-byte characters, since g, o and d are excluded). Now, the same thing happens to arg 2 as in the previous case: arg 2's final character gets repeated to match the length of arg 1. So the original statement:
echo i am a good boy | tr -c good bad
is equivalent to:
echo i am a good boy | tr abcefhijklmnp... baddddddddddd...
(note the missing g, o and d in the first set; also note that d fills out the rest of the second set, so it replaces every other character, including the space character)
This is why i am a good boy turns into dddddddgoodddodd
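A small sketch to reproduce this, again assuming GNU-style padding of the second set. The variable names are made up for the example. Note that echo appends a newline, which is also outside the set and so becomes the final d.

```shell
#!/bin/sh
input='i am a good boy'

# Complement of "good": every character except g, o and d is translated,
# including spaces and the trailing newline added by echo.
out_complement=$(echo "$input" | tr -c good bad)

# Every character that actually occurs in this input lands on the padded
# "d", so a one-character second set gives the same result here.
out_single=$(echo "$input" | tr -c god d)

# Adding \n to the first set keeps the trailing newline out of the
# complement, so the line ending survives.
out_keep_newline=$(echo "$input" | tr -c 'god\n' d)

printf '%s\n' "$out_complement" "$out_single" "$out_keep_newline"
```

The last variant is usually what you want in a pipeline, since the output still ends with a newline.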
Your understanding of -s is incorrect. It replaces each run of a repeated character in the input with a single occurrence; when translating, this happens after the translation and applies to the characters in the last set. It does not modify the set, e.g.
echo i am a good boy | tr -s god bad
gives
i am a bad bay
The -c option replaces set 1 with its complement (i.e. the set of all characters not contained in set 1). You can use this to remove all but the specified characters, for example:
echo i am a good boy | tr -cd gobdy
outputs
goodboy
The other answers covered tr's -s, -t, and -c options
but for completeness:
You are having trouble because you picked up the wrong tool.
tr is for character transformations
sed is for stream editing.
Since both good and bad are sequences of characters in the stream, sed is a better match.
echo I am a good boy | <something> gives me the output as: I am a bad boy
$ echo I am a good boy | sed s/good/bad/g
I am a bad boy
The s/..../..../ is Substitute. Whatever matches the regular expression in the first part will be replaced with the text in the second part. The /g flag at the end is for Global replacement; this way all occurrences on a line will be replaced, not just the first.
$ echo I am a good boy and a good boy is me. | sed s/good/bad/
I am a bad boy and a good boy is me.
$ echo I am a good boy and a good boy is me. | sed s/good/bad/g
I am a bad boy and a bad boy is me.
yes. exactly!
tr -s replaces instances of repeated characters with a single character.
(via man page.)
so, it goes like this:
it converts good to bddd. repeated instances are 3 'd's.
so it replaces these three instances with a single instance.
that is it makes it bd. :)
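The two steps described above can be run separately to see each one on its own. This is only a sketch, assuming a tr that pads the second set like GNU tr does; the variable names are hypothetical.

```shell
#!/bin/sh
input='i am a good boy'

# Step 1: translation only (g -> b, o -> d, d -> d).
translated=$(echo "$input" | tr good bad)

# Step 2: squeeze only. With a single set, -s collapses runs of the
# characters listed in it, here b, a and d.
squeezed=$(echo "$translated" | tr -s bad)

# Both steps in one call, as in the question.
combined=$(echo "$input" | tr -s good bad)

printf '%s\n' "$translated" "$squeezed" "$combined"
```

The second and third lines printed should match, which shows that the squeeze is applied to the translated text.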