Document that tr(1) was written for US-ASCII and may not work as
expected on other character sets which do not share ASCII's properties
(e.g. a symmetric set of capital and lower case characters), per PR 18738.

Change all double quotes to nroff macros.
Change "System V" references to the .At macro.
parent afedcd8968
commit c223370599
usr.bin/tr/tr.1 (114 lines changed)
@@ -1,4 +1,4 @@
-.\" $NetBSD: tr.1,v 1.13 2003/08/07 11:16:47 agc Exp $
+.\" $NetBSD: tr.1,v 1.14 2004/03/24 06:35:53 fair Exp $
 .\"
 .\" Copyright (c) 1991, 1993
 .\" The Regents of the University of California. All rights reserved.
@@ -32,7 +32,7 @@
 .\"
 .\" @(#)tr.1 8.1 (Berkeley) 6/6/93
 .\"
-.Dd June 6, 1993
+.Dd March 23, 2004
 .Dt TR 1
 .Os
 .Sh NAME
@@ -65,7 +65,12 @@ The following options are available:
 .It Fl c
 Complements the set of characters in
 .Ar string1 ,
-that is ``-c ab'' includes every character except for ``a'' and ``b''.
+that is
+.Qq \&-c \&ab
+includes every character except for
+.Qq \&a
+and
+.Qq \&b .
 .It Fl d
 The
 .Fl d
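
Reviewer note: the complement behaviour that this hunk re-words can be checked from the shell. The following is a minimal sketch, assuming a POSIX-style tr run in the C locale; the input string and the variable names `kept` and `masked` are made up for illustration and do not come from the manual page.

```shell
# -c complements string1, so with -d everything EXCEPT "a" and "b" is deleted
# (the trailing newline is part of the complement and is deleted too).
kept=$(printf 'a1b2c3\n' | LC_ALL=C tr -cd 'ab')
echo "$kept"

# Without -d, every character in the complement is translated instead.
# "[_*]" pads string2 with "_" so the result does not depend on how a
# short string2 is handled; the newline is in the complement as well.
masked=$(printf 'a1b2c3\n' | LC_ALL=C tr -c 'ab' '[_*]')
echo "$masked"
```

The first command keeps only the two listed characters; the second keeps them in place and replaces everything else, which is the usual way to see exactly which characters the complemented set contains.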
@@ -184,10 +189,16 @@ Class names are:
 \." and vice-versa) is specified in the same relative position in
 \." .Ar string1 .
 \." .Pp
-With the exception of the ``upper'' and ``lower'' classes, characters
-in the classes are in unspecified order.
-In the ``upper'' and ``lower'' classes, characters are entered in
-ascending order.
+With the exception of the
+.Qq upper
+and
+.Qq lower
+classes, characters in the classes are in unspecified order.
+In the
+.Qq upper
+and
+.Qq lower
+classes, characters are entered in ascending order.
 .Pp
 For specific information as to which ASCII characters are included
 in these classes, see
@@ -197,11 +208,14 @@ and related manual pages.
 Represents all characters or collating (sorting) elements belonging to
 the same equivalence class as
 .Ar equiv .
-If
-there is a secondary ordering within the equivalence class, the characters
-are ordered in ascending sequence.
+If there is a secondary ordering within the equivalence class, the
+characters are ordered in ascending sequence.
 Otherwise, they are ordered after their encoded values.
-An example of an equivalence class might be ``c'' and ``ch'' in Spanish;
+An example of an equivalence class might be
+.Qq \&c
+and
+.Qq \&ch
+in Spanish;
 English has no equivalence classes.
 .It [#*n]
 Represents
@@ -228,38 +242,67 @@ exits 0 on success, and \*[Gt]0 if an error occurs.
 .Sh EXAMPLES
 The following examples are shown as given to the shell:
 .sp
-Create a list of the words in file1, one per line, where a word is taken to
-be a maximal string of letters.
+Create a list of the words in
+.Ar file1 ,
+one per line, where a word is taken to be a maximal string of letters:
 .sp
 .D1 Li "tr -cs \*q[:alpha:]\*q \*q\en\*q \*[Lt] file1"
 .sp
-Translate the contents of file1 to upper-case.
+Translate the contents of
+.Ar file1
+to upper-case:
 .sp
 .D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q \*[Lt] file1"
 .sp
-Strip out non-printable characters from file1.
+Strip out non-printable characters from
+.Ar file1 :
 .sp
 .D1 Li "tr -cd \*q[:print:]\*q \*[Lt] file1"
 .Sh COMPATIBILITY
-System V has historically implemented character ranges using the syntax
-``[c-c]'' instead of the ``c-c'' used by historic
+.At V
+has historically implemented character ranges using the syntax
+.Qq [c-c]
+instead of the
+.Qq c-c
+used by historic
 .Bx
-implementations and
-standardized by POSIX.
+implementations and standardized by POSIX.
 .At V
 shell scripts should work under this implementation as long as
 the range is intended to map in another range, i.e. the command
-``tr [a-z] [A-Z]'' will work as it will map the ``['' character in
+.Pp
+.Ic "tr [a-z] [A-Z]"
+.Pp
+will work as it will map the
+.Qq \&[
+character in
 .Ar string1
-to the ``['' character in
+to the
+.Qq \&[
+character in
 .Ar string2 .
 However, if the shell script is deleting or squeezing characters as in
-the command ``tr -d [a-z]'', the characters ``['' and ``]'' will be
-included in the deletion or compression list which would not have happened
-under an historic System V implementation.
-Additionally, any scripts that depended on the sequence ``a-z'' to
-represent the three characters ``a'', ``-'' and ``z'' will have to be
-rewritten as ``a\e-z''.
+the command
+.Pp
+.Ic "tr -d [a-z]"
+.Pp
+the characters
+.Qq \&[
+and
+.Qq \&]
+will be included in the deletion or compression list which would
+not have happened under an historic
+.At V
+implementation.
+Additionally, any scripts that depended on the sequence
+.Qq a-z
+to represent the three characters
+.Qq \&a ,
+.Qq \&- ,
+and
+.Qq \&z
+will have to be rewritten as
+.Qq a\e-z .
 .Pp
 The
 .Nm
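
Reviewer note: the COMPATIBILITY cases touched by this hunk are easy to reproduce. This is a small sketch, assuming a tr that follows the BSD/POSIX range syntax described above and is run in the C locale; the sample inputs and the variable names `upper`, `deleted` and `three` are illustrative only. The operands are quoted here, unlike in the manual text, so that the shell does not try to expand `[a-z]` as a filename pattern.

```shell
# Translation: "[" and "]" sit at the same positions in both strings,
# so they map to themselves and the System V spelling is harmless.
upper=$(printf '[abc]\n' | LC_ALL=C tr '[a-z]' '[A-Z]')
echo "$upper"

# Deletion: the brackets are ordinary members of string1 here,
# so they are removed together with the lower-case letters.
deleted=$(printf '[abc] XYZ\n' | LC_ALL=C tr -d '[a-z]')
echo "$deleted"

# A literal hyphen between "a" and "z" has to be escaped; otherwise
# "a-z" is read as a range. Only "a", "-" and "z" are deleted here.
three=$(printf 'a-m-z\n' | LC_ALL=C tr -d 'a\-z')
echo "$three"
```

The second command is the case the text warns about: a script written for the bracketed System V syntax silently loses its brackets when it deletes or squeezes instead of translating.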
@@ -290,4 +333,19 @@ has less characters than
 .Ar string1
 is permitted by POSIX but is not required.
 Shell scripts attempting to be portable to other POSIX systems should use
-the ``[#*]'' convention instead of relying on this behavior.
+the
+.Qq [#*]
+convention instead of relying on this behavior.
+.Sh BUGS
+.Nm
+was originally designed to work with
+.Tn US-ASCII .
+Its use with character sets that do not share all the properties of
+.Tn US-ASCII ,
+e.g.
+a symmetric set of upper and lower case characters
+that can be algorithmically converted one to the other,
+may yield unpredictable results.
+.Pp
+.Nm
+should be internationalized.
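
Reviewer note: the two points in this last hunk (the `[#*]` padding convention and the US-ASCII caveat in the new BUGS section) can be illustrated together. The sketch below assumes a POSIX-style tr and pins the C locale so that only ASCII behaviour is exercised; the inputs and the names `padded` and `ascii_upper` are hypothetical. What happens with non-ASCII input depends on the implementation and locale, which is the situation the BUGS text describes, so no result is claimed for it here.

```shell
# Portable padding: "[x*]" repeats "x" to fill out string2, so the
# result does not rely on how tr treats a string2 shorter than string1.
padded=$(printf 'abc\n' | LC_ALL=C tr 'abc' '[x*]')
echo "$padded"

# Case conversion restricted to the C locale, where the "lower" and
# "upper" classes are the 26 ASCII letters in matching ascending order.
ascii_upper=$(printf 'tr is ascii\n' | LC_ALL=C tr '[:lower:]' '[:upper:]')
echo "$ascii_upper"
```

Setting `LC_ALL=C` is a choice made for this sketch, not something the manual page prescribes; it keeps the example inside the character set that tr was designed for.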