355 lines
8.0 KiB
Groff
355 lines
8.0 KiB
Groff
.\" $NetBSD: vis.3,v 1.23 2009/02/10 23:06:31 christos Exp $
|
|
.\"
|
|
.\" Copyright (c) 1989, 1991, 1993
|
|
.\" The Regents of the University of California. All rights reserved.
|
|
.\"
|
|
.\" Redistribution and use in source and binary forms, with or without
|
|
.\" modification, are permitted provided that the following conditions
|
|
.\" are met:
|
|
.\" 1. Redistributions of source code must retain the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer.
|
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer in the
|
|
.\" documentation and/or other materials provided with the distribution.
|
|
.\" 3. Neither the name of the University nor the names of its contributors
|
|
.\" may be used to endorse or promote products derived from this software
|
|
.\" without specific prior written permission.
|
|
.\"
|
|
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
|
|
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
|
|
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
.\" SUCH DAMAGE.
|
|
.\"
|
|
.\" @(#)vis.3 8.1 (Berkeley) 6/9/93
|
|
.\"
|
|
.Dd February 10, 2009
|
|
.Dt VIS 3
|
|
.Os
|
|
.Sh NAME
|
|
.Nm vis ,
|
|
.Nm strvis ,
|
|
.Nm strvisx ,
|
|
.Nm svis ,
|
|
.Nm strsvis ,
|
|
.Nm strsvisx
|
|
.Nd visually encode characters
|
|
.Sh LIBRARY
|
|
.Lb libc
|
|
.Sh SYNOPSIS
|
|
.In vis.h
|
|
.Ft char *
|
|
.Fn vis "char *dst" "int c" "int flag" "int nextc"
|
|
.Ft int
|
|
.Fn strvis "char *dst" "const char *src" "int flag"
|
|
.Ft int
|
|
.Fn strvisx "char *dst" "const char *src" "size_t len" "int flag"
|
|
.Ft char *
|
|
.Fn svis "char *dst" "int c" "int flag" "int nextc" "const char *extra"
|
|
.Ft int
|
|
.Fn strsvis "char *dst" "const char *src" "int flag" "const char *extra"
|
|
.Ft int
|
|
.Fn strsvisx "char *dst" "const char *src" "size_t len" "int flag" "const char *extra"
|
|
.Sh DESCRIPTION
|
|
The
|
|
.Fn vis
|
|
function
|
|
copies into
|
|
.Fa dst
|
|
a string which represents the character
|
|
.Fa c .
|
|
If
|
|
.Fa c
|
|
needs no encoding, it is copied in unaltered.
|
|
The string is null terminated, and a pointer to the end of the string is
|
|
returned.
|
|
The maximum length of any encoding is four
|
|
characters (not including the trailing
|
|
.Dv NUL ) ;
|
|
thus, when
|
|
encoding a set of characters into a buffer, the size of the buffer should
|
|
be four times the number of characters encoded, plus one for the trailing
|
|
.Dv NUL .
|
|
The flag parameter is used for altering the default range of
|
|
characters considered for encoding and for altering the visual
|
|
representation.
|
|
The additional character,
|
|
.Fa nextc ,
|
|
is only used when selecting the
|
|
.Dv VIS_CSTYLE
|
|
encoding format (explained below).
|
|
.Pp
|
|
The
|
|
.Fn strvis
|
|
and
|
|
.Fn strvisx
|
|
functions copy into
|
|
.Fa dst
|
|
a visual representation of
|
|
the string
|
|
.Fa src .
|
|
The
|
|
.Fn strvis
|
|
function encodes characters from
|
|
.Fa src
|
|
up to the
|
|
first
|
|
.Dv NUL .
|
|
The
|
|
.Fn strvisx
|
|
function encodes exactly
|
|
.Fa len
|
|
characters from
|
|
.Fa src
|
|
(this
|
|
is useful for encoding a block of data that may contain
|
|
.Dv NUL Ns 's ) .
|
|
Both forms
|
|
.Dv NUL
|
|
terminate
|
|
.Fa dst .
|
|
The size of
|
|
.Fa dst
|
|
must be four times the number
|
|
of characters encoded from
|
|
.Fa src
|
|
(plus one for the
|
|
.Dv NUL ) .
|
|
Both
|
|
forms return the number of characters in dst (not including
|
|
the trailing
|
|
.Dv NUL ) .
|
|
.Pp
|
|
The functions
|
|
.Fn svis ,
|
|
.Fn strsvis ,
|
|
and
|
|
.Fn strsvisx
|
|
correspond to
|
|
.Fn vis ,
|
|
.Fn strvis ,
|
|
and
|
|
.Fn strvisx
|
|
but have an additional argument
|
|
.Fa extra ,
|
|
pointing to a
|
|
.Dv NUL
|
|
terminated list of characters.
|
|
These characters will be copied encoded or backslash-escaped into
|
|
.Fa dst .
|
|
These functions are useful e.g. to remove the special meaning
|
|
of certain characters to shells.
|
|
.Pp
|
|
The encoding is a unique, invertible representation composed entirely of
|
|
graphic characters; it can be decoded back into the original form using
|
|
the
|
|
.Xr unvis 3
|
|
or
|
|
.Xr strunvis 3
|
|
functions.
|
|
.Pp
|
|
There are two parameters that can be controlled: the range of
|
|
characters that are encoded (applies only to
|
|
.Fn vis ,
|
|
.Fn strvis ,
|
|
and
|
|
.Fn strvisx ) ,
|
|
and the type of representation used.
|
|
By default, all non-graphic characters,
|
|
except space, tab, and newline are encoded.
|
|
(See
|
|
.Xr isgraph 3 . )
|
|
The following flags
|
|
alter this:
|
|
.Bl -tag -width VIS_WHITEX
|
|
.It Dv VIS_SP
|
|
Also encode space.
|
|
.It Dv VIS_TAB
|
|
Also encode tab.
|
|
.It Dv VIS_NL
|
|
Also encode newline.
|
|
.It Dv VIS_WHITE
|
|
Synonym for
|
|
.Dv VIS_SP
|
|
\&|
|
|
.Dv VIS_TAB
|
|
\&|
|
|
.Dv VIS_NL .
|
|
.It Dv VIS_SAFE
|
|
Only encode "unsafe" characters.
|
|
Unsafe means control characters which may cause common terminals to perform
|
|
unexpected functions.
|
|
Currently this form allows space, tab, newline, backspace, bell, and
|
|
return - in addition to all graphic characters - unencoded.
|
|
.El
|
|
.Pp
|
|
(The above flags have no effect for
|
|
.Fn svis ,
|
|
.Fn strsvis ,
|
|
and
|
|
.Fn strsvisx .
|
|
When using these functions, place all graphic characters to be
|
|
encoded in an array pointed to by
|
|
.Fa extra .
|
|
In general, the backslash character should be included in this array, see the
|
|
warning on the use of the
|
|
.Dv VIS_NOSLASH
|
|
flag below).
|
|
.Pp
|
|
There are four forms of encoding.
|
|
All forms use the backslash character
|
|
.Ql \e
|
|
to introduce a special
|
|
sequence; two backslashes are used to represent a real backslash,
|
|
except
|
|
.Dv VIS_HTTPSTYLE
|
|
that uses
|
|
.Ql % ,
|
|
or
|
|
.Dv VIS_MIMESTYLE
|
|
that uses
|
|
.Ql = .
|
|
These are the visual formats:
|
|
.Bl -tag -width VIS_CSTYLE
|
|
.It (default)
|
|
Use an
|
|
.Ql M
|
|
to represent meta characters (characters with the 8th
|
|
bit set), and use caret
|
|
.Ql ^
|
|
to represent control characters see
|
|
.Pf ( Xr iscntrl 3 ) .
|
|
The following formats are used:
|
|
.Bl -tag -width xxxxx
|
|
.It Dv \e^C
|
|
Represents the control character
|
|
.Ql C .
|
|
Spans characters
|
|
.Ql \e000
|
|
through
|
|
.Ql \e037 ,
|
|
and
|
|
.Ql \e177
|
|
(as
|
|
.Ql \e^? ) .
|
|
.It Dv \eM-C
|
|
Represents character
|
|
.Ql C
|
|
with the 8th bit set.
|
|
Spans characters
|
|
.Ql \e241
|
|
through
|
|
.Ql \e376 .
|
|
.It Dv \eM^C
|
|
Represents control character
|
|
.Ql C
|
|
with the 8th bit set.
|
|
Spans characters
|
|
.Ql \e200
|
|
through
|
|
.Ql \e237 ,
|
|
and
|
|
.Ql \e377
|
|
(as
|
|
.Ql \eM^? ) .
|
|
.It Dv \e040
|
|
Represents
|
|
.Tn ASCII
|
|
space.
|
|
.It Dv \e240
|
|
Represents Meta-space.
|
|
.El
|
|
.Pp
|
|
.It Dv VIS_CSTYLE
|
|
Use C-style backslash sequences to represent standard non-printable
|
|
characters.
|
|
The following sequences are used to represent the indicated characters:
|
|
.Bd -unfilled -offset indent
|
|
.Li \ea Tn - BEL No (007)
|
|
.Li \eb Tn - BS No (010)
|
|
.Li \ef Tn - NP No (014)
|
|
.Li \en Tn - NL No (012)
|
|
.Li \er Tn - CR No (015)
|
|
.Li \es Tn - SP No (040)
|
|
.Li \et Tn - HT No (011)
|
|
.Li \ev Tn - VT No (013)
|
|
.Li \e0 Tn - NUL No (000)
|
|
.Ed
|
|
.Pp
|
|
When using this format, the nextc parameter is looked at to determine
|
|
if a
|
|
.Dv NUL
|
|
character can be encoded as
|
|
.Ql \e0
|
|
instead of
|
|
.Ql \e000 .
|
|
If
|
|
.Fa nextc
|
|
is an octal digit, the latter representation is used to
|
|
avoid ambiguity.
|
|
.It Dv VIS_OCTAL
|
|
Use a three digit octal sequence.
|
|
The form is
|
|
.Ql \eddd
|
|
where
|
|
.Em d
|
|
represents an octal digit.
|
|
.It Dv VIS_HTTPSTYLE
|
|
Use URI encoding as described in RFC 1738.
|
|
The form is
|
|
.Ql %xx
|
|
where
|
|
.Em x
|
|
represents a lower case hexadecimal digit.
|
|
.It Dv VIS_MIMESTYLE
|
|
Use MIME Quoted-Printable encoding as described in RFC 2045, only don't
|
|
break lines and don't handle CRLF.
|
|
The form is:
|
|
.Ql %XX
|
|
where
|
|
.Em X
|
|
represents an upper case hexadecimal digit.
|
|
.El
|
|
.Pp
|
|
There is one additional flag,
|
|
.Dv VIS_NOSLASH ,
|
|
which inhibits the
|
|
doubling of backslashes and the backslash before the default
|
|
format (that is, control characters are represented by
|
|
.Ql ^C
|
|
and
|
|
meta characters as
|
|
.Ql M-C ) .
|
|
With this flag set, the encoding is
|
|
ambiguous and non-invertible.
|
|
.Sh SEE ALSO
|
|
.Xr unvis 1 ,
|
|
.Xr vis 1 ,
|
|
.Xr unvis 3
|
|
.Rs
|
|
.%A T. Berners-Lee
|
|
.%T Uniform Resource Locators (URL)
|
|
.%O RFC1738
|
|
.Re
|
|
.Sh HISTORY
|
|
The
|
|
.Fa vis ,
|
|
.Fa strvis ,
|
|
and
|
|
.Fa strvisx
|
|
functions first appeared in
|
|
.Bx 4.4 .
|
|
The
|
|
.Fa svis ,
|
|
.Fa strsvis ,
|
|
and
|
|
.Fa strsvisx
|
|
functions appeared in
|
|
.Nx 1.5 .
|