Improve Unicode / UTF-8 documentation

Albrecht Schlosser 2020-01-26 15:10:53 +01:00
parent f3724f7488
commit 30a868dc0f

@@ -5,9 +5,9 @@
This chapter explains how FLTK handles international
text via Unicode and UTF-8.
Unicode support was only recently added to FLTK and is
still incomplete. This chapter is Work in Progress, reflecting
the current state of Unicode support.
Unicode support was added to FLTK starting with version 1.3.0 and is
still incomplete but mostly functional. This chapter is Work in Progress,
reflecting the current state of Unicode support.
\section unicode_about About Unicode, ISO 10646 and UTF-8
@@ -16,11 +16,11 @@ deliberately brief and provides just enough information for
the rest of this chapter.
For further information, please see:
- http://www.unicode.org
- http://www.iso.org
- http://en.wikipedia.org/wiki/Unicode
- http://www.cl.cam.ac.uk/~mgk25/unicode.html
- http://www.apps.ietf.org/rfc/rfc3629.html
- https://unicode.org
- https://iso.org
- https://en.wikipedia.org/wiki/Unicode
- https://www.cl.cam.ac.uk/~mgk25/unicode.html
- https://tools.ietf.org/html/rfc3629
\par The Unicode Standard
@@ -33,7 +33,7 @@ and is supported by most of the major computing companies in the world.
Before Unicode, many different systems, on different platforms,
had been developed for encoding characters for different languages,
but no single encoding could satisfy all languages.
Unicode provides access to over 100,000 characters
Unicode provides access to over 130,000 characters
used in all the major languages written today,
and is independent of platform and language.
@@ -78,7 +78,10 @@ U+10FFFF. The complete character set is sub-divided into \e planes.
used characters from previous encoding standards. Other planes
contain characters for specialist applications.
\todo Do we need this info about planes?
\todo FLTK 1.3 and later supports the full Unicode range (21 bits), but
there are a few exceptions, for instance binary shortcut values in menus
(\ref Fl_Shortcut) can only be used with characters from the BMP (16 bits).
This may be extended in a future FLTK version.
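The distinction between the full 21-bit range and the 16-bit BMP can be illustrated with a minimal sketch. The helper names `unicode_plane` and `in_bmp` are hypothetical and are not part of the FLTK API, and the sketch does not show how FLTK actually stores shortcut values.

```cpp
// Hypothetical helpers for illustration only, not FLTK functions.

// Return the plane number (0..16) of a code point in U+0000..U+10FFFF.
// Each plane holds 0x10000 code points, so the plane is the value
// of the bits above the low 16.
unsigned unicode_plane(unsigned ucs) {
  return ucs >> 16;
}

// A code point lies in the Basic Multilingual Plane (plane 0)
// exactly when it fits in 16 bits.
bool in_bmp(unsigned ucs) {
  return ucs <= 0xFFFF;
}
```

Under this sketch, U+20AC (the Euro sign) is in the BMP, while U+1F600 lies in plane 1 and would therefore not fit into a 16-bit value.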
The UCS also defines various methods of encoding characters as
a sequence of bytes.
@@ -139,6 +142,11 @@ some level of synchronisation and error detection.
</tr>
</table>
\note This table contains theoretical values outside the valid Unicode
range (<tt>U+000000 - U+10FFFF</tt>). Such values can only be returned by
conversion functions for illegal input values (see \ref unicode_illegals).
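The relationship between a code point and its UTF-8 byte sequence, limited to the valid range up to U+10FFFF, can be sketched as follows. This is a standalone illustration based on RFC 3629, not FLTK's implementation. The function name `encode_utf8` is hypothetical, and the sketch assumes that out-of-range input should yield an empty string, which is not necessarily how the FLTK conversion functions report illegal values.

```cpp
#include <string>

// Hypothetical helper, not part of FLTK: encode one Unicode code point
// as a UTF-8 byte sequence of 1 to 4 bytes.
// Assumption: values above U+10FFFF produce an empty string.
// Note: this sketch does not reject surrogates (U+D800..U+DFFF).
std::string encode_utf8(unsigned ucs) {
  std::string out;
  if (ucs < 0x80) {                 // 7 bits: one byte, identical to ASCII
    out += static_cast<char>(ucs);
  } else if (ucs < 0x800) {         // up to 11 bits: two bytes
    out += static_cast<char>(0xC0 | (ucs >> 6));
    out += static_cast<char>(0x80 | (ucs & 0x3F));
  } else if (ucs < 0x10000) {       // up to 16 bits: three bytes
    out += static_cast<char>(0xE0 | (ucs >> 12));
    out += static_cast<char>(0x80 | ((ucs >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (ucs & 0x3F));
  } else if (ucs <= 0x10FFFF) {     // up to 21 bits: four bytes
    out += static_cast<char>(0xF0 | (ucs >> 18));
    out += static_cast<char>(0x80 | ((ucs >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((ucs >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (ucs & 0x3F));
  }
  return out;                       // empty if ucs is out of range
}
```

In application code, the FLTK functions listed in \ref unicode_fltk_calls should be preferred over a hand-written encoder.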
\par
Moving from ASCII encoding to Unicode will allow all new FLTK
@@ -249,7 +257,7 @@ about error handling and return values.
\section unicode_fltk_calls FLTK Unicode and UTF-8 Functions
This section currently provides a brief overview of the functions.
This section provides a brief overview of the functions.
For more details, consult the main text for each function via its link.
int fl_utf8locale()