NetBSD/gnu/dist/gettext/README.gemtext

Revision: 1.1
Date: 1997/05/01 10:44:54
20
!!! README for gemtext.  !!!
If you think the above mentioned date is old check out
http://stud.uni-sb.de/~gufl0000/atari/gemtext.  You will find the latest
version there.  You will also find links to ftp sites from where you can
download gemtext via anonymous ftp.

See the file INSTALL for installing gemtext.
See the file COPYING for licensing.
See the file NEWS for information that is not covered by either this file
or the provided manual pages.  All information concerning features of
gemtext included in any text file that comes along with this package is
subject to change without notice!

==========================================================================

This is the gemtext package.  The gemtext package offers standardized i18n
(internationalization) features for GEM programs.  The package is intended
to help programmers to internationalize their programs.  If you aren't a
programmer you might want to read the file ABOUT-NLS to get an idea of
national language support.  But you won't need gemtext.

GEM is a registered trademark of Digital Research Company.  GEM is mainly
used on Atari computers or the various emulations available today on other
platforms.  You can also find the so-called PC-GEM for IBM compatible
machines.  If you neither use an Atari nor emulate the Atari's OS you
won't need gemtext too.  Still it might be a good idea to have a look at
the file ABOUT-NLS.

All programs and libraries contained in this package will work on their
own, only depending on some executables and libraries that can be found in
most Unix-like environments.  Yet, you will only benefit from gemtext if
you have the GNU gettext package or another internationalization package
that can be compared with gettext already installed. You will find the
latest version of GNU gettext at

  ftp://prep.ai.mit.edu/pub/gnu

It compiles without any problems with the GNU C-compiler and MiNTLib PL46.
(Well, there's a little problem:  If your msgfmt keeps on crashing with a
bus error try to replace the module obstack.o in your MiNTLib with the
obstack.c that comes along with GNU gettext.  But this is a bug in the
MiNTLib and not in GNU gettext.)

In the following I will refer to any computer that can make use of
xgemtext as 'Atari'.  I also expect you to be familiar with the C
programming language and with the well known data structures and function
calls that are necessary for programming GEM applications.

He, she, they... Quote from the gettext info file:
        "In this manual, we use *he* when speaking of the programmer or
        maintainer, *she* when speaking of the translator, and *they* when
        speaking of the installers or end users of the translated program.
        This is only a convenience for clarifying the documentation.  It
        is *absolutely* not meant to imply that some roles are more
        appropriate to males or females.  Besides, as you might guess, GNU
        `gettext' is meant to be useful for people using computers,
        whatever their sex, race, religion or nationality!"

Last but not least:  xgemtext is written in C++.  Hence you will need a
C++ compiler to compile the package.  GNU C++ is a good choice if you
don't already have one.  If you don't have the system resources to run the
compiler you will have to try to get an already compiled version.  Good
news:  The library (which is the more important part of the gemtext
package) is entirely written in plain C to allow easy porting to different
compiler systems.


CONTENTS
========
        1. Why gemtext?
        2. Alternatives for i18n
        3. How to use the GNU gettext package
        4. How to use the gemtext package
        5. The library rintl
        6. Charsets: Atari versus ISO-Latin 1
        7. Miscellanea


1. Why gemtext?
===============

The Atari's GUI (Graphical User Interface) offers possibilities for
native language support for a long time already.  All necessary graphical
data and text for the user interface is usually kept in a so-called
resource file.  For example the program foobar.app will normally look for
the resource file foobar.rsc in the same directory.  A programmer aware of
i18n will usually provide several of these resource file each of them
containing the entire data in different languages,  called e. g. en.rsc,
fr.rsc, it.rsc and so on.  The users have to rename the particular
resource file in their language to foobar.rsc and the program will still
behave the same but will display its messages in the chosen language.

This procedere is suboptimal for various reasons.

-  The most obvious one is that you keep loads of redundant data in the
   extra provided resouce files.  Usually the strings in the resource
   files occupy very little space in comparison to the graphical data.

-  GEM applications usually expect their resource files to reside at a
   fixed location and to have a fixed name.  Hence, in a multi-user
   environment you have to choose one particular language for each
   application or you have to keep different copies of both your resource
   and other data files.

-  Once you have decided to include your resources into your source code
   you have to say goodbye to the idea of i18n.

-  Probably the most important disadvantage of the current system is the
   effort you have to take in updating your applications.  Whenever you
   have changed the tree structure in your resource file you will have to
   edit every other resource file immediately if you do the translation
   yourself.  If you let somebody else do your translations she will have
   enormous difficulties.  She will have to click through the entire
   resource tree, taking care not to change the data structures
   unwillingly, and picking out the texts that have changed.  If you
   haven't found some trickier system it is probably easier to restart the
   translation from the beginning.

-  Whenever your application has to display a message you have to edit
   your resource file even for little ones like:

     form_alert (1, "[1][ Memory exhausted! ][ Damned ]");

   Every string that has to be translated has to appear in the resource
   file.  Considering that most resource file formats are restricted to 64
   kB of length and that most GEM libraries can handle only one single
   resource file this is often a severe restriction in the design of the
   GUI.


2. Alternatives for i18n
========================

As far as programs are concerned that have to be run via a command line
interpreter, there is already a portable, standardized system for message
internationalization, the GNU gettext package. It offers tools for
preparing all kinds of source files for i18n, tools that help both the
programmer and translator to keep track of updated versions (and the
maintainers, too) and it comes along with a free C library that gives
access to the translated messages at runtime of the program depending on
the setting of documented environment variables.

You should get a copy of the GNU gettext package now.  Yet, if you are too
curious to download it at once, just go ahead but you might want to reread
this file later as it refers to information that is available only with
the GNU gettext package.

The gemtext package should also be useful with NLS-packages (Native
Language Support) other than GNU gettext to the degree that GNU gettext
itself is compatible with other standardized packages.  Yet, it has been
only tested with GNU gettext as it is the only available one for Atari.


3. How to use the GNU gettext package
=====================================

Forget about the whole GUI thing for a short while to get roughly an idea
of programming with the GNU gettext package.  We'll come back to the GUI
shortly.

NOTE: This section isn't meant to be an exhaustive description of the GNU
gettext package.  It is intended to give you an overview.  If you already
know the GNU gettext package you might safely skip this section.

Localizing messages in a C program is very easily done.  At first you need
some (usually three) lines of extra code to tell the NLS library functions
where to find the translated messages.  Then you have to go through your
code and decide for every string whether you want to get it translated or
not. Strings that should be translated have to be embedded in a function
call to gettext ().  Thus instead of writing:

   printf ("Hello world!\n");

you would now write

   printf (gettext ("Hello world!\n"));

When you're fed up with typing g-e-t-t-e-x-t over and over again you will
follow the habit of most of your colleagues and

   #define _(s) gettext (s)

well knowing that now only three extra keystrokes will make the rest of
the world happy.

Very handy, very easy, isn't it?  Unfortunately it doesn't work if you use
strings for example in initialized structures or arrays.  Really? Just to
make sure that you don't abandon the whole thing already, I will tell you
a secret:  It still works!  Read the gettext info file and follow the
nodes Sources -> Special cases.

After having marked all translatable strings in your sources you will want
to extract them for translation by

   xgettext *.c

producing a file messages.po that contains both the original strings and
space for the translations.  After the file being translated you will
compile it into a binary by a call to the program msgfmt.  This binary
file will then usually be moved to

   /usr/(local/)share/locale/de/LC_MESSAGES/foobar.mo

provided that your program has been translated into German ('de' is the
ISO-639 2-letter country code for Germany).

Now you set one of the envariables LANGUAGE, LANG or LC_ALL to 'de' and
recompile your sources.  Your linker will then complain about the
reference to 'gettext' not being resolved and you will fix the problem by
linking the program with the 'intl'-library that comes along with gettext
(and gemtext, too).  Surprise, surprise!  Your program has learned to say
hello to the world in yet another language.

After long and satisfying experiences with your chef-d'oeuvre you might
feel that it is not enough to say hello to the world.  You want to say
hello to the whole universe, too.  So what now?  Redo the whole thing?
No!  You simply update your sources and extract a new messages file.
Before translating the file entirely again you should have msgmerge have a
glimpse at both the original and the updated file.  The program msgmerge
is very clever and will probably guess what you have done.  It will
produce a new .po file that will already contain the strings you have
translated the last time.  It will also not cast away those strings that
you do not care to translate any longer.  It will comment them out
instead, thus allowing you to change your mind again later.  This feature
is also very handy for corrected misspellings or minor modifications to
strings.  If msgmerge encounters such a case it will mark the maybe
incorrect translation it guesses to be best as a "fuzzy" translation.  If
you agree with msgmerge, simply remove the "fuzzy" comment.

Well, that's it in brief.  Again, if you want to understand the GNU
gettext package in its whole powerfulness you can't help reading the
documentation.


4. How to use the gemtext package
=================================

Probably you have already seen that it's no longer necessary to put
strings for alert boxes or other free strings in your resource files.
What has worked with printf will work with form_alert, too.

And what about resources integrated in your source files?  Why not?  Most
resource construction sets offer the possibility to produce a C source
file as output and running a simple awk script on this file will already
do the whole job for you.  (An important advantage of integrated resources
is that you're not restricted to 64 kB anymore).

But if you don't bother to write all the routines to fix adresses of
object trees and so on you will estimate the help of xgemtext from this
package.

Running

   xgemtext *.rsc

in your source directory will extract all strings from your resource files
placing them in a file messages.c looking more or less like this:

   /* This file was automatically produced by xgemtext <version> */

   #ifndef gettext_noop
   #define gettext_noop(s) (s)
   #endif

   const char* volatile nls_rsc_strings[] = {
           gettext_noop (" Foobar"),
           gettext_noop (" File"),
           gettext_noop (" Options"),
           gettext_noop (" About Foobar..."),
           gettext_noop (" Open"),
           ... /* All other strings in your resource.  */
           0L
   }

It looks like a real C source file and it actually is one.  Yet, usually
you will never compile it nor link the corresponding object file.  But you
never know...

Various options control xgemtext's behavior while extracting the strings
from your resources, see the manual page for detailed information.  Of
course it does not extract only strings from objects of type G_TEXT,
G_STRING etc. but from all kinds of objects containing text, including
icons and bitmaps.

So what's the purpose of messages.c if not compiling?  You can run
it (together with your other sources) through xgettext and now you have
your resource strings in a .po file that can be processed with all the
tools from the GNU gettext package.


5. The library rintl
====================

Your resource file hasn't been patched by xgemtext.  So, how to get the
translated strings into the objects where they should go?  You load your
resource as usual.  Now you simply have to take the habit to call
gettree() for each object tree after fixing the object addresses via
rsrc_gaddr(). The library function gettree() will walk thru the specified
object tree replacing each pointer to a string with a pointer to the
translated versions.  It will also modify some members of the internal
data structures (string length etc.) to the appropriate values.

For simple objects this is all that you have to do.  For more complicated
ones translating the strings might leave your carefully designed objects a
mess, beginning with menues.  Remember that all string lengths may have
been arbitrarily changed, resulting in objects that might hide each other
or ugly looking gaps between objects that were meant to be close to each
other.

You will find numerous functions in the library rintl that will try to
bring your objects back to a tolerable outer appearance again, each of
them being called rnl<object_type>().  You will also find a function
called rnltree() that will try to do the job for entire object trees.  You
can control the behavior of these adapt functions in numerous ways that
should meet as many requirements as possible.  You can specify spaces to
be left between objects, margins to the tree borders, minimum widths of
button objects and so on.  You can also tell the functions not to walk
thru the whole object tree but stopping at a certain level, thus allowing
you to process different parts of your objects each with individual values
for the control flags and variables.

Menus usually require special treatment.  rnlmenu() will usually to the
job in a satisfying manner.

As you might guess the result still can't be perfect.  Your objects might
need a little (or a whole lot of) fine tuning after being translated.
Special care is also expected from the translator.  She has to follow some
simple rules while doing her job.  You yourself, the programmer, should
follow some rules when designing your object trees.  The rnltree()
function will work best if it finds all object children in rows and
columns of constant height. This can be easily achieved by putting object
children belonging logically together into visible or invisible boxes.

For each library function you will find a manual page describing its
functionality in detail.

In the subdirectory ``example'' after building the binaries you will find
a simple example program showing all this in a simple manner.  Set the
envariable LANGUAGE to one of the values de (German), it (Italian), fr
(French), nl (Dutch), ga (Irish) or pt (portuguese) to select
your preferred language.  Setting the envariable to an unknown
value will make the example program choose the default values
``C'' (in this case English).

Select the menu entry ``International...'' to see a simple text displayed
in your selected language.  The entry ``Language...'' might allow you to
change the language even at runtime but this depends on the existence of
the putenv() function in your standard C library.  If putenv() is not
found the entry will be disabled. You have to change the language on the
command line.

In the ``Options'' menu you can follow the metamorphosis of a GEM dialog,
first without any treatment, then after a call to rnltree() and finally
after some additional beauty cures.  See the source and the manual
pages rnltree(3) and rnlpush(3) for details.


6. Charsets: Atari versus ISO-Latin 1
=====================================

There's a dilemma with the i18n thing.  On other platforms (especially of
the workstation sector) you can not only localize the language but also
the charset to utilize.  AFAIK there's no C library available that allows
you to do so on Atari.

A couple of years ago there was but one choice for a charset in the Atari
sector, the charset of the Atari's built-in system font.  Now more and
more Atari users have switched to the universally used ISO-Latin 1 charset
also known as i-8859-1, especially when using a command line interpreter.
If you use already internationalized GNU packages you will come across
this charset, too.  (Note that both ISO-Latin 1 and the Atari charset are
intended for use with western european languages, the Atari's charset
to a certain extent also for Hebrew and Greek.  Hence the following refers
only to those languages).

I have chosen the following way:  As it is very unlikely that you use an
ISO-Latin 1 charset with GEM, the files containing the messages for the
example program - which has a GUI - have to be read with the Atari
charset. But these files shouldn't go into your locale directory.  The
whole thing is provided as an example and should stay where it is (i. e.
it isn't installed).

For xgemtext you have the choice. If you use an ISO-Latin-1 font (as you
should do) simply follow the instructions in ABOUT-NLS.  If you insist on
using a font with an Atari codeset set the environment variable LANGUAGE
(or LC_ALL or LC_xxx) to <ll>.atarist where <ll> should specify the
language you want to use.  When installing xgemtext the appropriate
directories in your locale directory get created.  They will be called
<localedir>/<ll>.atarist/LC_MESSAGES.


7. Miscellanea
==============

Now something that might leave you a little uneasy:  The translating
project relies on the original programs being written in English.  Say you
have written your program in Rhaeto-Romance and you want to have it
translated into Irish.  Well, the diligent translator from the Aron
Islands is lucky enough to understand English and she probably won't
bother to learn Rhaeto-Romance.  The only possibility to solve the problem
is to agree upon one base language and this language is English.  If you
have made it until here through this document I'm sure you will manage to
write the messages in your sources and resources in English, too.

Of course you will have to explain your users how to benefit from the new
i18n features of your program.  Include the file ABOUT-NLS that is part
of the GNU gettext distribution.  This will tell your users the
necessary measures to take.

You should also include a copy of the file ATARI.NLS in your sources.
This file is intended to be read by translators.  They will find some
useful hints for translating messages for GEM programs.

Note that gemtext is *not* part of the GNU project!  Bug reports,
comments, compliments should be sent to the author, flames and criticism
to /dev/null.  Please also note that everything you find in the directory
intl *is* part of the GNU gettext package.  Thus, any comments concerning
files in this subdirectory should be directed to the maintainer of the GNU
gettext package.  There are some other files that consist partly or
entirely of copyrighted material by the Free Software Foundation.  If in
doubt check the header texts.

Hope you have fun with gemtext!

Guido Flohr <gufl0000@stud.uni-sb.de>.