This is Info file gettext.info, produced by Makeinfo version 1.68 from
the input file gettext.texi.

INFO-DIR-SECTION GNU Gettext Utilities
START-INFO-DIR-ENTRY
* Gettext: (gettext). GNU gettext utilities.
* gettextize: (gettext)gettextize Invocation. Prepare a package for gettext.
* msgfmt: (gettext)msgfmt Invocation. Make MO files out of PO files.
* msgmerge: (gettext)msgmerge Invocation. Update two PO files into one.
* xgettext: (gettext)xgettext Invocation. Extract strings into a PO file.
END-INFO-DIR-ENTRY

This file provides documentation for GNU `gettext' utilities. It
also serves as a reference for the free Translation Project.

Copyright (C) 1995, 1996, 1997 Free Software Foundation, Inc.

Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided that
the entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that this permission notice may be stated in a
translation approved by the Foundation.
|
||
|
||
|
||
File: gettext.info, Node: Entry Positioning, Next: Normalizing, Prev: Main PO Commands, Up: Basics
|
||
|
||
Entry Positioning
|
||
=================
|
||
|
||
The cursor in a PO file window is almost always part of an entry.
The only exceptions are when the cursor is after the last entry in the
file, or when the PO file is empty. The entry containing the cursor is
called the current entry. Many PO mode commands operate on the current
entry, so moving the cursor does more than let the translator browse
the PO file: it also selects the entry on which commands operate.
|
||
|
||
Some PO mode commands alter the position of the cursor in a
specialized way. A few of these special-purpose positioning commands
are described here; the others are described in the following sections.
|
||
|
||
`.'
     Redisplay the current entry.

`n'
     Select the entry after the current one.

`p'
     Select the entry before the current one.

`<'
     Select the first entry in the PO file.

`>'
     Select the last entry in the PO file.

`m'
     Record the location of the current entry for later use.

`r'
     Return to a previously saved entry location.

`x'
     Exchange the current entry location with the previously saved one.
|
||
|
||
Any GNU Emacs command able to reposition the cursor may be used to
select the current entry in PO mode, including commands which move by
characters, lines, paragraphs, screens or pages, and search commands.
However, PO mode has a standard way of displaying the current entry,
which the usual GNU Emacs cursor motion commands make no particular
effort to enforce. The sole purpose of the command `.'
(`po-current-entry') is to redisplay the current entry properly after
it has been changed by means external to PO mode, or after the Emacs
screen has been otherwise altered.
|
||
|
||
It is yet to be decided whether PO mode helps the translator, or
irritates her instead, by forcing a rigid window disposition while she
is doing her work. We originally had quite precise ideas about how
windows should behave, but on the other hand, anyone used to GNU Emacs
is often happy to keep full control. A fixed window disposition might
be offered, on an experimental basis, as a PO mode option that the
translator could activate or deactivate at will. If nobody feels a
real need for using it, or a compulsion for writing it, we should drop
this whole idea. The incentive for doing it should come from
translators rather than programmers, as opinions from an experienced
translator are surely worth more to me than opinions from programmers
*thinking* about how *others* should do translation.
|
||
|
||
The commands `n' (`po-next-entry') and `p' (`po-previous-entry')
move the cursor to the entry following, or preceding, the current one.
If `n' is given while the cursor is on the last entry of the PO file,
or if `p' is given while the cursor is on the first entry, the cursor
does not move.
|
||
|
||
The commands `<' (`po-first-entry') and `>' (`po-last-entry') move
the cursor to the first entry, or last entry, of the PO file. When the
cursor is located past the last entry in a PO file, most PO mode
commands will return an error saying `After last entry'. The commands
`<' and `>', however, have the special property of working even when
the cursor is not inside any PO file entry, so one may use them to
correct this situation. But even these commands will fail on a truly
empty PO file. There are plans for PO mode to fill an empty PO file
interactively from sources. *Note Marking::.
|
||
|
||
The translator may decide, before working on the translation of a
particular entry, that she needs to browse the remainder of the PO
file, perhaps to find the terminology or phraseology used in related
entries. She can of course use the standard Emacs idioms for saving
the current cursor location in some register and using that register
to get back, or else use the location ring.
|
||
|
||
PO mode offers another approach, by which cursor locations may be
saved onto a special stack. The command `m' (`po-push-location')
merely adds the location of the current entry to the stack, pushing the
already saved locations under the new one. The command `r'
(`po-pop-location') consumes the top stack element and repositions the
cursor to the entry associated with that top element. This position is
then lost, so the next `r' will move the cursor to the previously
saved location, and so on until no locations remain on the stack.
|
||
|
||
If the translator wants the position to be kept on the location
stack, perhaps to take a look at the entry associated with the top
element and then go elsewhere with the intent of getting back later,
she ought to use `m' immediately after `r'.
|
||
|
||
The command `x' (`po-exchange-location') simultaneously repositions
the cursor to the entry associated with the top element of the stack of
saved locations, and replaces that top element with the location of the
current entry before the move. Consequently, repeating the `x' command
toggles between two entries. To achieve this, the translator positions
the cursor on the first entry, uses `m', then moves to the second
entry, and merely uses `x' to make the switch.
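
The stack discipline behind `m', `r' and `x' can be modelled in a few
lines of C. This is only an illustrative sketch, not the actual PO mode
code (which is written in Emacs Lisp): entry locations are reduced to
plain integers, and the names `location_stack', `push_location',
`pop_location' and `exchange_location' are invented for the example.

```c
#include <stddef.h>

#define MAX_LOCATIONS 32        /* arbitrary capacity for this sketch */

struct location_stack
{
  int saved[MAX_LOCATIONS];     /* saved entry locations, top is last */
  size_t depth;                 /* number of saved locations */
};

/* `m': save CURRENT on top of the stack.  Returns 0 if the stack is
   full, 1 otherwise.  */
int
push_location (struct location_stack *stack, int current)
{
  if (stack->depth == MAX_LOCATIONS)
    return 0;
  stack->saved[stack->depth++] = current;
  return 1;
}

/* `r': consume the top element and move *CURRENT there.  Returns 0 and
   leaves *CURRENT alone if nothing is saved.  */
int
pop_location (struct location_stack *stack, int *current)
{
  if (stack->depth == 0)
    return 0;
  *current = stack->saved[--stack->depth];
  return 1;
}

/* `x': swap *CURRENT with the top element; the depth is unchanged.  */
int
exchange_location (struct location_stack *stack, int *current)
{
  int top;

  if (stack->depth == 0)
    return 0;
  top = stack->saved[stack->depth - 1];
  stack->saved[stack->depth - 1] = *current;
  *current = top;
  return 1;
}
```

Because `exchange_location' stores the old position where it took the
new one from, calling it twice in a row brings the cursor back to where
it started, which is the toggling behaviour of repeated `x'.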
|
||
|
||
|
||
File: gettext.info, Node: Normalizing, Prev: Entry Positioning, Up: Basics
|
||
|
||
Normalizing Strings in Entries
|
||
==============================
|
||
|
||
There are many different ways of encoding a particular string into a
PO file entry, because there are so many different ways to split and
quote multi-line strings, and even to represent special characters by
backslashed escape sequences. Some features of PO mode rely on the
ability of PO mode to scan an already existing PO file for a
particular string encoded into the `msgid' field of some entry. Even
though PO mode has all the built-in machinery for implementing this
recognition easily, doing it fast is technically difficult. To
facilitate a solution to this efficiency problem, we decided on a
canonical representation for strings.
|
||
|
||
A conventional representation of strings in a PO file is currently
|
||
under discussion, and PO mode experiments with a canonical
|
||
representation. Having both `xgettext' and PO mode converging towards
|
||
a uniform way of representing equivalent strings would be useful, as
|
||
the internal normalization needed by PO mode could be automatically
|
||
satisfied when using `xgettext' from GNU `gettext'. An explicit PO
|
||
mode normalization should then be only necessary for PO files imported
|
||
from elsewhere, or for when the convention itself evolves.
|
||
|
||
To normalize at least those strings of a given PO file that need a
canonical representation, the following PO mode command is available:
|
||
|
||
`M-x po-normalize'
|
||
Tidy the whole PO file by making entries more uniform.
|
||
|
||
The special command `M-x po-normalize', which has no associated key,
revises all entries, ensuring that strings of both original and
translated entries use uniform internal quoting in the PO file. It
also removes any crumbs left after the last entry. This command may be
useful for PO files freshly imported from elsewhere, or if we ever
improve on the canonical quoting format we use. This canonical format
is meant not only for getting cleaner PO files, but also for greatly
speeding up `msgid' string lookup for some other PO mode commands.
|
||
|
||
`M-x po-normalize' presently makes three passes over the entries.
The first implements heuristics for converting PO files for GNU
`gettext' 0.6 and earlier, in which `msgid' and `msgstr' fields used
K&R style C string syntax for multi-line strings. These heuristics may
fail for comments not related to obsolete entries and ending with a
backslash; they also depend on subsequent passes for finalizing the
proper commenting of continued lines for obsolete entries. This first
pass might disappear once all old PO files have been adjusted. The
second and third passes normalize all `msgid' and `msgstr' strings
respectively. They also clean out the trailing backslashes used by
XView's `msgfmt' for continued lines.
|
||
|
||
Having such an explicit normalizing command allows for importing PO
files from other sources, and also eases the evolution of the current
convention, which for now is driven mostly by aesthetic concerns. It
is easy to make suggested adjustments at a later time, as the
normalizing command and, eventually, other GNU `gettext' tools should
greatly automate conformance. A description of the canonical string
format is given below, for the particular benefit of those who do not
have GNU Emacs handy and would nevertheless like to handcraft their PO
files in nice ways.
|
||
|
||
Right now, in PO mode, strings are single line or multi-line. A
|
||
string goes multi-line if and only if it has *embedded* newlines, that
|
||
is, if it matches `[^\n]\n+[^\n]'. So, we would have:
|
||
|
||
msgstr "\n\nHello, world!\n\n\n"
|
||
|
||
but, replacing the space by a newline, this becomes:
|
||
|
||
     msgstr ""
     "\n"
     "\n"
     "Hello,\n"
     "world!\n"
     "\n"
     "\n"
|
||
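The single-line versus multi-line rule can be checked without a regular
expression library. The following is a minimal sketch of the test
`[^\n]\n+[^\n]' in C; the function name `has_embedded_newline' is made
up for the example and is not part of PO mode.

```c
#include <string.h>

/* Return 1 if S contains an embedded newline, that is, a run of one or
   more newlines with a non-newline character on each side, and 0
   otherwise.  Leading and trailing newlines alone do not count.  */
int
has_embedded_newline (const char *s)
{
  const char *newline;

  /* Leading newlines have no non-newline character before them.  */
  while (*s == '\n')
    s++;
  if (*s == '\0')
    return 0;

  /* S now starts with a non-newline character, so the first newline
     found (if any) is necessarily preceded by one.  */
  newline = strchr (s, '\n');
  if (newline == NULL)
    return 0;

  /* Skip the whole run; anything left after it completes the match.  */
  while (*newline == '\n')
    newline++;
  return *newline != '\0';
}
```

With the two strings from the example, the function returns 0 for
"\n\nHello, world!\n\n\n" and 1 for "\n\nHello,\nworld!\n\n\n".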
|
||
This is a deliberately exaggerated example, to make the point
clearer. Multi-line strings usually do not look that bad. We will
probably implement the following suggestion: lump all initial newlines
together into the first string, in place of the empty string, and
likewise lump together all newlines introducing empty lines (that is,
for a run of N > 1 newlines, the last N-1 of them would go together in
a separate string), so that the previous example would appear as:
|
||
|
||
     msgstr "\n\n"
     "Hello,\n"
     "world!\n"
     "\n\n"
|
||
|
||
A few minor points about string normalization are still undecided;
they will be documented in this manual once these questions are
settled.
|
||
|
||
|
||
File: gettext.info, Node: Sources, Next: Initial, Prev: Basics, Up: Top
|
||
|
||
Preparing Program Sources
|
||
*************************
|
||
|
||
For the programmer, changes to the C source code fall into three
|
||
categories. First, you have to make the localization functions known
|
||
to all modules needing message translation. Second, you should
|
||
properly trigger the operation of GNU `gettext' when the program
|
||
initializes, usually from the `main' function. Last, you should
|
||
identify and especially mark all constant strings in your program
|
||
needing translation.
|
||
|
||
Presuming that your set of programs, or package, has been adjusted
|
||
so all needed GNU `gettext' files are available, and your `Makefile'
|
||
files are adjusted (*note Maintainers::.), each C module having
|
||
translated C strings should contain the line:
|
||
|
||
#include <libintl.h>
|
||
|
||
The remaining changes to your C sources are discussed in the further
|
||
sections of this chapter.
|
||
|
||
* Menu:
|
||
|
||
* Triggering:: Triggering `gettext' Operations
|
||
* Mark Keywords:: How Marks Appear in Sources
|
||
* Marking:: Marking Translatable Strings
|
||
* c-format:: Telling something about the following string
|
||
* Special cases:: Special Cases of Translatable Strings
|
||
|
||
|
||
File: gettext.info, Node: Triggering, Next: Mark Keywords, Prev: Sources, Up: Sources
|
||
|
||
Triggering `gettext' Operations
|
||
===============================
|
||
|
||
The initialization of locale data should be done with more or less
|
||
the same code in every program, as demonstrated below:
|
||
|
||
     #include <libintl.h>
     #include <locale.h>

     int
     main (int argc, char **argv)
     {
       ...
       setlocale (LC_ALL, "");
       bindtextdomain (PACKAGE, LOCALEDIR);
       textdomain (PACKAGE);
       ...
     }
|
||
|
||
PACKAGE and LOCALEDIR should be provided either by `config.h' or by
|
||
the Makefile. For now consult the `gettext' sources for more
|
||
information.
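
As a minimal sketch, the three initialization calls can be gathered
into one function which `main' calls before printing anything. The
function name `initialize_translations' and the fallback values given
here for PACKAGE and LOCALEDIR are assumptions made for the example; a
real package would take both macros from `config.h' or the Makefile.

```c
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

/* Fallbacks for this sketch only; both values are made up.  */
#ifndef PACKAGE
# define PACKAGE "hello"
#endif
#ifndef LOCALEDIR
# define LOCALEDIR "/usr/local/share/locale"
#endif

/* Perform the three initialization calls.  Returns 0 on success and
   -1 if the message domain could not be set up.  */
int
initialize_translations (void)
{
  /* A null return here only means that the environment names a locale
     the system does not know; the program then stays in the "C"
     locale, so the result is deliberately not treated as fatal.  */
  setlocale (LC_ALL, "");

  if (bindtextdomain (PACKAGE, LOCALEDIR) == NULL)
    return -1;
  if (textdomain (PACKAGE) == NULL)
    return -1;
  return 0;
}
```

Keeping the calls in one place makes it easy to replace the `LC_ALL'
line later without touching the rest of `main'.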
|
||
|
||
The use of `LC_ALL' might not be appropriate for you. `LC_ALL'
includes all locale categories, and in particular `LC_CTYPE'. This
latter category determines character classes for `isalnum' and the
other functions from `ctype.h', and the result can be wrong for
programs which process some kind of input language. For example, it
would mean that source code using the c-cedilla character is accepted
in France but not in the U.S.
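
One way around this, for a program that parses its own input language,
is to classify characters without consulting `LC_CTYPE' at all. The
sketch below is an assumption about how such a program might do it, not
something GNU `gettext' provides: `ascii_isalnum' and `is_identifier'
are invented names, and the code presumes an ASCII-compatible character
set.

```c
/* Classify C as alphanumeric using ASCII rules only, whatever the
   current `LC_CTYPE' setting is.  */
int
ascii_isalnum (int c)
{
  return (c >= '0' && c <= '9')
         || (c >= 'a' && c <= 'z')
         || (c >= 'A' && c <= 'Z');
}

/* Return 1 if S is an identifier of a hypothetical input language: a
   nonempty run of ASCII letters, digits and underscores which does not
   start with a digit.  Return 0 otherwise.  */
int
is_identifier (const char *s)
{
  if (*s == '\0' || (*s >= '0' && *s <= '9'))
    return 0;
  for (; *s != '\0'; s++)
    if (!ascii_isalnum ((unsigned char) *s) && *s != '_')
      return 0;
  return 1;
}
```

With these functions a name containing the byte 0xE7 (c-cedilla in ISO
8859-1) is rejected in every locale, so the same source is accepted or
refused consistently wherever the program runs.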
|
||
|
||
Some systems also have problems parsing numbers with the `scanf'
functions when a locale other than the `"C"' locale is in effect. The
standards say that formats in addition to the one known in the `"C"'
locale may be recognized, but some systems seem to reject numbers in
the `"C"' locale format. In some situations the notation itself makes
it impossible to tell whether a number is in the `"C"' locale format or
in the local format. This can happen when thousands separator
characters are used: following national conventions, some locales
define this character to be `.', which is the same character the `"C"'
locale uses for the decimal point.
|
||
|
||
So it is sometimes necessary to replace the `LC_ALL' line in the
|
||
code above by a sequence of `setlocale' lines
|
||
|
||
     {
       ...
       setlocale (LC_TIME, "");
       setlocale (LC_MESSAGES, "");
       ...
     }
|
||
|
||
or to switch the locale category in question back and forth as needed.
On all POSIX conformant systems the locale categories `LC_CTYPE',
`LC_COLLATE', `LC_MONETARY', `LC_NUMERIC', and `LC_TIME' are available.
Some modern systems also have a category `LC_MESSAGES', which is
called `LC_RESPONSES' on some old, XPG2 compliant systems.
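
The effect of setting categories one by one can be verified by querying
a category that was left alone. The following sketch uses an invented
function name, `set_selected_categories', and guards `LC_MESSAGES' with
`#ifdef' on the assumption that the macro is simply absent on systems
lacking the category.

```c
#include <locale.h>
#include <string.h>

/* Adopt the user's conventions for dates and messages only.  Returns 1
   if `LC_NUMERIC' is still the "C" locale afterwards, else 0.  */
int
set_selected_categories (void)
{
  setlocale (LC_TIME, "");
#ifdef LC_MESSAGES              /* not provided by every system */
  setlocale (LC_MESSAGES, "");
#endif

  /* Passing a null pointer queries a category without changing it.  */
  return strcmp (setlocale (LC_NUMERIC, NULL), "C") == 0;
}
```

Since a program starts in the `"C"' locale, number parsing with `scanf'
keeps using the `"C"' decimal point after this call.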
|
||
|
||
|
||
File: gettext.info, Node: Mark Keywords, Next: Marking, Prev: Triggering, Up: Sources
|
||
|
||
How Marks Appear in Sources
===========================
|
||
|
||
All strings requiring translation should be marked in the C sources.
Marking is done in such a way that each translatable string appears to
be the sole argument of some function or preprocessor macro. There are
only a few such functions or macros meant for translation, and their
names are called marking keywords. The marking is attached to the
strings themselves, rather than to what we do with them. This approach
is more widely applicable. A clear example is an error message
produced by formatting: the format string needs translation, as do some
strings inserted through a `%s' specification in the format, while the
result from `sprintf' may have so many different instances that it is
impractical to list them all in some `error_string_out()' routine, say.
|
||
|
||
This marking operation has two goals. The first goal of marking is
to trigger the retrieval of the translation at run time. The keywords
may be resolved into a routine able to dynamically return the proper
translation, as far as possible or wanted, for the argument string.
Most localizable strings are found in executable positions, that is,
attached to variables or given as parameters to functions. But this is
not universal usage, and some translatable strings appear in
structured initializations. *Note Special cases::.
|
||
|
||
The second goal of the marking operation is to help `xgettext'
properly extract all translatable strings when it scans a set of
program sources and produces PO file templates.
|
||
|
||
The canonical keyword for marking translatable strings is `gettext';
it gave its name to the whole GNU `gettext' package. For packages
making only light use of the `gettext' keyword, macro or function, it
is easily used *as is*. However, for packages using the `gettext'
interface more heavily, it is usually more convenient to give the main
keyword a shorter, less obtrusive name. Indeed, the keyword might
appear on a lot of strings all over the package, and programmers
usually neither want nor need their program sources to remind them
forcefully, all the time, that they are internationalized. Further, a
long keyword has the disadvantage of using more horizontal space,
forcing more indentation work on sources for those trying to keep them
within 79 or 80 columns.
|
||
|
||
Many packages use `_' (a single underscore) as a keyword, and write
`_("Translatable string")' instead of `gettext ("Translatable
string")'. Further, the rule from the GNU coding standards requiring a
space between the keyword and the opening parenthesis is relaxed, in
practice, for this particular usage. So, the textual overhead per
translatable string is reduced to only three characters: the
underscore and the two parentheses. However, even though GNU `gettext'
uses this convention internally, it does not offer it officially. The
real, genuine keyword is `gettext'. It is fairly easy for those
wanting to use `_' instead of `gettext' to declare:
|
||
|
||
#include <libintl.h>
|
||
#define _(String) gettext (String)
|
||
|
||
instead of merely using `#include <libintl.h>'.
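
A short usage sketch follows. The function `greeting' and its message
are invented for the example; the behaviour relied upon is that
`gettext' returns its argument unchanged when no translation is found.

```c
#include <libintl.h>

#define _(String) gettext (String)

/* Return the greeting in the user's language.  When no message catalog
   provides a translation, `gettext' hands back the original string.  */
const char *
greeting (void)
{
  return _("Hello, world!");
}
```

The caller does not need to know whether a translation exists: the
returned pointer is always usable, and must not be modified or freed.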
|
||
|
||
Later on, the maintenance is relatively easy. If, as a programmer,
you add or modify a string, you will have to ask yourself if the new or
altered string requires translation, and include it within `_()' if you
think it should be translated. `"%s: %d"' is an example of a string
*not* requiring translation!
|
||
|
||
|
||
File: gettext.info, Node: Marking, Next: c-format, Prev: Mark Keywords, Up: Sources
|
||
|
||
Marking Translatable Strings
|
||
============================
|
||
|
||
In PO mode, one set of features is meant more for the programmer than
for the translator, and allows him to interactively mark which strings,
in a set of program sources, are translatable, and which are not. Even
though it is a fairly easy job for a programmer to find and mark such
strings by other means, using any editor of his choice, PO mode makes
this work more comfortable. Further, this gives translators who feel a
little like programmers, or programmers who feel a little like
translators, a tool letting them mark translatable strings in the
program sources while simultaneously producing a set of translations
in some language for the package being internationalized.
|
||
|
||
The set of program sources targeted by the PO mode commands
described here should have an Emacs tags table constructed for your
project, prior to using these PO file commands. This is easy to do.
In any shell window, change the directory to the root of your project,
then execute a command resembling:
|
||
|
||
etags src/*.[hc] lib/*.[hc]
|
||
|
||
presuming here you want to process all `.h' and `.c' files from the
|
||
`src/' and `lib/' directories. This command will explore all said
|
||
files and create a `TAGS' file in your root directory, somewhat
|
||
summarizing the contents using a special file format Emacs can
|
||
understand.
|
||
|
||
For packages following the GNU coding standards, there is a make
goal `tags' or `TAGS' which constructs the tag files in all directories
and for all files containing source code.
|
||
|
||
Once your `TAGS' file is ready, the following commands assist the
programmer in marking translatable strings in his set of sources. But
these commands are necessarily driven from within a PO file window, and
it is likely that you do not even have such a PO file yet. This is not
a problem at all, as you may safely open a new, empty PO file, mainly
for using these commands. This empty PO file will slowly fill in while
you mark strings as translatable in your program sources.
|
||
|
||
`,'
|
||
Search through program sources for a string which looks like a
|
||
candidate for translation.
|
||
|
||
`M-,'
|
||
Mark the last string found with `_()'.
|
||
|
||
`M-.'
|
||
Mark the last string found with a keyword taken from a set of
|
||
possible keywords. This command with a prefix allows some
|
||
management of these keywords.
|
||
|
||
The `,' (`po-tags-search') command searches for the next occurrence
of a string which looks like a possible candidate for translation, and
displays the program source in another Emacs window, positioned in such
a way that the string is near the top of this other window. If the
string is too big to fit whole in this window, it is positioned so only
its end is shown. In any case, the cursor is left in the PO file
window. If the shown string would be better presented differently in
different native languages, you may mark it using `M-,' or `M-.'.
Otherwise, you might rather ignore it and skip to the next string by
merely repeating the `,' command.
|
||
|
||
A string is a good candidate for translation if it contains a
|
||
sequence of three or more letters. A string containing at most two
|
||
letters in a row will be considered as a candidate if it has more
|
||
letters than non-letters. The command disregards strings containing no
|
||
letters, or isolated letters only. It also disregards strings within
|
||
comments, or strings already marked with some keyword PO mode knows
|
||
(see below).
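
The letter-counting part of this heuristic can be sketched in C as
follows. This is one possible reading of the rules above, not the code
PO mode actually runs: the function name `is_translation_candidate' is
invented, "letter" is taken to mean what `isalpha' accepts in the
`"C"' locale, and the checks for comments and for already marked
strings are left out.

```c
#include <ctype.h>

/* Return 1 if the contents S of a string literal look like a candidate
   for translation, 0 otherwise.  */
int
is_translation_candidate (const char *s)
{
  int letters = 0;              /* total number of letters */
  int others = 0;               /* total number of non-letters */
  int run = 0;                  /* length of the current letter run */
  int longest = 0;              /* length of the longest letter run */

  for (; *s != '\0'; s++)
    {
      if (isalpha ((unsigned char) *s))
        {
          letters++;
          if (++run > longest)
            longest = run;
        }
      else
        {
          others++;
          run = 0;
        }
    }

  if (longest >= 3)             /* three or more letters in a row */
    return 1;
  if (longest == 2)             /* at most two letters in a row */
    return letters > others;
  return 0;                     /* no letters, or isolated letters only */
}
```

Under this reading `"%s: %d"' is skipped, since its only letters are
isolated, while `"Hello"' and `"ok"' are both offered for marking.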
|
||
|
||
If you have never told Emacs about some `TAGS' file to use, the
|
||
command will request that you specify one from the minibuffer, the
|
||
first time you use the command. You may later change your `TAGS' file
|
||
by using the regular Emacs command `M-x visit-tags-table', which will
|
||
ask you to name the precise `TAGS' file you want to use. *Note Tag
|
||
Tables: (emacs)Tags.
|
||
|
||
Each time you use the `,' command, the search resumes from where it
|
||
was left by the previous search, and goes through all program sources,
|
||
obeying the `TAGS' file, until all sources have been processed.
|
||
However, by giving a prefix argument to the command (`C-u ,'), you may
|
||
request that the search be restarted all over again from the first
|
||
program source; but in this case, strings that you recently marked as
|
||
translatable will be automatically skipped.
|
||
|
||
Using this `,' command does not prevent using other regular Emacs
tags commands. For example, regular `tags-search' or
`tags-query-replace' commands may be used without disrupting the
independent `,' search sequence. However, as implemented, the
*initial* `,' command (or a `,' command used with a prefix) might
also reinitialize the regular Emacs tags searching to the first tags
file; this reinitialization might be considered spurious.
|
||
|
||
The `M-,' (`po-mark-translatable') command will mark the recently
found string with the `_' keyword. The `M-.'
(`po-select-mark-and-mark') command will request that you type one
keyword from the minibuffer and use that keyword for marking the
string. Both commands will automatically create a new untranslated PO
file entry for the string being marked, and make it the current entry
(making it easy for you to immediately proceed to its translation, if
you feel like doing it right away). It is possible that the
modifications made to the program source by `M-,' or `M-.' render some
source line longer than 80 columns, forcing you to break and re-indent
this line differently. You may use the `O' command from PO mode, or
any other window changing command from GNU Emacs, to break out into
the program source window, and do any needed adjustments. You will
have to use some regular Emacs command to return the cursor to the PO
file window if you want to use the `,' command for the next string,
say.
|
||
|
||
The `M-.' command has a few built-in speedups, so you do not have to
|
||
explicitly type all keywords all the time. The first such speedup is
|
||
that you are presented with a *preferred* keyword, which you may accept
|
||
by merely typing `<RET>' at the prompt. The second speedup is that you
|
||
may type any non-ambiguous prefix of the keyword you really mean, and
|
||
the command will complete it automatically for you. This also means
|
||
that PO mode has to *know* all your possible keywords, and that it will
|
||
not accept mistyped keywords.
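
The prefix speedup amounts to a lookup in the set of known keywords.
The following C sketch shows the idea under stated assumptions: the
function name `complete_keyword' is invented, an exact match is assumed
to win over longer keywords sharing the same prefix, and the handling
of the preferred keyword on an empty reply is left out.

```c
#include <stddef.h>
#include <string.h>

/* Resolve TYPED against the N known KEYWORDS.  An exact match wins;
   otherwise TYPED must be a prefix of exactly one keyword.  Returns
   the keyword, or NULL if TYPED is unknown or ambiguous.  */
const char *
complete_keyword (const char *typed, const char *const *keywords, size_t n)
{
  const char *match = NULL;
  size_t matches = 0;
  size_t length = strlen (typed);
  size_t i;

  for (i = 0; i < n; i++)
    {
      if (strcmp (keywords[i], typed) == 0)
        return keywords[i];
      if (strncmp (keywords[i], typed, length) == 0)
        {
          match = keywords[i];
          matches++;
        }
    }
  return matches == 1 ? match : NULL;
}
```

With the keywords `gettext', `_' and `gettext_noop' known, typing
`gettext_' selects `gettext_noop', typing `get' is refused as
ambiguous, and a mistyped `ngettext' is refused as unknown.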
|
||
|
||
If you reply `?' to the keyword request, the command gives a list of
|
||
all known keywords, from which you may choose. When the command is
|
||
prefixed by an argument (`C-u M-.'), it inhibits updating any program
|
||
source or PO file buffer, and does some simple keyword management
|
||
instead. In this case, the command asks for a keyword, written in
|
||
full, which becomes a new allowed keyword for later `M-.' commands.
|
||
Moreover, this new keyword automatically becomes the *preferred*
|
||
keyword for later commands. By typing an already known keyword in
|
||
response to `C-u M-.', one merely changes the *preferred* keyword and
|
||
does nothing more.
|
||
|
||
All keywords known for `M-.' are recognized by the `,' command when
scanning for strings, and strings already marked by any of those known
keywords are automatically skipped. If many PO files are opened
simultaneously, each one has its own independent set of known keywords.
There is currently no provision in PO mode for deleting a known
keyword; you have to quit the file (maybe using `q') and reopen it
afresh. When a PO file is newly brought up in an Emacs window, only
`gettext' and `_' are known as keywords, and `gettext' is preferred for
the `M-.' command. In fact, there would be no use in preferring `_',
as this one is already built into the `M-,' command.
|
||
|
||
|
||
File: gettext.info, Node: c-format, Next: Special cases, Prev: Marking, Up: Sources
|
||
|
||
Special Comments preceding Keywords
|
||
===================================
|
||
|
||
In C programs strings are often used within calls of functions from
|
||
the `printf' family. The special thing about these format strings is
|
||
that they can contain format specifiers introduced with `%'. Assume we
|
||
have the code
|
||
|
||
     printf (gettext ("String `%s' has %d characters\n"), s, (int) strlen (s));
|
||
|
||
A possible German translation for the above string might be:
|
||
|
||
"%d Zeichen lang ist die Zeichenkette `%s'"
|
||
|
||
A C programmer, even if he cannot speak German, will recognize that
there is something wrong here. The order of the two format specifiers
has changed, but of course the order of the arguments in the `printf'
call has not. This will most probably lead to problems, because the
length of the string is now regarded as the address.
|
||
|
||
To prevent errors at runtime caused by translations, the `msgfmt'
tool can check statically whether the arguments in the original and the
translation string match in type and number. If this is not the case, a
warning will be given, so the error cannot cause problems at runtime.
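
To give an idea of what such a static check involves, here is a
simplified sketch in C. It is not the code used by `msgfmt': the names
`collect_arg_types' and `format_args_match' are invented, only a subset
of `printf' syntax is understood, and arguments are compared by
conversion character alone, ignoring length modifiers.

```c
#include <ctype.h>
#include <string.h>

#define MAX_ARGS 16             /* arbitrary limit for this sketch */

/* Store in TYPES the conversion character used for each argument of
   FORMAT, indexed by argument position.  Handles `%%', an optional
   `N$' position, flags, width, precision and the modifiers h, l, L, z.
   Returns the number of arguments, or -1 on a malformed format.  */
static int
collect_arg_types (const char *format, char types[MAX_ARGS])
{
  int count = 0;                /* highest argument number seen */
  int next = 0;                 /* next unnumbered argument */
  const char *p;

  memset (types, 0, MAX_ARGS);
  for (p = format; *p != '\0'; p++)
    {
      const char *q;
      int index = next;
      int n = 0;

      if (*p != '%')
        continue;
      p++;
      if (*p == '%')
        continue;               /* literal percent sign */

      /* Digits followed by `$' select an argument explicitly.  */
      for (q = p; isdigit ((unsigned char) *q) && n < 1000; q++)
        n = n * 10 + (*q - '0');
      if (q > p && *q == '$')
        {
          if (n == 0)
            return -1;
          index = n - 1;
          p = q + 1;
        }
      else
        next++;

      while (*p != '\0' && strchr ("-+ #0123456789.hlLz", *p) != NULL)
        p++;
      if (*p == '\0' || index >= MAX_ARGS)
        return -1;
      types[index] = *p;
      if (index + 1 > count)
        count = index + 1;
    }
  return count;
}

/* Return 1 if MSGID and MSGSTR expect the same arguments, else 0.  */
int
format_args_match (const char *msgid, const char *msgstr)
{
  char id_types[MAX_ARGS], str_types[MAX_ARGS];
  int id_count = collect_arg_types (msgid, id_types);
  int str_count = collect_arg_types (msgstr, str_types);

  return id_count >= 0 && id_count == str_count
         && memcmp (id_types, str_types, MAX_ARGS) == 0;
}
```

Applied to the example, the German string with `%d' before `%s' is
rejected, because its first argument would have to be a number where
the program passes a string.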
|
||
|
||
To keep the word order of the above German translation and still be
correct, one would have to write
|
||
|
||
"%2$d Zeichen lang ist die Zeichenkette `%1$s'"
|
||
|
||
The routines in `msgfmt' know about this special notation.
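
The numbered notation can be tried directly in C on systems whose
`printf' family supports it (it comes from POSIX rather than from ISO
C). In this sketch the function name `format_german' is invented, and
the translated string is written inline where a real program would
obtain it from `gettext'.

```c
#include <stdio.h>
#include <string.h>

/* Format the German message about S into BUF.  The numbered specifiers
   pick arguments by position, so the argument list keeps the order
   used by the original English call.  Returns what `snprintf'
   returns.  */
int
format_german (char *buf, size_t size, const char *s)
{
  return snprintf (buf, size,
                   "%2$d Zeichen lang ist die Zeichenkette `%1$s'",
                   s, (int) strlen (s));
}
```

For the string `abc' the buffer receives "3 Zeichen lang ist die
Zeichenkette `abc'": the number is printed first although it is the
second argument.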
|
||
|
||
Because not all strings in a program are format strings, it is not
useful for `msgfmt' to test all the strings in the `.po' file. Doing
so might cause problems, because a string might contain what looks like
a format specifier without being used in `printf'.
|
||
|
||
Therefore `xgettext' adds a special tag to those messages it thinks
might be format strings. There is no absolute rule for this, only a
heuristic. In the `.po' file the entry is marked using the `c-format'
flag in the `#,' comment line (*note PO Files::.).
|
||
|
||
The careful reader now might say that this again can cause problems:
the heuristic might guess wrong. This is true, and therefore
`xgettext' knows about a special kind of comment which lets the
programmer take over the decision. If, in the same line as the
`gettext' keyword or in the immediately preceding line, the `xgettext'
program finds a comment containing the words `xgettext:c-format', it
will mark the string with the `c-format' flag in any case. This kind
of comment should be used when `xgettext' does not recognize the string
as a format string but it really is one and should be tested. Please
note that when the comment is in the same line as the `gettext'
keyword, it must come before the string to be translated.
|
||
|
||
This situation happens quite often. The `printf' function is often
called with strings which do not contain a format specifier. Of course
one would normally use `fputs', but it does happen. In this case
`xgettext' does not recognize the string as a format string; but what
happens if the translation introduces a valid format specifier? The
`printf' function will try to access one of the parameters, but none
exists because the original code does not pass any.
|
||
|
||
`xgettext' of course could make a wrong decision the other way
round: a string marked as a format string might not really be one. In
this case `msgfmt' might give too many warnings and would prevent
translating the `.po' file. The method to prevent this wrong decision
is similar to the one used above, only the comment to use must contain
the string `xgettext:no-c-format'.
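
Both kinds of comment might be used as in the following sketch. The
function names and messages are invented for the example. In the first
message the sequence `% o' happens to form a valid `printf' directive,
which is exactly the kind of text a heuristic can misjudge.

```c
#include <libintl.h>
#include <stdio.h>

/* The percent sign below is ordinary text and the message is never
   used as a format, so `xgettext' is told not to flag it.  */
int
format_discount_notice (char *buf, size_t size)
{
  /* xgettext:no-c-format */
  return snprintf (buf, size, "%s", gettext ("Save 50% on every order"));
}

/* This message contains no specifier, yet it is passed as the format
   itself, so the flag is forced and translations will be checked.  */
int
format_done_notice (char *buf, size_t size)
{
  /* xgettext:c-format */
  return snprintf (buf, size, gettext ("Processing finished.\n"));
}
```

In both functions the comment sits on the line immediately preceding
the one holding the `gettext' keyword, which is one of the two places
where `xgettext' looks for it.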
|
||
|
||
If a string is marked with `c-format' and this is not correct the
|
||
user can find out who is responsible for the decision. *Note xgettext
|
||
Invocation:: to see how the `--debug' option can be used for solving
|
||
this problem.
|
||
|
||
|
||
File: gettext.info, Node: Special cases, Prev: c-format, Up: Sources
|
||
|
||
Special Cases of Translatable Strings
|
||
=====================================
|
||
|
||
   The attentive reader might now point out that it is not always
possible to mark a translatable string with `gettext' or something
like it. Consider the following case:
|
||
|
||
     {
       static const char *messages[] = {
         "some very meaningful message",
         "and another one"
       };
       const char *string;
       ...
       string
         = index > 1 ? "a default message" : messages[index];

       fputs (string, stdout);
       ...
     }
|
||
|
||
While it is no problem to mark the string `"a default message"' it
|
||
is not possible to mark the string initializers for `messages'. What
|
||
is to be done? We have to fulfill two tasks. First we have to mark the
|
||
strings so that the `xgettext' program (*note xgettext Invocation::.)
|
||
can find them, and second we have to translate the strings at runtime
|
||
before printing them.
|
||
|
||
The first task can be fulfilled by creating a new keyword, which
|
||
names a no-op. For the second we have to mark all access points to a
|
||
string from the array. So one solution can look like this:
|
||
|
||
     #define gettext_noop(String) (String)

     {
       static const char *messages[] = {
         gettext_noop ("some very meaningful message"),
         gettext_noop ("and another one")
       };
       const char *string;
       ...
       string
         = index > 1 ? gettext ("a default message") : gettext (messages[index]);

       fputs (string, stdout);
       ...
     }
|
||
|
||
   Please convince yourself that the string which is written by `fputs'
is translated in any case. How to make `xgettext' aware of the
additional keyword `gettext_noop' is explained in *Note xgettext
Invocation::.
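
   For readers who want something they can compile, the fragment above
might be turned into a small function along the following lines. This
is a sketch, not the only way to do it: the name `message_for' is
invented, and the extra check for a negative index is an addition that
the fragment above does not have.

```c
#include <libintl.h>

/* No-op marker: lets xgettext find the string without translating it
   at initialization time.  */
#define gettext_noop(String) (String)

static const char *messages[] = {
  gettext_noop ("some very meaningful message"),
  gettext_noop ("and another one")
};

/* Hypothetical accessor: returns the translated message for INDEX,
   or a translated default message when INDEX is out of range.  */
const char *
message_for (int index)
{
  if (index < 0 || index > 1)
    return gettext ("a default message");

  /* The array holds untranslated strings, so the translation has to
     happen here, at the access point.  */
  return gettext (messages[index]);
}
```

   Every path through `message_for' passes through `gettext' exactly
once, which is the property the reader is asked to check.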
|
||
|
||
   The above is of course not the only solution. You could also come
up with the following one:
|
||
|
||
     #define gettext_noop(String) (String)

     {
       static const char *messages[] = {
         gettext_noop ("some very meaningful message"),
         gettext_noop ("and another one")
       };
       const char *string;
       ...
       string
         = index > 1 ? gettext_noop ("a default message") : messages[index];

       fputs (gettext (string), stdout);
       ...
     }
|
||
|
||
   But this has some drawbacks. First, the programmer has to take care
to use `gettext_noop' for the string `"a default message"'. Using
`gettext' there could in rare cases have unpredictable results. Second,
the internals of the GNU `gettext' library make this solution less
efficient.
|
||
|
||
   One advantage is that you need not do a control flow analysis to
make sure the output is really translated in any case. But this
analysis is generally not very difficult. If it does turn out to be
difficult in some situation, you can use this second method there.
|
||
|
||
|
||
File: gettext.info, Node: Initial, Next: Updating, Prev: Sources, Up: Top
|
||
|
||
Making the Initial PO File
|
||
**************************
|
||
|
||
* Menu:
|
||
|
||
* xgettext Invocation:: Invoking the `xgettext' Program
|
||
* C Sources Context:: C Sources Context
|
||
* Compendium:: Using Translation Compendiums
|
||
|
||
|
||
File: gettext.info, Node: xgettext Invocation, Next: C Sources Context, Prev: Initial, Up: Initial
|
||
|
||
Invoking the `xgettext' Program
|
||
===============================
|
||
|
||
xgettext [OPTION] INPUTFILE ...
|
||
|
||
`-a'
|
||
`--extract-all'
|
||
Extract all strings.
|
||
|
||
`-c [TAG]'
|
||
`--add-comments[=TAG]'
|
||
Place comment block with TAG (or those preceding keyword lines) in
|
||
output file.
|
||
|
||
`-C'
|
||
`--c++'
|
||
Recognize C++ style comments.
|
||
|
||
`--debug'
|
||
     Use the flags `c-format' and `possible-c-format' to show who was
     responsible for marking a message as a format string. The latter
     form is used if the `xgettext' program decided, the former form is
     used if the programmer prescribed it.
|
||
|
||
By default only the `c-format' form is used. The translator should
|
||
not have to care about these details.
|
||
|
||
`-d NAME'
|
||
`--default-domain=NAME'
|
||
Use `NAME.po' for output (instead of `messages.po').
|
||
|
||
The special domain name `-' or `/dev/stdout' means to write the
|
||
output to `stdout'.
|
||
|
||
`-D DIRECTORY'
|
||
`--directory=DIRECTORY'
|
||
Change to DIRECTORY before beginning to search and scan source
|
||
files. The resulting `.po' file will be written relative to the
|
||
original directory, though.
|
||
|
||
`-f FILE'
|
||
`--files-from=FILE'
|
||
Read the names of the input files from FILE instead of getting
|
||
them from the command line.
|
||
|
||
`--force'
|
||
Always write output file even if no message is defined.
|
||
|
||
`-h'
|
||
`--help'
|
||
Display this help and exit.
|
||
|
||
`-I LIST'
|
||
`--input-path=LIST'
|
||
List of directories searched for input files.
|
||
|
||
`-j'
|
||
`--join-existing'
|
||
Join messages with existing file.
|
||
|
||
`-k WORD'
|
||
`--keyword[=WORD]'
|
||
     Additional keyword to be looked for (without WORD means not to use
|
||
default keywords).
|
||
|
||
The default keywords, which are always looked for if not explicitly
|
||
disabled, are `gettext', `dgettext', `dcgettext' and
|
||
`gettext_noop'.
|
||
|
||
`-m [STRING]'
|
||
`--msgstr-prefix[=STRING]'
|
||
Use STRING or "" as prefix for msgstr entries.
|
||
|
||
`-M [STRING]'
|
||
`--msgstr-suffix[=STRING]'
|
||
Use STRING or "" as suffix for msgstr entries.
|
||
|
||
`--no-location'
|
||
Do not write `#: FILENAME:LINE' lines.
|
||
|
||
`-n'
|
||
`--add-location'
|
||
Generate `#: FILENAME:LINE' lines (default).
|
||
|
||
`--omit-header'
|
||
Don't write header with `msgid ""' entry.
|
||
|
||
This is useful for testing purposes because it eliminates a source
|
||
of variance for generated `.gmo' files. We can ship some of these
|
||
files in the GNU `gettext' package, and the result of regenerating
|
||
them through `msgfmt' should yield the same values.
|
||
|
||
`-p DIR'
|
||
`--output-dir=DIR'
|
||
Output files will be placed in directory DIR.
|
||
|
||
`-s'
|
||
`--sort-output'
|
||
Generate sorted output and remove duplicates.
|
||
|
||
`--strict'
|
||
Write out strict Uniforum conforming PO file.
|
||
|
||
`-v'
|
||
`--version'
|
||
Output version information and exit.
|
||
|
||
`-x FILE'
|
||
`--exclude-file=FILE'
|
||
Entries from FILE are not extracted.
|
||
|
||
Search path for supplementary PO files is:
|
||
`/usr/local/share/nls/src/'.
|
||
|
||
If INPUTFILE is `-', standard input is read.
|
||
|
||
This implementation of `xgettext' is able to process a few awkward
|
||
cases, like strings in preprocessor macros, ANSI concatenation of
|
||
adjacent strings, and escaped end of lines for continued strings.
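
   The awkward cases just listed might look as follows in a C source
file. This is an illustrative sketch only: all function and macro
names are invented, and the `TRANSLATORS:' tag in the comment is
merely a convention assumed here to show the kind of comment block
which `--add-comments=TAG' is meant to pick up.

```c
#include <libintl.h>

/* A translatable string inside a preprocessor macro.  */
#define GREETING gettext ("hello, world")

/* Hypothetical helpers, not part of GNU gettext.  */
const char *
greeting (void)
{
  return GREETING;
}

const char *
long_help (void)
{
  /* TRANSLATORS: shown by a (made-up) help option.  */
  return gettext ("first part of the message, "
                  "second part of the same message");
}

const char *
continued (void)
{
  /* Escaped end of line: the literal goes on at column 0 of the
     next source line.  */
  return gettext ("a string continued \
on the next line");
}
```

   In each case the message is a single string as far as the C
compiler is concerned, so it should end up as a single `msgid'.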
|
||
|
||
|
||
File: gettext.info, Node: C Sources Context, Next: Compendium, Prev: xgettext Invocation, Up: Initial
|
||
|
||
C Sources Context
|
||
=================
|
||
|
||
   PO mode is particularly powerful when used with PO files created
|
||
through GNU `gettext' utilities, as those utilities insert special
|
||
comments in the PO files they generate. Some of these special comments
|
||
relate the PO file entry to exactly where the untranslated string
|
||
appears in the program sources.
|
||
|
||
   When the translator gets to an untranslated entry, she is fairly
often faced with an original string which is not as informative as it
normally should be, being succinct, cryptic, or otherwise ambiguous.
Before choosing how to translate the string, she needs to understand
better what the string really means and how tight the translation has
to be. Most of the time, when problems arise, the only way left to
make her judgment is looking at the true program sources from where
this string originated, searching for surrounding comments the
programmer might have put in there, and looking around for helping
clues of *any* kind.
|
||
|
||
   Surely, when looking at program sources, the translator will receive
more help if she is a fluent programmer. However, even if she is not
versed in programming and feels a little lost in C code, the translator
should not be shy about taking a look, once in a while. It is most
probable that she will still be able to find some of the hints she
needs. She will learn quickly to not feel uncomfortable in program
code, paying more attention to programmer's comments, variable and
function names (if he dared to choose them well), and overall
organization, than to the programming itself.
|
||
|
||
   The following commands are meant to help the translator get
program source context for a PO file entry.
|
||
|
||
`s'
|
||
Resume the display of a program source context, or cycle through
|
||
them.
|
||
|
||
`M-s'
|
||
     Display a program source context selected by menu.
|
||
|
||
`S'
|
||
Add a directory to the search path for source files.
|
||
|
||
`M-S'
|
||
Delete a directory from the search path for source files.
|
||
|
||
The commands `s' (`po-cycle-reference') and `M-s'
|
||
(`po-select-source-reference') both open another window displaying some
|
||
source program file, and already positioned in such a way that it shows
|
||
an actual use of the string to be translated. By doing so, the command
|
||
gives source program context for the string. But if the entry has no
|
||
source context references, or if all references are unresolved along
|
||
the search path for program sources, then the command diagnoses this as
|
||
an error.
|
||
|
||
Even if `s' (or `M-s') opens a new window, the cursor stays in the
|
||
PO file window. If the translator really wants to get into the program
|
||
source window, she ought to do it explicitly, maybe by using command
|
||
`O'.
|
||
|
||
When `s' is typed for the first time, or for a PO file entry which
|
||
is different from the last one used for getting source context, then the
|
||
command reacts by giving the first context available for this entry, if
|
||
any. If some context has already been recently displayed for the
|
||
current PO file entry, and the translator wandered off to do other
|
||
things, typing `s' again will merely resume, in another window, the
|
||
context last displayed. In particular, if the translator moved the
|
||
cursor away from the context in the source file, the command will bring
|
||
the cursor back to the context. By using `s' many times in a row, with
|
||
no other commands intervening, PO mode will cycle to the next available
|
||
contexts for this particular entry, getting back to the first context
|
||
once the last has been shown.
|
||
|
||
   The command `M-s' behaves differently. Instead of cycling through
references, it lets the translator choose one particular reference
among many, and displays that reference. It is best used with
completion: if the translator types `TAB' immediately after `M-s', in
response to the question, she will be offered a menu of all possible
references, as a reminder of which are the acceptable answers. This
command is useful only where there are really many contexts available
for a single string to translate.
|
||
|
||
Program source files are usually found relative to where the PO file
|
||
stands. As a special provision, when this fails, the file is also
|
||
looked for, but relative to the directory immediately above it. Those
|
||
two cases take proper care of most PO files. However, it might happen
|
||
that a PO file has been moved, or is edited in a different place than
|
||
its normal location. When this happens, the translator should tell PO
mode in which directory the genuine PO file normally sits. Many such
|
||
directories may be specified, and all together, they constitute what is
|
||
called the "search path" for program sources. The command `S'
|
||
(`po-consider-source-path') is used to interactively enter a new
|
||
directory at the front of the search path, and the command `M-S'
|
||
(`po-ignore-source-path') is used to select, with completion, one of
|
||
the directories she does not want anymore on the search path.
|
||
|
||
|
||
File: gettext.info, Node: Compendium, Prev: C Sources Context, Up: Initial
|
||
|
||
Using Translation Compendiums
|
||
=============================
|
||
|
||
Compendiums are yet to be implemented.
|
||
|
||
   An upcoming PO mode feature will let the translator maintain a
|
||
compendium of already achieved translations. A "compendium" is a
|
||
special PO file containing a set of translations recurring in many
|
||
different packages. The translator will be given commands for adding
|
||
entries to her compendium, and later initializing untranslated entries,
|
||
or updating already translated entries, from translations kept in the
|
||
compendium. For this to work, however, the compendium would have to be
|
||
normalized. *Note Normalizing::.
|
||
|
||
|
||
File: gettext.info, Node: Updating, Next: Binaries, Prev: Initial, Up: Top
|
||
|
||
Updating Existing PO Files
|
||
**************************
|
||
|
||
* Menu:
|
||
|
||
* msgmerge Invocation:: Invoking the `msgmerge' Program
|
||
* Translated Entries::
|
||
* Fuzzy Entries::
|
||
* Untranslated Entries:: Untranslated Entries
|
||
* Obsolete Entries:: Obsolete Entries
|
||
* Modifying Translations:: Modifying Translations
|
||
* Modifying Comments:: Modifying Comments
|
||
* Auxiliary:: Consulting Auxiliary PO Files
|
||
|
||
|
||
File: gettext.info, Node: msgmerge Invocation, Next: Translated Entries, Prev: Updating, Up: Updating
|
||
|
||
Invoking the `msgmerge' Program
|
||
===============================
|
||
|
||
|
||
File: gettext.info, Node: Translated Entries, Next: Fuzzy Entries, Prev: msgmerge Invocation, Up: Updating
|
||
|
||
Translated Entries
|
||
==================
|
||
|
||
Each PO file entry for which the `msgstr' field has been filled with
|
||
a translation, and which is not marked as fuzzy (*note Fuzzy
|
||
Entries::.), is said to be a "translated" entry. Only translated
|
||
entries will later be compiled by GNU `msgfmt' and become usable in
|
||
programs. Other entry types will be excluded; translation will not
|
||
occur for them.
|
||
|
||
Some commands are more specifically related to translated entry
|
||
processing.
|
||
|
||
`t'
|
||
Find the next translated entry.
|
||
|
||
`M-t'
|
||
Find the previous translated entry.
|
||
|
||
   The commands `t' (`po-next-translated-entry') and `M-t'
(`po-previous-translated-entry') move forwards or backwards, chasing
for a translated entry. If none is found, the search is extended and
wraps around in the PO file buffer.
|
||
|
||
Translated entries usually result from the translator having edited
|
||
in a translation for them, *Note Modifying Translations::. However, if
|
||
the variable `po-auto-fuzzy-on-edit' is not `nil', the entry having
|
||
received a new translation first becomes a fuzzy entry, which ought to
|
||
be later unfuzzied before becoming an official, genuine translated
|
||
entry. *Note Fuzzy Entries::.
|
||
|
||
|
||
File: gettext.info, Node: Fuzzy Entries, Next: Untranslated Entries, Prev: Translated Entries, Up: Updating
|
||
|
||
Fuzzy Entries
|
||
=============
|
||
|
||
Each PO file entry may have a set of "attributes", which are
|
||
qualities given a name and explicitly associated with the entry
|
||
translation, using a special system comment. One of these attributes
|
||
has the name `fuzzy', and entries having this attribute are said to
|
||
have a fuzzy translation. They are called fuzzy entries, for short.
|
||
|
||
   Fuzzy entries, even if they account for translated entries for most
other purposes, usually call for revision by the translator. Those may
be produced by applying the program `msgmerge' to update an older
translated PO file according to a new PO template file, when this tool
hypothesises that some new `msgid' has been modified only slightly out
of an older one, and chooses to pair what it thinks to be the old
translation with the new modified entry. The slight alteration in the
original string (the `msgid' string) should often be reflected in the
translated string, and this requires the intervention of the
translator. For this reason, `msgmerge' might mark some entries as
being fuzzy.
|
||
|
||
Also, the translator may decide herself to mark an entry as fuzzy
|
||
for her own convenience, when she wants to remember that the entry has
|
||
to be later revisited. So, some commands are more specifically related
|
||
to fuzzy entry processing.
|
||
|
||
`f'
|
||
Find the next fuzzy entry.
|
||
|
||
`M-f'
|
||
Find the previous fuzzy entry.
|
||
|
||
`TAB'
|
||
Remove the fuzzy attribute of the current entry.
|
||
|
||
The commands `f' (`po-next-fuzzy') and `M-f' (`po-previous-fuzzy')
|
||
move forwards or backwards, chasing for a fuzzy entry. If none is
|
||
found, the search is extended and wraps around in the PO file buffer.
|
||
|
||
The command `TAB' (`po-unfuzzy') removes the fuzzy attribute
|
||
associated with an entry, usually leaving it translated. Further, if
|
||
the variable `po-auto-select-on-unfuzzy' is not `nil', the
|
||
`TAB' command will automatically chase for another interesting entry to
|
||
work on. The initial value of `po-auto-select-on-unfuzzy' is `nil'.
|
||
|
||
The initial value of `po-auto-fuzzy-on-edit' is `nil'. However, if
|
||
the variable `po-auto-fuzzy-on-edit' is set to `t', any entry edited
|
||
through the `RET' command is marked fuzzy, as a way to ensure some kind
|
||
of double check, later. In this case, the usual paradigm is that an
|
||
entry becomes fuzzy (if not already) whenever the translator modifies
|
||
it. If she is satisfied with the translation, she then uses `TAB' to
|
||
pick another entry to work on, clearing the fuzzy attribute at the same
time. If she is not satisfied yet, she merely uses `SPC' to chase
|
||
another entry, leaving the entry fuzzy.
|
||
|
||
   The translator may also use the `DEL' command (`po-fade-out-entry')
over any translated entry to mark it as being fuzzy, as an easy way to
leave a reminder that she wants to return to this entry later.
|
||
|
||
Also, when time comes to quit working on a PO file buffer with the
|
||
`q' command, the translator is asked for confirmation, if some fuzzy
entries still exist.
|
||
|
||
|
||
File: gettext.info, Node: Untranslated Entries, Next: Obsolete Entries, Prev: Fuzzy Entries, Up: Updating
|
||
|
||
Untranslated Entries
|
||
====================
|
||
|
||
When `xgettext' originally creates a PO file, unless told otherwise,
|
||
it initializes the `msgid' field with the untranslated string, and
|
||
leaves the `msgstr' string empty. Such entries, having an empty
|
||
translation, are said to be "untranslated" entries. Later, when the
|
||
programmer slightly modifies some string right in the program, this
|
||
change is later reflected in the PO file by the appearance of a new
|
||
untranslated entry for the modified string.
|
||
|
||
The usual commands moving from entry to entry consider untranslated
|
||
entries on the same level as active entries. Untranslated entries are
|
||
easily recognizable by the fact they end with `msgstr ""'.
|
||
|
||
The work of the translator might be (quite naively) seen as the
|
||
process of seeking after an untranslated entry, editing a translation
|
||
for it, and repeating these actions until no untranslated entries
|
||
remain. Some commands are more specifically related to untranslated
|
||
entry processing.
|
||
|
||
`u'
|
||
Find the next untranslated entry.
|
||
|
||
`M-u'
|
||
Find the previous untranslated entry.
|
||
|
||
`k'
|
||
Turn the current entry into an untranslated one.
|
||
|
||
The commands `u' (`po-next-untranslated-entry') and `M-u'
|
||
(`po-previous-untranslated-entry') move forwards or backwards, chasing
|
||
for an untranslated entry. If none is found, the search is extended
|
||
and wraps around in the PO file buffer.
|
||
|
||
An entry can be turned back into an untranslated entry by merely
|
||
emptying its translation, using the command `k' (`po-kill-msgstr').
|
||
*Note Modifying Translations::.
|
||
|
||
Also, when time comes to quit working on a PO file buffer with the
|
||
`q' command, the translator is asked for confirmation, if some
|
||
untranslated string still exists.
|
||
|