catexstr(1)                      DG/UX 5.4.2                     catexstr(1)


NAME
       catexstr - extract strings from source files, replace with catgets
       calls.

SYNOPSIS
       catexstr [-llang] [-ffmt] [-ccat] [-bbeg] [-eend] file ... > strings
       catexstr -r [-llang] [-ffmt] [-ccat] [-bbeg] [-eend] file < strings
       > file.new

DESCRIPTION
       The catexstr utility is used to extract strings from source files and
       replace them with calls to the X/Open-style message retrieval
       function or command (see catgets(1,3C)), and generate a message
       file (.msg file) that contains the messages.  The .msg file can
       then be translated into other natural languages.  The source files
       may contain C language source, or source code in other languages,
       such as shell scripts.

       The catexstr utility has the following options:

       -r      Runs pass two of catexstr (replace mode), generating a new
               version of the source file on the standard output, and
               simultaneously generating a message file (.msg file).

       -llang  Specifies the source code language of the file(s) being
               manipulated.  The choices that are recognized are c, sh
               (shell script), and gen (generic).  The -l option establishes
               values to be used as the format of the string and the name of
               the catalog to be inserted into the new source file, and the
               strings that will be recognized as the beginning and end of
               comments.  These may be overridden with the other options
               listed here.

       -ffmt   Specifies the format string to be used when creating the
               modified version of the source code file.  The default
               formats for various languages are shown below.

       -ccat   Specifies the catalog name used when creating the modified
               version of the source code file.  This name is inserted into
               the source code file; it is not used as the name of the .msg
               file to be created.

       -bbeg   Specifies the string to be treated as the beginning of a
               comment.

       -eend   Specifies the string to be treated as the end of a comment.
               This may be one or two bytes long.  Nesting of comments is
               not recognized.

       If none of -l, -f, -c, -b, or -e is specified, then -lc is assumed
       (for compatibility with earlier versions of catexstr).  If a source
       code language is specified with -l, then the default values
       associated with that language (shown below) are assumed.  These
       defaults may be overridden with the other options described above.



Licensed material--property of copyright holder(s)                         1
       If -l is not used, but one of -f, -c, -b, or -e is, then -lgen is
       assumed.  The default values for each of the supported languages are:

          Lang         Format string         catalog   comment   comment
                                               name     begin      end
          ---------------------------------------------------------------
          c      catgets(%s, %d, %d, "%s")   catd      /*        */
          sh     `catgets %s %d %d "%s"`     *         #         \n
          gen    catgets %s %d %d "%s"       *         none      \n

       The parameters passed to sprintf in conjunction with the format
       strings are, respectively:
           the catalog name, as specified here or with the -c option;
           the message set number;
           the message number; and
           the message text.

       * For languages sh and gen, the default catalog name is the name of
       the source file with any existing extension stripped off and .cat
       appended.
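
       As a minimal sketch of how the format strings and the four
       parameters combine, the following C fragment builds the replacement
       text with snprintf.  The helper names default_format and
       format_replacement are hypothetical and are not part of catexstr;
       only the format strings themselves come from the table above.

```c
#include <stdio.h>
#include <string.h>

/* Default format string for a -l language name, per the table above.
 * Hypothetical helper; anything other than "c" or "sh" is treated as gen. */
const char *default_format(const char *lang)
{
    if (strcmp(lang, "c") == 0)
        return "catgets(%s, %d, %d, \"%s\")";
    if (strcmp(lang, "sh") == 0)
        return "`catgets %s %d %d \"%s\"`";
    return "catgets %s %d %d \"%s\"";
}

/* Build the text that replaces one string in the source file.
 * Arguments follow the order given in the man page: catalog name,
 * set number, message number, message text.
 * Returns the length written, or -1 if buf is too small. */
int format_replacement(char *buf, size_t size, const char *fmt,
                       const char *cat, int setnum, int msgnum,
                       const char *text)
{
    int n = snprintf(buf, size, fmt, cat, setnum, msgnum, text);

    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

       Called with default_format("c"), catalog name catd, set 1, message
       3, and the text Hello, this sketch yields a call of the same shape
       as those in the EXAMPLES section.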

       In pass one (without the -r option), catexstr extracts a list of
       strings from the named source files, with positional information.
       This list is produced on standard output in the following format:

                   file:line:position:length:setnum:msgnum:"string"

                   file      the name of the source file
                   line      line number in the file
                   position  character position in the line
                   length    length of the original string
                   setnum    empty in pass one (see below)
                   msgnum    empty in pass one (see below)
                   string    the extracted, modified text string, surrounded
                             by double quotes.
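
       The record layout above can be illustrated with a small C parser.
       This is a sketch only: struct msg_entry and parse_entry are
       hypothetical names, the file name is assumed to contain no ':'
       characters, and an empty setnum or msgnum field is represented
       as 0.

```c
#include <stdlib.h>
#include <string.h>

/* One line of the message list file; field names follow the man page. */
struct msg_entry {
    char file[256];
    long line, position, length;
    long setnum, msgnum;     /* 0 means the field was left empty */
    const char *string;      /* points into the input, quotes included */
};

/* Parse file:line:position:length:setnum:msgnum:"string".
 * Returns 0 on success, -1 on a format error. */
int parse_entry(const char *s, struct msg_entry *e)
{
    long *nums[5];
    const char *colon = strchr(s, ':');
    size_t n;
    int i;

    if (colon == NULL)
        return -1;
    n = (size_t)(colon - s);
    if (n == 0 || n >= sizeof e->file)
        return -1;
    memcpy(e->file, s, n);
    e->file[n] = '\0';
    s = colon + 1;

    nums[0] = &e->line;
    nums[1] = &e->position;
    nums[2] = &e->length;
    nums[3] = &e->setnum;
    nums[4] = &e->msgnum;
    for (i = 0; i < 5; i++) {
        char *end;

        *nums[i] = strtol(s, &end, 10);  /* an empty field yields 0 */
        if (*end != ':')
            return -1;
        s = end + 1;
    }
    if (*s != '"')
        return -1;
    e->string = s;
    return 0;
}
```

       Because only the first six ':' characters are treated as
       separators, a ':' inside the quoted message text does not confuse
       this sketch.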

       Normally you would redirect this output into a file (the "message
       list file", shown as strings on the command line above).  Then you
       would edit this file as described below.  Then you would use catexstr
       -r to generate a new version of the source file, and a message (.msg)
       file.

       Any '%' characters in the source file that are not part of a "%%"
       pair will be translated into "%nn$" sequences in the message list
       file, where the "nn" numbers enumerate the uses of '%' in the
       message.  For example, the message
           "File %s has %d blocks."
       would become
           "File %1$s has %2$d blocks."
       This allows the human translator to modify the order of the '%'
       tokens in the message to accommodate the syntax requirements of the
       target natural language, while still accommodating the order of the
       parameters to the printf call.  If the message has only one
       occurrence of '%', then this modification is not really necessary,
       but it is done anyway.
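
       The renumbering rule can be sketched in C as follows.  The function
       name number_conversions is hypothetical, and this is an
       illustration of the rule as described here rather than the code
       catexstr actually uses.

```c
#include <stdio.h>
#include <string.h>

/* Copy src to dst, rewriting the n-th lone '%' as "%n$" and leaving
 * "%%" pairs untouched.
 * Returns 0 on success, -1 if dst is too small. */
int number_conversions(const char *src, char *dst, size_t size)
{
    size_t used = 0;
    int count = 0;

    if (size == 0)
        return -1;
    dst[0] = '\0';
    while (*src != '\0') {
        char piece[16];

        if (src[0] == '%' && src[1] == '%') {
            strcpy(piece, "%%");          /* literal percent sign */
            src += 2;
        } else if (src[0] == '%') {
            snprintf(piece, sizeof piece, "%%%d$", ++count);
            src += 1;
        } else {
            piece[0] = *src++;            /* ordinary character */
            piece[1] = '\0';
        }
        if (used + strlen(piece) >= size)
            return -1;
        strcpy(dst + used, piece);
        used += strlen(piece);
    }
    return 0;
}
```

       Applied to the message in the example above, this sketch produces
       "File %1$s has %2$d blocks."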

       Next, examine this list and determine which messages can be
       translated and subsequently retrieved by catgets.  Modify this
       message list file by deleting lines that can't be translated.  In
       particular, text associated with '#include "filename"' lines must be
       deleted, and '#define foo "bar"' lines must be scrutinized.

       If you wish to specify the set number(s) and message number(s) to use
       (see gencat(1)), you may do so by inserting these numbers into the
       fifth (setnum) and sixth (msgnum) fields in the message list file.
       If you do not specify the set number to use for a particular message,
       set number one is used, unless some other set has been specified for
       an earlier message, in which case that set number is used.  If you do
       not specify any message numbers, the messages are numbered
       sequentially, starting with number one.  If any message is explicitly
       numbered, that number is used for that message, and automatic
       numbering resumes from that number.
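
       One reading of these defaulting rules is sketched below in C.  The
       names struct numbering and assign_numbers are hypothetical.  Two
       points are assumptions rather than documented behavior: that
       "resumes from that number" means the next automatic number is one
       greater than the explicit one, and that message numbering
       continues across a change of set rather than restarting.

```c
#include <stddef.h>

/* Set and message number of one message list entry; 0 = left empty. */
struct numbering {
    int setnum;
    int msgnum;
};

/* Fill in every empty setnum and msgnum field, in file order. */
void assign_numbers(struct numbering *e, size_t count)
{
    int current_set = 1;   /* set one is used until another is given */
    int next_msg = 1;      /* automatic numbering starts at one */
    size_t i;

    for (i = 0; i < count; i++) {
        if (e[i].setnum != 0)
            current_set = e[i].setnum;   /* applies to later entries too */
        else
            e[i].setnum = current_set;

        if (e[i].msgnum != 0)
            next_msg = e[i].msgnum;      /* explicit number wins */
        else
            e[i].msgnum = next_msg;
        next_msg++;                      /* assumed: resume at number + 1 */
    }
}
```

       Under these assumptions, five entries of which only the third is
       given explicitly as set 2, message 10 come out as 1/1, 1/2, 2/10,
       2/11, and 2/12.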

       You are free to modify the text of the message in the message list
       file in any other way that you consider appropriate.  For example,
       you might use this occasion to clarify an ambiguous English sentence.
       Make sure that the text is enclosed in double quotes (").  Do not
       modify any of the first four fields on these lines, even if you
       change the length of the message.

       The message list file should not be translated into any other natural
       language.  The file to translate into other languages is the message
       file (.msg file) that will be produced by the second pass of
       catexstr.

       Note, however, that you must not make any modifications to the source
       file between running the first and second passes of catexstr, because
       the positions recorded in the message list file refer to the
       unmodified source.

       After editing the message list file, use this modified message list
       file as input to catexstr -r file.  You should provide the same set
       of options (except -r) to this second pass of catexstr that you gave
       to the first pass.  The second pass of catexstr will produce a new
       version of the original source file, in which the messages have been
       replaced by calls to the message retrieval function or command
       catgets.  At the same time, a message file that is of the correct
       format to be used as input to gencat is generated, with the name
       file.msg.

       If you are manipulating C source code, then once the new version of
       the .c file has been created, you must edit it to include a
       declaration for the catalog descriptor variable (normally catd) as
       type nl_catd.  This variable is used in the calls to catgets (see
       catgets(3C)).  Usually, you would declare one catd variable and use
       it throughout the program.  Also, you must add a call to catopen.
       Generally this is at the top of the main routine (see catopen(3C)).
       You may also wish to add a call to catclose.  The program must also
       call setlocale (see setlocale(3C)) if it does not do so already.
       This will probably entail inclusion of locale.h.

       The catexstr program cannot correctly replace strings in all
       instances.  For example, a static character string initialization
       cannot be replaced by a call to catgets.  A second example is an
       escape sequence which should not be translated.  In some cases the C
       code may require modification so that strings can be extracted and
       replaced by calls to the message retrieval function.

   Shell Scripts
       Shell scripts present a variety of challenges.  Here are a few
       pointers in dealing with them.

       Before running the first pass of catexstr, examine the shell script
       for back-quote (`) characters within double-quoted strings (strings
       enclosed in double-quote marks (")).  Such occurrences will not be
       handled correctly by catexstr, and must be modified either before or
       after running catexstr.

       Also look for strings that should be translated, that are not
       enclosed in double quotes.  This includes strings enclosed in single
       quotes (').

       Similarly, look for strings that must be passed as a single argument
       to a command, rather than being broken into separate arguments
       (words) by the shell.  Such cases can be handled by assigning the
       value of the string to a temporary shell variable, and then using the
       shell variable in the call to the command.  For example,
           log_error "This must be one argument, not seven."
       becomes
           msg="This must be one argument, not seven."
           log_error "$msg"
       which ends up looking something like:
           msg=`catgets mycat.cat 1 15 \
               "This must be one argument, not seven."`
           log_error "$msg"

       After running the first pass of catexstr, search the message list
       file for any occurrence of a back-quote character.  Any such
       occurrence, as mentioned above, must be changed.  This may be done by
       either modifying the original source and re-running the first pass of
       catexstr, or by modifying the new source file after running the
       second pass of catexstr.

       After running both passes of catexstr, edit the new source file and
       examine each call to catgets, to make sure that it makes sense.  One
       particular optimization that can frequently be made is, for example,
       to change
           echo `catgets mycat.cat 1 16 "Hello, world."`
       to
           catgets mycat.cat 1 16 "Hello, world."

EXAMPLES
       The following examples show uses of catexstr to convert a C program.

       Assume that the file hw.c contains:

              main()
              {
                  printf("This is an example\n");
                  printf("Hello world!\n");
                  printf("This is the %s string (number %d)\n", "third", 3);
              }

       catexstr hw.c > hw.strings produces the following output in the file
       hw.strings:

              hw.c:3:8:20:::"This is an example\n"
              hw.c:4:8:14:::"Hello world!\n"
              hw.c:5:8:35:::"This is the %1$s string (number %2$d)\n"
              hw.c:5:47:5:::"third"

       The file hw.strings can be edited as described above.

       The catexstr utility can now be invoked with the -r option to replace
       the strings in the source file by calls to the message retrieval
       function catgets().

       catexstr -r hw.c <hw.strings >hw.new.c produces the following output
       (the indentation has been modified to fit on this manual page):

       #include <nl_types.h>
       main()
       {
       printf(catgets(catd, 1, 1, "This is an example\n"));
       printf(catgets(catd, 1, 2, "Hello world!\n"));
       printf(catgets(catd, 1, 3, "This is the %1$s string (number %2$d)\n"), \
       catgets(catd, 1, 4, "third"), 3);
       }

       This new source file must be edited to include a declaration of catd
       (as type nl_catd), a call to catopen, and possibly calls to setlocale
       and catclose.  You may also wish to break the long line:

       #include <stdio.h>
       #include <nl_types.h>
       #include <locale.h>
       static nl_catd catd;

       main()
       {
           (void) setlocale (LC_ALL, "");
           catd = catopen ("hw.cat", 0);
           printf(catgets(catd, 1, 1, "This is an example\n"));
           printf(catgets(catd, 1, 2, "Hello world!\n"));
           printf(catgets(catd, 1, 3, "This is the %1$s string (number %2$d)\n"),
                  catgets(catd, 1, 4, "third"), 3);
           catclose (catd);
       }

       The catexstr -r command above also produces a message file, hw.msg:

              $quote "
              $set 1
              1   "This is an example\n"
              2   "Hello world!\n"
              3   "This is the %1$s string (number %2$d)\n"
              4   "third"

       This message file may be replicated and translated into other natural
       languages.
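
       The layout of the message file shown above can be reproduced with
       a few lines of C.  This is a sketch, not the catexstr
       implementation: struct message, append, and format_msg_file are
       hypothetical names, and message text is assumed to be held in
       source form, with escape sequences such as \n kept as two
       characters.

```c
#include <stdio.h>
#include <string.h>

struct message {
    int setnum;
    int msgnum;
    const char *text;   /* source form, without the enclosing quotes */
};

/* Append piece to the string in buf; -1 if it does not fit. */
static int append(char *buf, size_t size, const char *piece)
{
    size_t used = strlen(buf), len = strlen(piece);

    if (used + len >= size)
        return -1;
    memcpy(buf + used, piece, len + 1);
    return 0;
}

/* Format the messages as gencat input: a $quote line, a $set line
 * whenever the set number changes, then one numbered line per message.
 * Returns 0 on success, -1 if buf or a line buffer is too small. */
int format_msg_file(char *buf, size_t size,
                    const struct message *m, size_t count)
{
    char line[512];
    int current_set = 0;
    size_t i;

    if (size == 0)
        return -1;
    buf[0] = '\0';
    if (append(buf, size, "$quote \"\n") < 0)
        return -1;
    for (i = 0; i < count; i++) {
        if (m[i].setnum != current_set) {
            current_set = m[i].setnum;
            snprintf(line, sizeof line, "$set %d\n", current_set);
            if (append(buf, size, line) < 0)
                return -1;
        }
        if (snprintf(line, sizeof line, "%d   \"%s\"\n",
                     m[i].msgnum, m[i].text) >= (int)sizeof line)
            return -1;
        if (append(buf, size, line) < 0)
            return -1;
    }
    return 0;
}
```

       A translator then works on a copy of the resulting text, changing
       only the quoted strings and leaving the set and message numbers
       alone.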

       The following command is used to compile the message catalog:

              rm -f hw.cat; gencat hw.cat hw.msg

       The resulting message catalog (hw.cat) must be installed in the
       appropriate directory.  Normally, this would be a subdirectory of
       /usr/lib/nls/msg.

   Multiple Source Files
       Programs that consist of more than one source file should be handled
       as follows.  First, catexstr is called with all the source files as
       arguments:

              catexstr foo1.c foo2.c > foo.strings

       Second, the message list file (foo.strings) is edited as described
       above.

       Third, catexstr -r is called once for each source file, to create new
       source files and message (.msg) files:

              catexstr -r foo1.c < foo.strings > foo1.new.c
              catexstr -r foo2.c < foo.strings > foo2.new.c

       Fourth, gencat is called to compile the message catalog:

              rm -f foo.cat
              gencat foo.cat foo1.msg foo2.msg

FILES
       /usr/lib/nls/msg/locale/catalog.cat
                                   files created by gencat(1)

ENVIRONMENT VARIABLES
       NLSPATH   specification of directory containing the locale-specific
                 message catalog directories.

       LANG      locale name.

DIAGNOSTICS
       The error messages produced by catexstr are intended to be
       self-explanatory.  They indicate errors in the command line or format
       errors encountered within the input file.

SEE ALSO
       catgets(1), gencat(1), catopen(3C), catclose(3C), catgets(3C),
       printf(3S), setlocale(3C), environ(5).
       exstr(1) -- AT&T-style message facility.