⇒ awk(1) — BSD/386 1.0


AWK(1)                   Utility Commands                  AWK(1)


NAME
       awk - GNU awk pattern scanning and processing language

SYNOPSIS
       awk [ -W gawk-options ] [ -Ffs ] [ -v var=val ] -f pro-
       gram-file [ -- ] file ...
       awk [ -W gawk-options ] [ -Ffs ] [ -v var=val ] [ -- ]
       program-text file ...

DESCRIPTION
       Gawk  is  the GNU Project's implementation of the AWK pro-
       gramming language.  It conforms to the definition  of  the
       language  in  the POSIX 1003.2 Command Language And Utili-
       ties Standard (draft 11).  This version in turn  is  based
       on  the  description  in  The AWK Programming Language, by
       Aho, Kernighan, and Weinberger, with the  additional  fea-
       tures  defined  in  the System V Release 4 version of UNIX
       awk.  Gawk also provides some GNU-specific extensions.

       The command line consists of options to gawk  itself,  the
       AWK  program text (if not supplied via the -f option), and
       values to be made available in  the  ARGC  and  ARGV  pre-
       defined AWK variables.

OPTIONS
       Gawk accepts the following options, which should be avail-
       able on any implementation of the AWK language.

       -Ffs   Use fs for the input field separator (the value  of
              the FS predefined variable).

       -v var=val
              Assign the value val to the variable var before
              execution of the program begins.  Such variable
              values are available to the BEGIN block of an AWK
              program.

       -f program-file
              Read the AWK program source from the file  program-
              file,  instead of from the first command line argu-
              ment.  Multiple -f options may be used.

       --     Signal the end of options. This is useful to  allow
              further  arguments  to  the  AWK  program itself to
              start with a ``-''.  This is mainly for consistency
              with  the  argument parsing convention used by most
              other POSIX programs.
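
       As a minimal sketch of the -F and -v options, assuming a
       POSIX-style awk is installed as awk: the sample record and
       the variable name label are made up for illustration.

```shell
# Hypothetical sample data: one colon-separated record on standard input.
out=$(printf 'root:x:0\n' | awk -F: -v label=uid '
    BEGIN { printf "%s=", label }   # label is already set when BEGIN runs
    { print $3 }                    # the record was split on ":" because of -F:
')
printf '%s\n' "$out"
```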

       Following the POSIX standard,  gawk-specific  options  are
       supplied  via  arguments  to  the  -W option.  Multiple -W
       options may be supplied, or multiple arguments may be sup-
       plied  together  if  they  are  separated  by  commas,  or
       enclosed in quotes and separated by white space.  Case  is
       ignored in arguments to the -W option.



Free Software Foundation    Jun 5 1991                          1

       The -W option accepts the following arguments:

       compat    Run  in  compatibility  mode.   In compatibility
                 mode, gawk behaves identically to UNIX awk; none
                 of the GNU-specific extensions are recognized.

       copyleft
       copyright Print  the  short  version  of the GNU copyright
                 information message on the error output.

       lint      Provide warnings about constructs that are dubi-
                 ous  or  non-portable  to  other AWK implementa-
                 tions.

       posix     This turns on compatibility mode, with the  fol-
                 lowing additional restrictions:

                 o \x escape sequences are not recognized.

                 o The  synonym  func for the keyword function is
                   not recognized.

                 o The operators ** and **=  cannot  be  used  in
                   place of ^ and ^=.

       version   Print  version  information  for this particular
                 copy of gawk on the error output.  This is  use-
                 ful  mainly  for  knowing if the current copy of
                 gawk on your system is up to date  with  respect
                 to whatever the Free Software Foundation is dis-
                 tributing.

       Any other options are flagged as illegal, but  are  other-
       wise ignored.

AWK PROGRAM EXECUTION
       An  AWK  program  consists of a sequence of pattern-action
       statements and optional function definitions.

              pattern   { action statements }
              function name(parameter list) { statements }

       Gawk first reads the  program  source  from  the  program-
       file(s)  if  specified, or from the first non-option argu-
       ment on the command line.  The -f option may be used  mul-
       tiple  times on the command line.  Gawk will read the pro-
       gram text as if all the program-files  had  been  concate-
       nated  together.  This is useful for building libraries of
       AWK functions, without having to include them in each  new
       AWK  program that uses them.  To use a library function in
       a file from a program typed in on the command line,  spec-
       ify  /dev/tty  as one of the program-files, type your pro-
       gram, and end it with a ^D (control-d).
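
       The library technique can be sketched as follows.  The file
       names lib.awk and main.awk and the function max are
       hypothetical, and mktemp(1) is assumed to be available.
       Because both names contain a ``/'', no AWKPATH search is
       involved.

```shell
dir=$(mktemp -d)
# lib.awk and main.awk are hypothetical file names used only for this sketch.
cat > "$dir/lib.awk" <<'EOF'
function max(a, b) { return (a > b) ? a : b }
EOF
cat > "$dir/main.awk" <<'EOF'
{ print max($1, $2) }
EOF
# The two files are read as if they had been concatenated.
out=$(printf '3 7\n' | awk -f "$dir/lib.awk" -f "$dir/main.awk")
printf '%s\n' "$out"
rm -r "$dir"
```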




       The environment variable AWKPATH specifies a  search  path
       to use when finding source files named with the -f option.
       If this variable does  not  exist,  the  default  path  is
       ".:/usr/lib/awk:/usr/local/lib/awk".  If a file name given
       to the -f option  contains  a  ``/''  character,  no  path
       search is performed.

       Gawk executes AWK programs in the following order.  First,
       gawk compiles the program into an  internal  form.   Next,
       all  variable  assignments specified via the -v option are
       performed.  Then, gawk executes  the  code  in  the  BEGIN
       block(s)  (if  any),  and  then proceeds to read each file
       named in the ARGV array.  If there are no files  named  on
       the command line, gawk reads the standard input.

       If  a filename on the command line has the form var=val it
       is treated as a variable assignment. The variable var will
       be  assigned the value val.  (This happens after any BEGIN
       block(s) have been run.)  Command line variable assignment
       is  most  useful  for  dynamically assigning values to the
       variables AWK uses to control how  input  is  broken  into
       fields  and  records.  It  is  also useful for controlling
       state if multiple passes are needed  over  a  single  data
       file.
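
       A sketch of the multiple-pass use mentioned above, with a
       hypothetical variable named pass and made-up sample values.
       The same data file is named twice, and a var=val argument
       before each occurrence tells the program which pass it is
       in.

```shell
data=$(mktemp)                 # temporary file holding hypothetical sample values
printf '1\n3\n' > "$data"
out=$(awk '
    pass == 1 { total += $1 }                               # first pass: sum
    pass == 2 { printf "%d %d%%\n", $1, 100 * $1 / total }  # second pass: share
' pass=1 "$data" pass=2 "$data")
printf '%s\n' "$out"
rm "$data"
```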

       If  the  value  of  a  particular element of ARGV is empty
       (""), gawk skips over it.

       For each line in the  input,  gawk  tests  to  see  if  it
       matches  any pattern in the AWK program.  For each pattern
       that the line matches, the associated action is  executed.
       The  patterns  are  tested  in the order they occur in the
       program.

       Finally, after all the input is exhausted,  gawk  executes
       the code in the END block(s) (if any).
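
       The execution order described above can be seen in a small
       sketch; the input lines are made up for illustration.

```shell
out=$(printf 'apple\nbanana\navocado\n' | awk '
    BEGIN  { print "scanning" }       # runs before any input is read
    /^a/   { n++ }                    # tested against every input line
    /an/   { print "has an:", $0 }    # patterns are tried in program order
    END    { print n, "lines start with a" }
')
printf '%s\n' "$out"
```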

VARIABLES AND FIELDS
       AWK variables are dynamic; they come into existence when
       they are first used.  Their values are either floating-point
       numbers or strings, or both, depending upon how they are
       used.  AWK also has one-dimensional arrays; multiply
       dimensioned arrays may be simulated.  Several pre-defined
       variables are set as a program runs; these will be
       described as needed and summarized below.

   Fields
       As  each  input  line  is  read, gawk splits the line into
       fields, using the value of the FS variable  as  the  field
       separator.   If FS is a single character, fields are sepa-
       rated by that character.  Otherwise, FS is expected to  be
       a full regular expression.  In the special case that FS is
       a single blank, fields are separated  by  runs  of  blanks
       and/or  tabs.   Note  that  the  value  of IGNORECASE (see
       below) will also affect how fields are split when FS is  a
       regular expression.

       If  the  FIELDWIDTHS  variable is set to a space separated
       list of numbers, each field  is  expected  to  have  fixed
       width,  and gawk will split up the record using the speci-
       fied widths.  The value of FS is ignored.  Assigning a new
       value to FS overrides the use of FIELDWIDTHS, and restores
       the default behaviour.

       Each field in the input line  may  be  referenced  by  its
       position,  $1,  $2,  and so on.  $0 is the whole line. The
       value of a field may be assigned to as well.  Fields  need
       not be referenced by constants:

              n = 5
              print $n

       prints the fifth field in the input line.  The variable NF
       is set to the total number of fields in the input line.

       References to non-existent fields (i.e. fields after  $NF)
       produce  the  null-string.  However,  assigning  to a non-
       existent field (e.g., $(NF+2) = 5) will increase the value
       of  NF, create any intervening fields with the null string
       as their value, and cause the value of  $0  to  be  recom-
       puted,  with  the  fields  being separated by the value of
       OFS.
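
       A short sketch of field references and of assignment past
       $NF, using a made-up input line and OFS set to ``-'' so the
       rebuilt record is easy to see.

```shell
out=$(printf 'a b c\n' | awk -v OFS=- '{
    n = 2
    print $n                 # field selected by a variable
    $(NF + 2) = "e"          # creates an empty $4, sets $5, rebuilds $0 with OFS
    print NF
    print $0
}')
printf '%s\n' "$out"
```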

   Built-in Variables
       AWK's built-in variables are:


       ARGC        The number of command line arguments (does not
                   include   options  to  gawk,  or  the  program
                   source).

       ARGV        Array of command line arguments. The array  is
                   indexed  from  0  to  ARGC  -  1.  Dynamically
                   changing the contents of ARGV can control  the
                   files used for data.

       CONVFMT     The  conversion format for numbers, "%.6g", by
                   default.

       ENVIRON     An array containing the values of the  current
                   environment.   The  array  is  indexed  by the
                   environment variables, each element being  the
                   value  of that variable (e.g., ENVIRON["HOME"]
                   might be /u/arnold).  Changing this array does
                   not  affect  the  environment seen by programs
                   which gawk spawns via redirection or the  sys-
                   tem()  function.  (This may change in a future
                   version of gawk.)

       FIELDWIDTHS A white-space separated list  of  fieldwidths.
                   When set, gawk parses the input into fields of
                   fixed width, instead of using the value of the
                   FS variable as the field separator.  The fixed
                   field width facility  is  still  experimental;
                   expect the semantics to change as gawk evolves
                   over time.

       FILENAME    The name of the current  input  file.   If  no
                   files  are  specified on the command line, the
                   value of FILENAME is ``-''.

       FNR         The input record number in the  current  input
                   file.

       FS          The input field separator, a blank by default.

       IGNORECASE  Controls the case-sensitivity of  all  regular
                   expression  operations.  If  IGNORECASE  has a
                   non-zero  value,  then  pattern  matching   in
                   rules,   field   splitting  with  FS,  regular
                   expression matching with ~  and  !~,  and  the
                   gsub(),  index(),  match(), split(), and sub()
                   pre-defined functions  will  all  ignore  case
                   when   doing  regular  expression  operations.
                   Thus, if IGNORECASE is not equal to zero, /aB/
                   matches  all  of the strings "ab", "aB", "Ab",
                   and "AB".  As with all AWK variables, the ini-
                   tial value of IGNORECASE is zero, so all regu-
                   lar expression operations are  normally  case-
                   sensitive.

       NF          The  number  of  fields  in  the current input
                   record.

       NR          The total number of input records seen so far.

       OFMT        The  output  format  for  numbers,  "%.6g", by
                   default.

       OFS         The  output  field  separator,  a   blank   by
                   default.

       ORS         The output record separator, by default a new-
                   line.

       RS          The input record separator, by default a  new-
                   line.   RS  is  exceptional  in  that only the
                   first character of its string  value  is  used
                   for  separating  records.  (This will probably
                   change in a future release of gawk.)  If RS is
                   set to the null string, then records are sepa-
                   rated by blank lines.  When RS is set  to  the
                   null string, then the newline character always
                   acts as a  field  separator,  in  addition  to
                   whatever value FS may have.

       RSTART      The  index  of  the first character matched by
                   match(); 0 if no match.

       RLENGTH     The length of the string matched  by  match();
                   -1 if no match.

       SUBSEP      The  character  used to separate multiple sub-
                   scripts in array elements, by default  "\034".
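
       As an illustration of RS, NR, and NF working together, the
       following sketch sets RS to the null string so that blank
       lines separate records.  The sample input is made up.

```shell
out=$(printf 'alice 30\nparis\n\nbob 25\n' | awk '
    BEGIN { RS = "" }                 # blank lines now separate records
    { print NR ": " NF " fields, first=" $1 ", last=" $NF }
')
printf '%s\n' "$out"
```

       The first record spans two lines, and the newline inside it
       acts as a field separator, so it has three fields.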

   Arrays
       Arrays  are  subscripted with an expression between square
       brackets ([ and ]).  If the expression  is  an  expression
       list  (expr,  expr  ...)   then  the  array subscript is a
       string consisting of the  concatenation  of  the  (string)
       value  of  each  expression, separated by the value of the
       SUBSEP variable.  This facility is used to simulate multi-
       ply dimensioned arrays. For example:

              i = "A" ; j = "B" ; k = "C"
              x[i, j, k] = "hello, world\n"

       assigns  the string "hello, world\n" to the element of the
       array x which is indexed by the string "A\034B\034C".  All
       arrays in AWK are associative, i.e. indexed by string val-
       ues.

       The special operator in may be used  in  an  if  or  while
       statement  to see if an array has an index consisting of a
       particular value.

              if (val in array)
                   print array[val]

       If the array has multiple subscripts, use (i, j) in array.

       The in construct may also be used in a for loop to iterate
       over all the elements of an array.

       An element may be deleted from an array using  the  delete
       statement.
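
       The array facilities above can be combined in one small
       sketch.  The array name x and its contents are arbitrary.
       Splitting the index on SUBSEP recovers the individual
       subscripts.

```shell
out=$(awk 'BEGIN {
    x["A", "B"] = "hello"
    for (k in x) {                       # k is the combined string index
        n = split(k, part, SUBSEP)
        print n, part[1], part[2]
    }
    if (("A", "B") in x) print "present"
    delete x["A", "B"]
    if (!(("A", "B") in x)) print "deleted"
}')
printf '%s\n' "$out"
```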

   Variable Typing And Conversion
       Variables and fields may be (floating point) numbers, or
       strings, or both.  How the value of a variable is
       interpreted depends upon its context.  If used in a numeric
       expression, it will be treated as a number; if used as a
       string, it will be treated as a string.

       To  force  a  variable to be treated as a number, add 0 to
       it; to force it to be treated as a string, concatenate  it
       with the null string.

       When  a  string must be converted to a number, the conver-
       sion is accomplished using atof(3).  A number is converted
       to  a  string  by  using  the value of CONVFMT as a format
       string for sprintf(3), with the numeric value of the vari-
       able as the argument.  However, even though all numbers in
       AWK are floating-point, integral values  are  always  con-
       verted as integers.  Thus, given

              CONVFMT = "%2.2f"
              a = 12
              b = a ""

       the variable b has a value of "12" and not "12.00".
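
       The example above can be run as written.  The sketch below
       adds a non-integral value (the variables h and c are
       additions for illustration) to show the case where CONVFMT
       does apply.

```shell
out=$(awk 'BEGIN {
    CONVFMT = "%2.2f"
    a = 12
    b = a ""            # integral value: converted as an integer
    h = 25
    c = (h / 2) ""      # non-integral value: formatted with CONVFMT
    print b
    print c
}')
printf '%s\n' "$out"
```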

       Gawk performs comparisons as follows: If two variables are
       numeric, they are compared numerically.  If one  value  is
       numeric  and  the  other  has  a  string  value  that is a
       ``numeric string,'' then comparisons are also done numeri-
       cally.   Otherwise,  the  numeric  value is converted to a
       string and a string comparison is performed.  Two  strings
       are  compared,  of  course,  as strings.  According to the
       POSIX standard (draft 11), even if two strings are numeric
       strings, a numeric comparison is performed.  However, this
       is clearly incorrect, and gawk does not do this.
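
       A sketch of the two unambiguous cases, with a made-up input
       value: a field holding a numeric string compared with a
       number, and the same field forced to a string and compared
       with a string constant.

```shell
out=$(printf '10\n' | awk '{
    r1 = ($1 < 9) ? "true" : "false"           # numeric string vs. number
    r2 = (($1 "") < "9") ? "true" : "false"    # both strings: character order
    print "numeric 10 < 9:", r1
    print "string \"10\" < \"9\":", r2
}')
printf '%s\n' "$out"
```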

       Uninitialized variables have the numeric value 0  and  the
       string value "" (the null, or empty, string).

PATTERNS AND ACTIONS
       AWK  is a line oriented language. The pattern comes first,
       and then the action. Action statements are enclosed  in  {
       and  }.   Either the pattern may be missing, or the action
       may be missing, but, of course, not both. If  the  pattern
       is  missing,  the action will be executed for every single
       line of input.  A missing action is equivalent to

              { print }

       which prints the entire line.

       Comments begin with the ``#'' character, and continue
       until the end of the line.  Blank lines may be used to
       separate statements.  Normally, a statement ends with a
       newline; however, this is not the case for lines ending in
       a ``,'', ``{'', ``?'', ``:'', ``&&'', or ``||''.  Lines
       ending in do or else also have their statements
       automatically continued on the following line.  In other
       cases, a line can be continued by ending it with a ``\'',
       in which case the newline will be ignored.

       Multiple statements may be put on one line  by  separating
       them  with  a  ``;''.  This applies to both the statements
       within the action part of a pattern-action pair (the usual
       case), and to the pattern-action statements themselves.

   Patterns
       AWK patterns may be one of the following:

              BEGIN
              END
              /regular expression/
              relational expression
              pattern && pattern
              pattern || pattern
              pattern ? pattern : pattern
              (pattern)
              ! pattern
              pattern1, pattern2

       BEGIN  and END are two special kinds of patterns which are
       not tested against the input.  The  action  parts  of  all
       BEGIN  patterns  are  merged  as if all the statements had
       been written in a single BEGIN block.  They  are  executed
       before  any  of  the input is read. Similarly, all the END
       blocks are merged, and executed  when  all  the  input  is
       exhausted  (or when an exit statement is executed).  BEGIN
       and END patterns cannot be combined with other patterns in
       pattern  expressions.   BEGIN and END patterns cannot have
       missing action parts.

       For /regular expression/ patterns, the  associated  state-
       ment is executed for each input line that matches the reg-
       ular expression.  Regular  expressions  are  the  same  as
       those in egrep(1), and are summarized below.

       A  relational  expression  may  use  any  of the operators
       defined below in the section on actions.  These  generally
       test  whether certain fields match certain regular expres-
       sions.

       The &&, ||, and !  operators are logical AND, logical  OR,
       and  logical  NOT,  respectively, as in C.  They do short-
       circuit evaluation, also as in C, and are used for combin-
       ing  more  primitive  pattern expressions. As in most lan-
       guages, parentheses may be used to  change  the  order  of
       evaluation.

       The  ?:  operator  is  like the same operator in C. If the
       first pattern is true then the pattern used for testing is
       the second pattern, otherwise it is the third. Only one of
       the second and third patterns is evaluated.

       The pattern1, pattern2 form of an expression is  called  a
       range pattern.  It matches all input records starting with
       a line that  matches  pattern1,  and  continuing  until  a
       record  that matches pattern2, inclusive. It does not com-
       bine with any other sort of pattern expression.
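
       A brief sketch of a range pattern next to a compound
       pattern.  The marker lines start and stop are hypothetical.

```shell
out=$(printf 'intro\nstart\nbody\nstop\noutro\n' | awk '
    /^start$/, /^stop$/   { print "in range:", $0 }   # inclusive at both ends
    NR == 1 || /^outro$/  { print "edge:", $0 }       # logical OR of two patterns
')
printf '%s\n' "$out"
```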





   Regular Expressions
       Regular expressions are the extended kind found in  egrep.
       They are composed of characters as follows:

       c         matches the non-metacharacter c.

       \c        matches the literal character c.

       .         matches any character except newline.

       ^         matches the beginning of a line or a string.

       $         matches the end of a line or a string.

       [abc...]  character  class,  matches any of the characters
                 abc....

       [^abc...] negated character class, matches  any  character
                 except abc...  and newline.

       r1|r2     alternation: matches either r1 or r2.

       r1r2      concatenation: matches r1, and then r2.

       r+        matches one or more r's.

       r*        matches zero or more r's.

       r?        matches zero or one r's.

       (r)       grouping: matches r.

       The  escape  sequences  that are valid in string constants
       (see below) are also legal in regular expressions.

   Actions
       Action statements are enclosed in braces, { and }.  Action
       statements  consist  of the usual assignment, conditional,
       and looping statements found in most languages. The opera-
       tors,  control  statements,  and  input/output  statements
       available are patterned after those in C.

   Operators
       The operators in AWK, in order of  increasing  precedence,
       are


       = += -=
       *= /= %= ^= Assignment.  Both  absolute  assignment (var =
                   value)  and  operator-assignment  (the   other
                   forms) are supported.

       ?:          The  C  conditional  expression.  This has the
                   form expr1 ? expr2 : expr3. If expr1 is  true,
                   the  value  of the expression is expr2, other-
                   wise it is expr3.  Only one of expr2 and expr3
                   is evaluated.

       ||          Logical OR.

       &&          Logical AND.

       ~ !~        Regular   expression   match,  negated  match.
                   NOTE: Do not use a constant regular expression
                   (/foo/)  on  the  left-hand side of a ~ or !~.
                   Only use one  on  the  right-hand  side.   The
                   expression /foo/ ~ exp has the same meaning as
                   (($0 ~ /foo/) ~ exp).   This  is  usually  not
                   what was intended.

       < >
       <= >=
       != ==       The regular relational operators.

       blank       String concatenation.

       + -         Addition and subtraction.

       * / %       Multiplication, division, and modulus.

       + - !       Unary plus, unary minus, and logical negation.

       ^           Exponentiation (** may also be used,  and  **=
                   for the assignment operator).

       ++ --       Increment and decrement, both prefix and post-
                   fix.

       $           Field reference.

   Control Statements
       The control statements are as follows:

              if (condition) statement [ else statement ]
              while (condition) statement
              do statement while (condition)
              for (expr1; expr2; expr3) statement
              for (var in array) statement
              break
              continue
              delete array[index]
              exit [ expression ]
              { statements }

   I/O Statements
       The input/output statements are as follows:

       close(filename)       Close file (or pipe, see below).

       getline               Set $0 from next input  record;  set
                             NF, NR, FNR.

       getline <file         Set $0 from next record of file; set
                             NF.

       getline var           Set var from next input record;  set
                             NR, FNR.

       getline var <file     Set var from next record of file.

       next                  Stop  processing  the  current input
                             record. The  next  input  record  is
                             read and processing starts over with
                             the first pattern in  the  AWK  pro-
                             gram.  If  the end of the input data
                             is reached,  the  END  block(s),  if
                             any, are executed.

       print                 Prints the current record.

       print expr-list       Prints expressions.

       print expr-list >file Prints expressions on file.

       printf fmt, expr-list Format and print.

       printf fmt, expr-list >file
                             Format and print on file.

       system(cmd-line)      Execute  the  command  cmd-line, and
                             return the exit status.   (This  may
                             not  be  available on non-POSIX sys-
                             tems.)

       Other input/output  redirections  are  also  allowed.  For
       print and printf, >>file appends output to the file, while
       | command writes on a pipe.  In a similar fashion, command
       |  getline  pipes  into getline.  Getline will return 0 on
       end of file, and -1 on an error.
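
       A sketch of reading from a command with getline and testing
       its return value.  The command string is made up; any
       command that writes lines would do.

```shell
out=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    while ((cmd | getline line) > 0)    # positive per record, 0 at end of file
        n++
    close(cmd)                          # close the pipe when finished
    print n, "records, last:", line
}')
printf '%s\n' "$out"
```

       The parentheses around the getline expression keep the
       comparison with 0 from being mistaken for a redirection.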

   The printf Statement
       The AWK versions of the  printf  statement  and  sprintf()
       function (see below) accept the following conversion spec-
       ification formats:

       %c     An ASCII character.  If the argument used for %c is
              numeric,  it is treated as a character and printed.
              Otherwise, the argument is assumed to be a  string,
              and  only  the  first  character  of that string is
              printed.

       %d     A decimal number (the integer part).

       %i     Just like %d.

       %e     A   floating   point    number    of    the    form
              [-]d.ddddddE[+-]dd.

       %f     A  floating point number of the form [-]ddd.dddddd.

       %g     Use e or f conversion, whichever is  shorter,  with
              nonsignificant zeros suppressed.

       %o     An unsigned octal number (again, an integer).

       %s     A character string.

       %x     An unsigned hexadecimal number (an integer).

       %X     Like %x, but using ABCDEF instead of abcdef.

       %%     A single % character; no argument is converted.

       There  are  optional,  additional  parameters that may lie
       between the % and the control letter:

       -      The expression should be left-justified within  its
              field.

       width  The  field  should  be padded to this width. If the
              number has a leading zero, then the field  will  be
              padded  with  zeros.   Otherwise  it is padded with
              blanks.

       .prec  A number indicating the maximum width of strings or
              digits to the right of the decimal point.

       The  dynamic  width  and  prec  capabilities of the ANSI C
       printf() routines are supported.  A * in place  of  either
       the  width  or prec specifications will cause their values
       to be taken from the argument list to printf or sprintf().
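
       A few of the conversions and modifiers above, shown in one
       printf call with made-up arguments.  The ``|'' characters
       are only there to make the field boundaries visible.

```shell
out=$(awk 'BEGIN {
    # left-justified string, zero-padded integer, truncated string,
    # hexadecimal, first character of a string, character from a number
    printf "%-5s|%05d|%.3s|%x|%c|%c\n", "ab", 42, "abcdef", 255, "hello", 65
}')
printf '%s\n' "$out"
```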

   Special File Names
       When  doing  I/O  redirection  from either print or printf
       into a file, or via getline from a file,  gawk  recognizes
       certain  special  filenames  internally.   These filenames
       allow access  to  open  file  descriptors  inherited  from
       gawk's  parent process (usually the shell).  The filenames
       are:

       /dev/stdin
                 The standard input.

       /dev/stdout
                 The standard output.

       /dev/stderr
                 The standard error output.

       /dev/fd/n The file denoted by the open file descriptor  n.

       These  are  particularly  useful  for  error messages. For
       example:

              print "You blew it!" > "/dev/stderr"

       whereas you would otherwise have to use

              print "You blew it!" | "cat 1>&2"

       These file names may also be used on the command  line  to
       name data files.
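
       A sketch that separates the two output streams, assuming
       the awk in use either recognizes /dev/stderr itself or runs
       on a system that provides it.  The message text is made up.

```shell
prog='BEGIN { print "warning: sample message" > "/dev/stderr"; print "result" }'
out=$(awk "$prog" 2>/dev/null)         # standard output only
err=$(awk "$prog" 2>&1 >/dev/null)     # error output only
printf 'stdout: %s\nstderr: %s\n' "$out" "$err"
```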

   Numeric Functions
       AWK has the following pre-defined arithmetic functions:


       atan2(y, x) returns the arctangent of y/x in radians.

       cos(expr)   returns the cosine of expr, which is in radians.

       exp(expr)   the exponential function.

       int(expr)   truncates to integer.

       log(expr)   the natural logarithm function.

       rand()      returns a random number between 0 and 1.

       sin(expr)   returns the sine of expr, which is in radians.

       sqrt(expr)  the square root function.

       srand(expr) use  expr  as a new seed for the random number
                   generator. If no expr is provided, the time of
                   day  will  be  used.   The return value is the
                   previous seed for the random number generator.
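
       A sketch exercising several of these functions.  The seed
       value 42 is arbitrary; reseeding with the same value is
       expected to reproduce the same random number.

```shell
out=$(awk 'BEGIN {
    print int(3.9), int(-3.9), sqrt(16), exp(0), log(1)
    printf "%.4f\n", atan2(0, -1)           # pi
    srand(42); a = rand()
    srand(42); b = rand()                   # same seed, same sequence
    same = (a == b) ? "yes" : "no"
    print "repeatable:", same
}')
printf '%s\n' "$out"
```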

   String Functions
       AWK has the following pre-defined string functions:


       gsub(r, s, t)           for  each  substring  matching the
                               regular expression r in the string
                               t,  substitute  the  string s, and
                               return  the  number  of  substitu-
                               tions.   If t is not supplied, use
                               $0.

       index(s, t)             returns the index of the string  t
                               in  the string s, or 0 if t is not
                               present.

       length(s)               returns the length of  the  string
                               s, or the length of $0 if s is not
                               supplied.

       match(s, r)             returns the position  in  s  where
                               the  regular  expression r occurs,
                               or 0 if r is not present, and sets
                               the  values of RSTART and RLENGTH.

       split(s, a, r)          splits the string s into the array
                               a on the regular expression r, and
                               returns the number of fields. If r
                               is omitted, FS is used instead.

       sprintf(fmt, expr-list) formats expr-list according to fmt,
                               and returns the resulting  string.

       sub(r, s, t)            just  like  gsub(),  but  only the
                               first   matching   substring    is
                               replaced.

       substr(s, i, n)         returns  the n-character substring
                               of s starting at i.  If n is omit-
                               ted, the rest of s is used.

       tolower(str)            returns  a copy of the string str,
                               with all the upper-case characters
                               in  str translated to their corre-
                               sponding lower-case  counterparts.
                               Non-alphabetic characters are left
                               unchanged.

       toupper(str)            returns a copy of the string  str,
                               with all the lower-case characters
                               in str translated to their  corre-
                               sponding  upper-case counterparts.
                               Non-alphabetic characters are left
                               unchanged.
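
       The sketch below exercises several of the string functions
       listed above on one sample string.  The sample text and the
       names s, t, n, words and string_demo are assumptions made
       for illustration.

```shell
string_demo=$(awk 'BEGIN {
    s = "the quick brown fox"

    # index() and substr() count characters starting from 1.
    print index(s, "quick")
    print substr(s, 5, 5)

    # match() sets RSTART and RLENGTH as a side effect.
    if (match(s, /b[a-z]+/))
        print RSTART, RLENGTH

    # split() fills the array and returns the number of fields.
    n = split(s, words, " ")
    print n, words[n]

    # sub() replaces only the first match, gsub() every match.
    # Both modify the target and return the substitution count.
    t = s
    n = sub(/o/, "0", t)
    print n, t
    t = s
    n = gsub(/o/, "0", t)
    print n, t

    print toupper(substr(s, 1, 3)), length(s)
}')
echo "$string_demo"
```

       The substitution counts are stored in n before printing so
       that the output does not depend on the order in which the
       arguments to print are evaluated.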

   Time Functions
       Since  one of the primary uses of AWK programs is process-
       ing log files that contain time  stamp  information,  gawk
       provides  the  following  two functions for obtaining time
       stamps and formatting them.


       systime() returns the current time of day as the number of
                 seconds  since  the Epoch (Midnight UTC, January
                 1, 1970 on POSIX systems).





       strftime(format, timestamp)
                 formats timestamp according to the specification
                 in  format.  The timestamp should be of the same
                 form as returned by systime().  If timestamp  is
                 missing,  the  current time of day is used.  See
                 the specification for the strftime() function in
                 ANSI C for the format conversions that are guar-
                 anteed to be available.  A public-domain version
                 of strftime(3) and a man page for it are shipped
                 with gawk; if that version  was  used  to  build
                 gawk,  then  all of the conversions described in
                 that man page are available to gawk.
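
       A short sketch of both time functions.  It assumes that
       the installed awk provides these gawk extensions, and it
       sets TZ=UTC so that the formatted result does not depend on
       the local time zone.  The names clock_ok and epoch_date are
       illustrative.

```shell
# systime() returns seconds since the Epoch, so it is positive today.
clock_ok=$(awk 'BEGIN { print (systime() > 0) }')

# Format timestamp 0, the Epoch itself, using ANSI C conversions.
epoch_date=$(TZ=UTC awk 'BEGIN { print strftime("%Y-%m-%d %H:%M:%S", 0) }')

echo "$clock_ok"
echo "$epoch_date"
```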

   String Constants
       String  constants  in  AWK  are  sequences  of  characters
       enclosed  between  double quotes ("). Within strings, cer-
       tain escape sequences are recognized, as in C. These are:


       \\   A literal backslash.

       \a   The ``alert'' character; usually the ASCII BEL  char-
            acter.

       \b   backspace.

       \f   form-feed.

       \n   new line.

       \r   carriage return.

       \t   horizontal tab.

       \v   vertical tab.

       \xhex digits
            The  character represented by the string of hexadeci-
            mal digits following the \x.  As in ANSI C, all  fol-
            lowing  hexadecimal digits are considered part of the
            escape sequence.  (This feature should tell us  some-
            thing  about  language  design  by committee.)  E.g.,
            "\x1B" is the ASCII ESC (escape) character.

       \ddd The character represented by the 1-, 2-,  or  3-digit
            sequence  of  octal  digits. E.g. "\033" is the ASCII
            ESC (escape) character.

       \c   The literal character c.

       The escape sequences may also be used inside constant reg-
       ular expressions (e.g., /[ \t\f\n\r\v]/ matches whitespace
       characters).
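
       The following sketch shows octal and single-letter escapes
       in a string constant and in a constant regular expression.
       It avoids \x, which is specific to gawk.  The sample strings
       and the name escape_demo are illustrative.

```shell
escape_demo=$(awk 'BEGIN {
    # "\033" is a single character (ESC), so this string has length 4.
    print length("\033[0m")

    # The same escapes are recognized inside a regular expression.
    s = "a\tb c"
    n = gsub(/[ \t]/, "_", s)
    print n, s

    # Octal 101 and 102 are the ASCII codes for A and B.
    print "\101\102"
}')
echo "$escape_demo"
```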




FUNCTIONS
       Functions in AWK are defined as follows:

              function name(parameter list) { statements }

       Functions are executed when called from within the  action
       parts of regular pattern-action statements. Actual parame-
       ters supplied in the function call are used to instantiate
       the  formal  parameters  declared in the function.  Arrays
       are passed by reference, other  variables  are  passed  by
       value.

       Since  functions  were not originally part of the AWK lan-
       guage, the provision for local variables is rather clumsy:
       They  are  declared  as  extra parameters in the parameter
       list. The convention is to separate local  variables  from
       real parameters by extra spaces in the parameter list. For
       example:

              function  f(p, q,     a, b) { # a & b are local
                             ..... }

              /abc/     { ... ; f(1, 2) ; ... }

       The left parenthesis in a function  call  is  required  to
       immediately  follow  the function name, without any inter-
       vening white space.  This is to avoid a syntactic  ambigu-
       ity  with  the  concatenation  operator.  This restriction
       does not apply to the built-in functions listed above.

       Functions may call each other and may be recursive.  Func-
       tion parameters used as local variables are initialized to
       the null string and the number zero upon function  invoca-
       tion.

       The word func may be used in place of function.
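
       As an illustration of the rules above, this sketch defines
       a recursive function and a function that fills an array
       passed by reference while keeping its loop counter local.
       The names fact, fill, table and function_demo are
       hypothetical.

```shell
function_demo=$(awk '
# fact() calls itself; n is its only parameter.
function fact(n) {
    return (n <= 1) ? 1 : n * fact(n - 1)
}

# arr is passed by reference, so the caller sees the new elements.
# i is a local variable, declared as an extra parameter after the
# conventional run of spaces.
function fill(arr, count,    i) {
    for (i = 1; i <= count; i++)
        arr[i] = fact(i)
}

BEGIN {
    i = "untouched"    # global i, distinct from the local i in fill()
    fill(table, 5)
    print table[5], i
}')
echo "$function_demo"
```

       The global i keeps its value because the i inside fill()
       is a separate variable that exists only for the duration
       of each call.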

EXAMPLES
       Print and sort the login names of all users:

            BEGIN     { FS = ":" }
                 { print $1 | "sort" }

       Count lines in a file:

                 { nlines++ }
            END  { print nlines }

       Precede each line by its number in the file:

            { print FNR, $0 }

       Concatenate and line number (a variation on a theme):

            { print NR, $0 }

SEE ALSO
       egrep(1)

       The  AWK  Programming  Language,  Alfred  V. Aho, Brian W.
       Kernighan, Peter J. Weinberger, Addison-Wesley, 1988. ISBN
       0-201-07981-X.

       The  GAWK  Manual,  published by the Free Software Founda-
       tion, 1991.

POSIX COMPATIBILITY
       A primary goal for gawk is compatibility  with  the  POSIX
       standard,  as well as with the latest version of UNIX awk.
       To this end, gawk incorporates the following user  visible
       features  which are not described in the AWK book, but are
       part of awk in System V Release 4, and are  in  the  POSIX
       standard.

       The  -v option for assigning variables before program exe-
       cution starts is new.  The  book  indicates  that  command
       line  variable assignment happens when awk would otherwise
       open the argument as a file,  which  is  after  the  BEGIN
       block  is  executed.  However, in earlier implementations,
       when such an assignment appeared before  any  file  names,
       the  assignment  would  happen  before the BEGIN block was
       run.  Applications came to  depend  on  this  ``feature.''
       When  awk  was  changed  to  match its documentation, this
       option was added to accommodate applications that depended
       upon  the old behaviour.  (This feature was agreed upon by
       both the AT&T and GNU developers.)
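
       The difference in timing can be seen with a sketch like the
       following.  The variable name who and the use of /dev/null
       as an empty input file are assumptions made for
       illustration.

```shell
# -v assigns the variable before the BEGIN block runs.
with_v=$(awk -v who=early 'BEGIN { print "[" who "]" }')

# An assignment given as an operand is processed only when awk
# reaches it in the argument list, which is after BEGIN has run.
with_operand=$(awk 'BEGIN { print "[" who "]" }
                    END   { print "[" who "]" }' who=late /dev/null)

echo "$with_v"
echo "$with_operand"
```

       In the second command the BEGIN block prints an empty pair
       of brackets, and only the END block sees the assigned value.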

       The -W option for implementation specific features is from
       the POSIX standard.

       When  processing  arguments,  gawk uses the special option
       ``--'' to signal the end of  options,  and  warns  about,
       but otherwise ignores, undefined options.

       The  AWK book does not define the return value of srand().
       The System V Release 4 version of UNIX awk (and the  POSIX
       standard)  has  it  return the seed it was using, to allow
       keeping  track  of  random  number  sequences.   Therefore
       srand() in gawk also returns its current seed.

       Other  new  features  are:  The use of multiple -f options
       (from MKS awk); the ENVIRON array; the \a  and  \v  escape
       sequences  (done  originally  in  gawk  and  fed back into
       AT&T's); the tolower() and  toupper()  built-in  functions
       (from  AT&T);  and the ANSI C conversion specifications in
       printf (done first in AT&T's version).





GNU EXTENSIONS
       Gawk has some extensions to POSIX awk.  They are described
       in this section.  All the extensions described here can be
       disabled by invoking gawk with the -W compat option.

       The following features of gawk are not available in  POSIX
       awk.

              o The \x escape sequence.

              o The systime() and strftime() functions.

              o The special file names available for I/O redirec-
                tion.

              o The IGNORECASE variable and its side-effects.

              o The  FIELDWIDTHS  variable  and fixed width field
                splitting.

              o The path search for files named via the -f option,
                and with it the special meaning  of  the  AWKPATH
                environment variable.

       The AWK book does not  define  the  return  value  of  the
       close()  function.   Gawk's close() returns the value from
       fclose(3), or pclose(3), when  closing  a  file  or  pipe,
       respectively.

       When  gawk is invoked with the -W compat option, if the fs
       argument to the -F option is ``t'', then FS will be set to
       the  tab  character.   Since this is a rather ugly special
       case, it is not the default behavior.

BUGS
       The -F option is not  necessary  given  the  command  line
       variable assignment feature; it remains only for backwards
       compatibility.

VERSION INFORMATION
       This man page documents gawk, version 2.13.

       For the 2.13 version of gawk, the -c, -V, -C, -a,  and  -e
       options of the 2.11 version are recognized.  However, gawk
       will print a warning message, and these  options  will  go
       away in the 2.14 version.

       The  2.12  version  was a development version that was not
       officially released.

AUTHORS
       The original version of UNIX awk was designed  and  imple-
       mented   by   Alfred  Aho,  Peter  Weinberger,  and  Brian
       Kernighan of AT&T Bell Labs. Brian Kernighan continues  to
       maintain and enhance it.

       Paul  Rubin and Jay Fenlason, of the Free Software Founda-
       tion, wrote gawk, to be compatible with the original  ver-
       sion  of  awk  distributed  in Seventh Edition UNIX.  John
       Woods contributed a number of bug fixes.  David Trueman of
       Dalhousie  University, with contributions from Arnold Rob-
       bins at Emory University and AudioFAX, made gawk  compati-
       ble with the new version of UNIX awk.

ACKNOWLEDGEMENTS
       Brian  Kernighan of Bell Labs provided valuable assistance
       during testing and debugging.  We thank him.