BPF(4)                                                     BPF(4)


NAME
       bpf - Berkeley Packet Filter

SYNOPSIS
       pseudo-device bpfilter 16

DESCRIPTION
       The  Berkeley  Packet  Filter  provides a raw interface to
       data link layers in a protocol independent  fashion.   All
       packets  on  the  network,  even  those destined for other
       hosts, are accessible through this mechanism.

       The packet filter appears as a character  special  device,
       /dev/bpf0,  /dev/bpf1, etc.  After opening the device, the
       file descriptor must be bound to a specific network inter-
       face  with  the BIOCSETIF ioctl.  A given interface can be
       shared by multiple listeners, and  the  filter  underlying
       each  descriptor will see an identical packet stream.  The
       total number of open files is limited to the  value  given
       in the kernel configuration; the example given in the SYN-
       OPSIS above sets the limit to 16.

       A separate device file is required for each minor  device.
       If  a file is in use, the open will fail and errno will be
       set to EBUSY.
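
       The probing loop implied above can be sketched as follows.
       This is a minimal illustration, not code from any  particu-
       lar  program:  the function name open_free_bpf and the pre-
       fix and max_units parameters are assumptions made  for  the
       example.  O_RDWR is used so the descriptor can also be used
       for writes.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/*
 * Try prefix0, prefix1, ... until one opens.  EBUSY means that minor
 * device is already in use, so the next one is tried; any other error
 * ends the search.  Returns a descriptor, or -1 with errno set.
 * The name and both parameters are hypothetical.
 */
int
open_free_bpf(const char *prefix, int max_units)
{
	char path[64];
	int unit, fd;

	for (unit = 0; unit < max_units; unit++) {
		snprintf(path, sizeof(path), "%s%d", prefix, unit);
		fd = open(path, O_RDWR);
		if (fd >= 0)
			return fd;
		if (errno != EBUSY)
			return -1;
	}
	errno = EBUSY;		/* every unit tried was busy */
	return -1;
}
```

       A caller would typically use open_free_bpf("/dev/bpf", 16),
       where 16 matches the count in the kernel configuration.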

       Associated with each open instance of  a  bpf  file  is  a
       user-settable   packet   filter.   Whenever  a  packet  is
       received by an interface, all file  descriptors  listening
       on  that  interface  apply  their filter.  Each descriptor
       that accepts the packet receives its own copy.

       Reads from these files return the next  group  of  packets
       that have matched the filter.  To improve performance, the
       buffer passed to read must be the same size as the buffers
       used  internally  by  bpf.   This  size is returned by the
       BIOCGBLEN ioctl (see below), and under  BSD,  can  be  set
       with  BIOCSBLEN.   Note  that  an individual packet larger
       than this size is necessarily truncated.

       The packet filter will support  any  link  level  protocol
       that  has fixed length headers.  Currently, only Ethernet,
       SLIP and PPP drivers have been modified to  interact  with
       bpf.

       Since  packet  data is in network byte order, applications
       should use the byteorder(3n) macros to extract  multi-byte
       values.
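
       As an alternative to the byteorder macros, fields  can  be
       assembled one byte at a time, which gives the same result
       and  also  avoids  unaligned  loads.   The  sketch   below
       takes  that  route; the helper names pkt_get16 and
       pkt_get32 are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit network-order value at byte offset off. */
uint16_t
pkt_get16(const unsigned char *p, size_t off)
{
	return (uint16_t)((p[off] << 8) | p[off + 1]);
}

/* Read a 32-bit network-order value at byte offset off. */
uint32_t
pkt_get32(const unsigned char *p, size_t off)
{
	return ((uint32_t)p[off] << 24) | ((uint32_t)p[off + 1] << 16) |
	    ((uint32_t)p[off + 2] << 8) | (uint32_t)p[off + 3];
}
```

       For an Ethernet frame, pkt_get16(frame, 12) yields the type
       field in host order.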

       A  packet  can  be sent out on the network by writing to a
       bpf file descriptor.  The writes are  unbuffered,  meaning
       only  one  packet  can be processed per write.  Currently,
       only writes to Ethernets and SLIP links are supported.




                           23 May 1991                          1

IOCTLS
       The ioctl command codes below are defined in  <net/bpf.h>.
       All commands require these includes:

            #include <sys/types.h>
            #include <sys/time.h>
            #include <sys/ioctl.h>
            #include <net/bpf.h>

       Additionally,  BIOCGETIF and BIOCSETIF require <net/if.h>.

       In addition to FIONREAD  and  SIOCGIFADDR,  the  following
       commands may be applied to any open bpf file.  The (third)
       argument to the ioctl should be  a  pointer  to  the  type
       indicated.

       BIOCGBLEN (u_int)
                 Returns  the required buffer length for reads on
                 bpf files.

       BIOCSBLEN (u_int)
                 Sets the buffer length for reads on  bpf  files.
                 The  buffer  must  be  set  before  the  file is
                 attached to an interface with BIOCSETIF.  If the
                 requested buffer size  cannot  be  accommodated,
                 the  closest  allowable  size  will  be  set and
                 returned in the argument.  A read call will fail
                 with  EIO  if it is passed a buffer that is not
                 this size.

       BIOCGDLT (u_int)
                 Returns the type of the data link  layer  under-
                 lying   the   attached   interface.   EINVAL  is
                 returned  if  no  interface  has been specified.
                 The device types, prefixed  with  ``DLT_'',  are
                 defined in <net/bpf.h>.

       BIOCPROMISC
                 Forces the interface into promiscuous mode.  All
                 packets, not just those destined for  the  local
                 host,  are  processed.  Since more than one file
                 can be listening on a given  interface,  a  lis-
                 tener    that    opened   its   interface   non-
                 promiscuously may receive packets promiscuously.
                 This problem can be remedied with an appropriate
                 filter.

                 The interface remains in promiscuous mode  until
                 all files listening promiscuously are closed.

       BIOCFLUSH Flushes  the  buffer  of  incoming  packets, and
                 resets  the  statistics  that  are  returned  by
                 BIOCGSTATS.

       BIOCGETIF (struct ifreq)
                 Returns  the name of the hardware interface that
                 the file is listening on.  The name is  returned
                 in  the ifr_name field of the ifreq.  All other
                 fields are undefined.

       BIOCSETIF (struct ifreq)
                 Sets the hardware interface associated with  the
                 file.  This command must be performed before any
                 packets can be read.  The device is indicated by
                 name  using  the  ifr_name  field of the ifreq.
                 Additionally, performs the actions of BIOCFLUSH.

       BIOCSRTIMEOUT, BIOCGRTIMEOUT (struct timeval)
                 Set  or  get  the  read  timeout parameter.  The
                 timeval specifies the length  of  time  to  wait
                 before  timing  out  on  a  read  request.  This
                 parameter is initialized  to  zero  by  open(2),
                 indicating no timeout.

       BIOCGSTATS (struct bpf_stat)
                 Returns   the   following  structure  of  packet
                 statistics:

                 struct bpf_stat {
                      u_int bs_recv;
                      u_int bs_drop;
                 };

                 The fields are:

                 bs_recv        the number of packets received by
                                the  descriptor  since  opened or
                                reset  (including  any   buffered
                                since the last read call); and

                 bs_drop        the  number of packets which were
                                accepted  by   the   filter   but
                                dropped  by the kernel because of
                                buffer   overflows   (i.e.,   the
                                application's reads  aren't keep-
                                ing up with the packet  traffic).

       BIOCIMMEDIATE (u_int)
                 Enable  or  disable ``immediate mode'', based on
                 the truth value of the argument.  When immediate
                 mode  is  enabled, reads return immediately upon
                 packet reception.  Otherwise, a read will  block
                 until either the kernel buffer becomes full or a
                 timeout occurs.  This  is  useful  for  programs
                 like  rarpd(8c),  which must respond to messages
                 in real time.  The default for  a  new  file  is
                 off.

       BIOCSETF (struct bpf_program)
                 Sets  the  filter  program used by the kernel to
                 discard  uninteresting  packets.   An  array  of
                 instructions  and  its length is passed in using
                 the following structure:

                 struct bpf_program {
                      int bf_len;
                      struct bpf_insn *bf_insns;
                 };

                 The filter program is pointed to by the bf_insns
                 field  while  its  length  in  units  of `struct
                 bpf_insn' is given by the bf_len  field.   Also,
                 the actions of BIOCFLUSH are performed.

                 See section FILTER MACHINE for an explanation of
                 the filter language.

       BIOCVERSION (struct bpf_version)
                 Returns the major and minor version  numbers  of
                 the  filter language currently recognized by the
                 kernel.  Before installing  a  filter,  applica-
                 tions  must  check  that  the current version is
                 compatible with  the  running  kernel.   Version
                 numbers  are  compatible  if  the  major numbers
                 match and the application minor is less than  or
                 equal  to  the kernel minor.  The kernel version
                 number is returned in the following structure:

                 struct bpf_version {
                      u_short bv_major;
                      u_short bv_minor;
                 };

                 The  current  version  numbers  are   given   by
                 BPF_MAJOR_VERSION  and  BPF_MINOR_VERSION  from
                 <net/bpf.h>.  An incompatible filter may  result
                 in  undefined  behavior  (most  likely, an error
                 returned by ioctl() or haphazard  packet  match-
                 ing).
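
                 The compatibility rule can be written as a small
                 predicate.  The sketch below uses a local stand-
                 in structure (demo_version) rather than the real
                 one from <net/bpf.h>, and the function  name  is
                 an assumption made for illustration.

```c
/* Local stand-in mirroring the two fields of the kernel structure. */
struct demo_version {
	unsigned short bv_major;
	unsigned short bv_minor;
};

/*
 * Nonzero if a filter written for version `app` may be installed on
 * a kernel reporting version `kern`: the majors must match and the
 * application minor must not exceed the kernel minor.
 */
int
version_compatible(const struct demo_version *app,
    const struct demo_version *kern)
{
	return app->bv_major == kern->bv_major &&
	    app->bv_minor <= kern->bv_minor;
}
```

                 In a real program `kern' would  be  filled in by
                 the BIOCVERSION ioctl and `app' from the version
                 constants in <net/bpf.h>.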

BPF HEADER
       The  following  structure  is  prepended  to  each  packet
       returned by read(2):

               struct bpf_hdr {
                    struct timeval bh_tstamp;
                    u_long bh_caplen;
                    u_long bh_datalen;
                    u_short bh_hdrlen;
               };

       The fields, whose values are stored in host order, are:

       bh_tstamp     The  time at which the packet was processed
                     by the packet filter.

       bh_caplen     The length of the captured portion  of  the
                     packet.  This is the minimum of the trunca-
                     tion amount specified by the filter and the
                     length of the packet.

       bh_datalen    The  length  of  the  packet  off the wire.
                     This value is independent of the truncation
                     amount specified by the filter.

       bh_hdrlen     The length of the BPF header, which may not
                     be equal to sizeof(struct bpf_hdr).

       The bh_hdrlen field exists to account for padding  between
       the  header and the link level protocol.  The purpose here
       is to guarantee proper alignment of the packet data struc-
       tures,  which is required on alignment sensitive architec-
       tures and improves performance on  many  other  architec-
       tures.  The packet filter ensures that the bpf_hdr and the
       network layer header will be word aligned.  Suitable  pre-
       cautions  must be taken when accessing the link layer pro-
       tocol fields  on  alignment  restricted  machines.   (This
       isn't  a problem on an Ethernet, since the type field is a
       short falling on an even offset,  and  the  addresses  are
       probably accessed in a bytewise fashion.)

       Additionally,  individual  packets are padded so that each
       starts on a word boundary.  This requires that an applica-
       tion  has  some  knowledge  of  how  to get from packet to
       packet.  The macro BPF_WORDALIGN is defined in <net/bpf.h>
       to  facilitate this process.  It rounds up its argument to
       the  nearest  word  aligned  value  (where   a   word   is
       BPF_ALIGNMENT bytes wide).

       For  example, if `p' points to the start of a packet, this
       expression will advance it to the next packet:

              p = (char *)p + BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen)

       For the alignment mechanisms to work properly, the  buffer
       passed  to read(2) must itself be word aligned.  malloc(3)
       will always return an aligned buffer.
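
       The packet-to-packet walk can be sketched as a loop over a
       buffer returned by read(2).  To stay self-contained,  this
       example  uses  a  simplified  stand-in  header  (demo_hdr,
       without the timestamp) and its own  DEMO_WORDALIGN  macro;
       both  are  assumptions  for  illustration,  not  the defi-
       nitions from <net/bpf.h>.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_ALIGNMENT	sizeof(long)
#define DEMO_WORDALIGN(x) \
	(((x) + (DEMO_ALIGNMENT - 1)) & ~(DEMO_ALIGNMENT - 1))

/* Simplified stand-in for the per-packet header. */
struct demo_hdr {
	unsigned long	bh_caplen;
	unsigned long	bh_datalen;
	unsigned short	bh_hdrlen;
};

/* Count the packet records in the first nread bytes of buf. */
size_t
count_packets(const unsigned char *buf, size_t nread)
{
	const unsigned char *p = buf, *end = buf + nread;
	struct demo_hdr h;
	size_t n = 0;

	while ((size_t)(end - p) >= sizeof(h)) {
		memcpy(&h, p, sizeof(h));
		size_t used = (size_t)h.bh_hdrlen + h.bh_caplen;
		if (h.bh_hdrlen < sizeof(h) || used > (size_t)(end - p))
			break;		/* malformed or truncated record */
		/* packet data: p + h.bh_hdrlen, h.bh_caplen bytes long */
		n++;
		size_t step = DEMO_WORDALIGN(used);
		if (step >= (size_t)(end - p))
			break;		/* that was the last record */
		p += step;
	}
	return n;
}
```

       The bounds checks are a defensive addition; the  one-line
       expression above assumes a well-formed buffer.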

FILTER MACHINE
       A filter program is an array  of  instructions,  with  all
       branches   forwardly  directed,  terminated  by  a  return
       instruction.  Each instruction performs some action on the
       pseudo-machine  state,  which  consists of an accumulator,
       index register, scratch memory store, and implicit program
       counter.

       The following structure defines the instruction format:

              struct bpf_insn {
                   u_short  code;
                   u_char   jt;
                   u_char   jf;
                   long k;
              };

       The  k  field  is  used  in  different  ways  by different
       instructions, and the jt and jf fields are used  as  off-
       sets  by the branch instructions.  The opcodes are encoded
       in a semi-hierarchical fashion.  There are  eight  classes
       of instructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX, BPF_ALU,
       BPF_JMP, BPF_RET, and BPF_MISC.  Various  other  mode  and
       operator  bits  are or'd into the class to give the actual
       instructions.   The  classes  and  modes  are  defined  in
       <net/bpf.h>.

       Below  are the semantics for each defined BPF instruction.
       We use the convention that A is the accumulator, X is  the
       index  register,  P[]  packet data, and M[] scratch memory
       store.  P[i:n] gives the data at byte offset ``i'' in  the
       packet,  interpreted  as  a  word (n=4), unsigned halfword
       (n=2), or unsigned byte (n=1).  M[i] gives the  i'th  word
       in  the  scratch  memory store, which is only addressed in
       word units.   The  memory  store  is  indexed  from  0  to
       BPF_MEMWORDS-1.   k,  jt,  and  jf  are  the corresponding
       fields in the instruction definition.  ``len''  refers  to
       the length of the packet.
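
       The structural rules for a filter program (every branch is
       forward and lands inside the program, and the last instruc-
       tion  is  a  return) can be checked before installation.  A
       minimal sketch follows.  The structure demo_insn  and  the
       numeric  class  and  opcode  values  are local stand-ins,
       assumed to match the conventional <net/bpf.h> encoding.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; the real definitions live in <net/bpf.h>. */
struct demo_insn {
	uint16_t code;
	uint8_t  jt;
	uint8_t  jf;
	int32_t  k;
};
#define DEMO_CLASS(code)	((code) & 0x07)
#define DEMO_OP(code)		((code) & 0xf0)
#define DEMO_JMP		0x05	/* assumed value of BPF_JMP */
#define DEMO_RET		0x06	/* assumed value of BPF_RET */
#define DEMO_JA			0x00	/* assumed value of BPF_JA */

/* 1 if every branch stays inside the program and the last insn returns. */
int
demo_check(const struct demo_insn *prog, size_t len)
{
	size_t i;

	if (len == 0 || DEMO_CLASS(prog[len - 1].code) != DEMO_RET)
		return 0;
	for (i = 0; i < len; i++) {
		size_t room = len - i - 1;	/* instructions after this one */

		if (DEMO_CLASS(prog[i].code) != DEMO_JMP)
			continue;
		if (DEMO_OP(prog[i].code) == DEMO_JA) {
			if (prog[i].k < 0 || (size_t)prog[i].k >= room)
				return 0;
		} else if (prog[i].jt >= room || prog[i].jf >= room) {
			return 0;
		}
	}
	return 1;
}
```

       Offsets are relative to the instruction after the  branch,
       so an offset must be smaller than the number of instructions
       that follow it.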


       BPF_LD    These instructions copy a value into the accumu-
                 lator.  The type of the source operand is speci-
                 fied by an ``addressing mode'' and can be a con-
                 stant (BPF_IMM), packet data at a  fixed  offset
                 (BPF_ABS),  packet  data  at  a  variable offset
                 (BPF_IND), the packet  length  (BPF_LEN),  or  a
                 word in the scratch memory store (BPF_MEM).  For
                 BPF_IND and BPF_ABS, the data size must be spec-
                 ified  as  a  word (BPF_W), halfword (BPF_H), or
                 byte (BPF_B).  The semantics of all  the  recog-
                 nized BPF_LD instructions follow.


                 BPF_LD+BPF_W+BPF_ABS          A <- P[k:4]

                 BPF_LD+BPF_H+BPF_ABS          A <- P[k:2]

                 BPF_LD+BPF_B+BPF_ABS          A <- P[k:1]

                 BPF_LD+BPF_W+BPF_IND          A <- P[X+k:4]

                 BPF_LD+BPF_H+BPF_IND          A <- P[X+k:2]

                 BPF_LD+BPF_B+BPF_IND          A <- P[X+k:1]

                 BPF_LD+BPF_W+BPF_LEN          A <- len

                 BPF_LD+BPF_IMM                A <- k

                 BPF_LD+BPF_MEM                A <- M[k]


       BPF_LDX   These  instructions  load a value into the index
                 register.  Note that the  addressing  modes  are
                 more  restricted  than  those of the accumulator
                 loads, but they  include  BPF_MSH,  a  hack  for
                 efficiently loading the IP header length.

                 BPF_LDX+BPF_W+BPF_IMM         X <- k

                 BPF_LDX+BPF_W+BPF_MEM         X <- M[k]

                 BPF_LDX+BPF_W+BPF_LEN         X <- len

                 BPF_LDX+BPF_B+BPF_MSH         X <- 4*(P[k:1]&0xf)

       BPF_ST    This instruction stores the accumulator into the
                 scratch  memory.   We  do not need an addressing
                 mode since there is only one possibility for the
                 destination.

                 BPF_ST                        M[k] <- A


       BPF_STX   This  instruction  stores  the index register in
                 the scratch memory store.

                 BPF_STX                       M[k] <- X


       BPF_ALU   The alu instructions perform operations  between
                 the  accumulator and index register or constant,
                 and store the result back  in  the  accumulator.
                 For binary operations, a source mode is required
                 (BPF_K or BPF_X).

                 BPF_ALU+BPF_ADD+BPF_K         A <- A + k

                 BPF_ALU+BPF_SUB+BPF_K         A <- A - k

                 BPF_ALU+BPF_MUL+BPF_K         A <- A * k

                 BPF_ALU+BPF_DIV+BPF_K         A <- A / k

                 BPF_ALU+BPF_AND+BPF_K         A <- A & k

                 BPF_ALU+BPF_OR+BPF_K          A <- A | k

                 BPF_ALU+BPF_LSH+BPF_K         A <- A << k

                 BPF_ALU+BPF_RSH+BPF_K         A <- A >> k

                 BPF_ALU+BPF_ADD+BPF_X         A <- A + X

                 BPF_ALU+BPF_SUB+BPF_X         A <- A - X

                 BPF_ALU+BPF_MUL+BPF_X         A <- A * X

                 BPF_ALU+BPF_DIV+BPF_X         A <- A / X

                 BPF_ALU+BPF_AND+BPF_X         A <- A & X

                 BPF_ALU+BPF_OR+BPF_X          A <- A | X

                 BPF_ALU+BPF_LSH+BPF_X         A <- A << X

                 BPF_ALU+BPF_RSH+BPF_X         A <- A >> X

                 BPF_ALU+BPF_NEG               A <- -A


       BPF_JMP   The jump instructions  alter  flow  of  control.
                 Conditional   jumps   compare   the  accumulator
                 against a constant (BPF_K) or the index register
                 (BPF_X).   If  the result is true (or non-zero),
                 the true branch is taken,  otherwise  the  false
                 branch  is taken.  Jump offsets are encoded in 8
                 bits so the longest jump  is  256  instructions.
                 However,  the  jump  always (BPF_JA) opcode uses
                 the 32 bit k field as the offset, allowing arbi-
                 trarily  distant destinations.  All conditionals
                 use unsigned comparison conventions.

                 BPF_JMP+BPF_JA                pc += k

                 BPF_JMP+BPF_JGT+BPF_K         pc += (A > k) ? jt : jf

                 BPF_JMP+BPF_JGE+BPF_K         pc += (A >= k) ? jt : jf

                 BPF_JMP+BPF_JEQ+BPF_K         pc += (A == k) ? jt : jf

                 BPF_JMP+BPF_JSET+BPF_K        pc += (A & k) ? jt : jf

                 BPF_JMP+BPF_JGT+BPF_X         pc += (A > X) ? jt : jf

                 BPF_JMP+BPF_JGE+BPF_X         pc += (A >= X) ? jt : jf

                 BPF_JMP+BPF_JEQ+BPF_X         pc += (A == X) ? jt : jf

                 BPF_JMP+BPF_JSET+BPF_X        pc += (A & X) ? jt : jf

       BPF_RET   The return  instructions  terminate  the  filter
                 program  and  specify  the  amount  of packet to
                 accept  (i.e.,  they   return   the   truncation
                 amount).   A return value of zero indicates that
                 the packet should be ignored.  The return  value
                 is  either a constant (BPF_K) or the accumulator
                 (BPF_A).

                 BPF_RET+BPF_A                 accept A bytes

                 BPF_RET+BPF_K                 accept k bytes

       BPF_MISC  The miscellaneous category was created for  any-
                 thing  that  doesn't fit into the above classes,
                 and for any new instructions that might need  to
                 be  added.   Currently,  these  are the register
                 transfer instructions that copy the index regis-
                 ter to the accumulator or vice versa.

                 BPF_MISC+BPF_TAX              X <- A

                 BPF_MISC+BPF_TXA              A <- X
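
       To make the semantics above concrete, here is a  small  user-
       level  interpreter  for  a subset of the machine: absolute
       and indexed loads, the IP header length  load,  the  jumps,
       and  the  returns.   It  is  a  sketch  for study, not the
       kernel's implementation.  The structure vm_insn,  the  D_*
       constants  (assumed  to match the conventional <net/bpf.h>
       encoding) and the function names are all local  stand-ins.
       It  assumes  the  program  is  well formed (branches stay
       inside the program and it ends in a return).

```c
#include <stddef.h>
#include <stdint.h>

struct vm_insn { uint16_t code; uint8_t jt, jf; uint32_t k; };

enum {
	D_LD = 0x00, D_LDX = 0x01, D_JMP = 0x05, D_RET = 0x06, /* classes */
	D_W = 0x00, D_H = 0x08, D_B = 0x10,                    /* sizes */
	D_ABS = 0x20, D_IND = 0x40, D_MSH = 0xa0,              /* modes */
	D_JA = 0x00, D_JEQ = 0x10, D_JGT = 0x20, D_JGE = 0x30, D_JSET = 0x40,
	D_K = 0x00, D_X = 0x08, D_A = 0x10                     /* sources */
};

/* Fetch n (1, 2 or 4) bytes at off in network order; 0 on overrun. */
static int
fetch(const unsigned char *p, size_t len, uint32_t off, int n, uint32_t *out)
{
	uint32_t v = 0;
	int i;

	if (off > len || (size_t)n > len - off)
		return 0;
	for (i = 0; i < n; i++)
		v = (v << 8) | p[off + i];
	*out = v;
	return 1;
}

/* Run prog over pkt; returns the number of bytes to accept (0 = reject). */
uint32_t
demo_filter(const struct vm_insn *prog, const unsigned char *pkt, size_t len)
{
	uint32_t A = 0, X = 0;
	size_t pc = 0;

	for (;;) {
		const struct vm_insn *in = &prog[pc++];
		uint32_t mode = in->code & 0xe0;
		uint32_t rhs = (in->code & D_X) ? X : in->k;
		int size = (in->code & 0x18) == D_H ? 2 :
		    (in->code & 0x18) == D_B ? 1 : 4;

		switch (in->code & 0x07) {
		case D_LD:
			if (mode != D_ABS && mode != D_IND)
				return 0;	/* mode not modelled here */
			if (!fetch(pkt, len, in->k + (mode == D_IND ? X : 0),
			    size, &A))
				return 0;	/* load past end: reject */
			break;
		case D_LDX:
			if (mode != D_MSH || !fetch(pkt, len, in->k, 1, &X))
				return 0;
			X = 4 * (X & 0xf);
			break;
		case D_JMP:
			switch (in->code & 0xf0) {
			case D_JA:   pc += in->k; break;
			case D_JEQ:  pc += (A == rhs) ? in->jt : in->jf; break;
			case D_JGT:  pc += (A > rhs) ? in->jt : in->jf; break;
			case D_JGE:  pc += (A >= rhs) ? in->jt : in->jf; break;
			case D_JSET: pc += (A & rhs) ? in->jt : in->jf; break;
			default:     return 0;
			}
			break;
		case D_RET:
			return (in->code & 0x18) == D_A ? A : in->k;
		default:
			return 0;	/* class not modelled in this sketch */
		}
	}
}
```

       The accumulator and operands are  held  as  unsigned  32-bit
       values so that the comparisons follow the unsigned conven-
       tion stated for the jump instructions.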

       The BPF interface provides the following macros to facili-
       tate array initializers:
              BPF_STMT(opcode, operand)
              and
              BPF_JUMP(opcode, operand, true_offset, false_offset)
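
       One plausible way such initializer macros can  be  written
       is  shown  below.   The  definitions  are  a guess at the
       shape, not a copy of <net/bpf.h>; the names DEMO_STMT  and
       DEMO_JUMP,  the  structure  demo_insn  and  the  numeric
       opcodes are stand-ins for illustration.

```c
#include <stdint.h>

struct demo_insn { uint16_t code; uint8_t jt, jf; uint32_t k; };

/* A statement has no branch offsets; a jump carries both. */
#define DEMO_STMT(code, k)		{ (uint16_t)(code), 0, 0, (k) }
#define DEMO_JUMP(code, k, jt, jf)	{ (uint16_t)(code), (jt), (jf), (k) }

/*
 * Accept packets whose first byte is 0x45, reject all others.
 * 0x30, 0x15 and 0x06 are the assumed encodings of a byte load at
 * an absolute offset, a jump-if-equal on a constant, and a return
 * of a constant.
 */
static const struct demo_insn demo_prog[] = {
	DEMO_STMT(0x30, 0),
	DEMO_JUMP(0x15, 0x45, 0, 1),
	DEMO_STMT(0x06, 0xffffffffu),
	DEMO_STMT(0x06, 0),
};
```

       Because each macro expands to a brace-enclosed initializer,
       a filter can be laid out as a static array, one instruction
       per line.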


EXAMPLES
       The following filter is taken from the Reverse ARP Daemon.
       It accepts only Reverse ARP requests.

              struct bpf_insn insns[] = {
                   /* Load the Ethernet type field (offset 12). */
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
                   /* Load the ARP opcode (offset 14 + 6). */
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
                   /* Accept: keep the Ethernet and ARP headers. */
                   BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
                         sizeof(struct ether_header)),
                   /* Reject. */
                   BPF_STMT(BPF_RET+BPF_K, 0),
              };

       This  filter  accepts  only  IP  packets   between   hosts
       128.3.112.15 and 128.3.112.35.

              struct bpf_insn insns[] = {
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
                   /* IP source address: a 32-bit word at offset 26. */
                   BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
                   /* IP destination address: a word at offset 30. */
                   BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
                   /* Other direction: source .35, destination .15. */
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
                   BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
                   BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
                   BPF_STMT(BPF_RET+BPF_K, 0),
              };
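
       The hexadecimal constants in the comparisons are  the  two
       host  addresses  packed  into 32 bits, most significant
       octet first.  A small helper makes the mapping  explicit;
       the name demo_ipv4 is hypothetical.

```c
#include <stdint.h>

/* Pack the dotted quad a.b.c.d into a 32-bit value, a in the top byte. */
uint32_t
demo_ipv4(unsigned a, unsigned b, unsigned c, unsigned d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	    ((uint32_t)c << 8) | (uint32_t)d;
}
```

       For example, demo_ipv4(128, 3, 112, 15) gives 0x8003700f and
       demo_ipv4(128, 3, 112, 35) gives 0x80037023.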

       Finally,  this filter returns only TCP finger packets.  We
       must parse the IP header to reach  the  TCP  header.   The
       BPF_JSET instruction checks that the IP fragment offset is
       0 so we are sure that we have a TCP header.

              struct bpf_insn insns[] = {
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
                   /* IP protocol field (offset 14 + 9). */
                   BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
                   /* Flags and fragment offset; reject if offset != 0. */
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
                   BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
                   /* X <- IP header length, then test both TCP ports. */
                   BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
                   BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
                   BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
                   BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
                   BPF_STMT(BPF_RET+BPF_K, 0),
              };

SEE ALSO
       tcpdump(1)

       McCanne, S., Jacobson V., `An efficient,  extensible,  and
       portable network monitor'

FILES
       /dev/bpf0, /dev/bpf1, ...

BUGS
       The  read  buffer must be of a fixed size (returned by the
       BIOCGBLEN ioctl).

       A file that does not request promiscuous mode may  receive
       promiscuously received packets as a side effect of another
       file requesting this mode on the same hardware  interface.
       This could be fixed in the kernel with additional process-
       ing overhead.  However, we favor the model where all files
       must  assume  that the interface is promiscuous, and if so
       desired, must utilize a filter to reject foreign  packets.

       Data  link  protocols with variable length headers are not
       currently supported.

       Under SunOS, if a BPF application  reads  more  than  2^31
       bytes of data, read will fail with EINVAL.  You can either
       fix the bug in SunOS, or lseek to 0 when  read  fails  for
       this reason.

HISTORY
       The Enet packet filter was created in 1980 by Mike Accetta
       and Rick Rashid at  Carnegie-Mellon  University.   Jeffrey
       Mogul,  at  Stanford, ported the code to BSD and continued
       its development from 1983 on.  Since then, it has  evolved
       into the Ultrix Packet Filter at DEC, a STREAMS NIT module
       under SunOS 4.1, and BPF.

AUTHORS
       Steven McCanne, of Lawrence  Berkeley  Laboratory,  imple-
       mented  BPF  in Summer 1990.  Much of the design is due to
       Van Jacobson.