Content Switch Literature Search Results

(current status)

Design, Implementation and Performance of a Content-Based Switch

George Apostolopoulos (University of Maryland),
David Aubespin (IBM T.J. Watson Research Center),
Vinod Peris (Growth Networks, Inc),
Prashant Pradhan (SUNY at Stony Brook),
Debanjan Saha (Bell Labs, Lucent Technologies)

Proc. Infocom2000, Tel Aviv, March 26 - 30, 2000,
http://www.ieee-infocom.org/2000/papers/440.ps

Abstract—In this paper, we share our experience in designing and building
a content based switch which we call L5. In addition to the layer 2-3-4
information available in the packet, a content based switch uses application
level information to route traffic in the network. Making routing decisions
based on information contained in the payload is not a new idea. In fact application
level proxies which are functionally equivalent to a content-based
switch, have been around for years.
Our contribution is in combining the functionalities of an application
level proxy with the data handling capabilities of a switch into a single system.
In this paper, we describe the architecture of the L5 system along
with the details of how application level information can be efficiently processed
in switch hardware. We cover two specific application examples that
we believe are ideal candidates for content-based switching: one is routing
HTTP sessions based on Uniform Resource Locators (URL) and the other
is session-aware dispatching of Secure Socket Layer (SSL) connections.

Fast and scalable layer four switching

V. Srinivasan, G. Varghese, S. Suri and M. Waldvogel

Proceedings of the ACM SIGCOMM '98, August 31 - September 4, 1998, Vancouver Canada Pages 191 - 202
http://www.acm.org/pubs/articles/proceedings/comm/285237/p191-srinivasan/p191-srinivasan.pdf

Abstract
In Layer Four switching, the route and resources allocated
to a packet are determined by the destination address as well
as other header fields of the packet such as source address,
TCP and UDP port numbers. Layer Four switching unifies
firewall processing, RSVP style resource reservation filters,
QoS Routing, and normal unicast and multicast forwarding
into a single framework. In this framework, the forwarding
database of a router consists of a potentially large number
of filters on key header fields. A given packet header can
match multiple filters, so each filter is given a cost, and the
packet is forwarded using the least cost matching filter.

In this paper, we describe two new algorithms for solving
the least cost matching filter problem at high speeds.
Our first algorithm is based on a grid-of-tries construction
and works optimally for processing filters consisting of two
prefix fields (such as destination-source filters) using linear
space. Our second algorithm, cross-producting, provides
fast lookup times for arbitrary filters but potentially requires
large storage. We describe a combination scheme that combines
the advantages of both schemes. The combination
scheme can be optimized to handle pure destination prefix
filters in 4 memory accesses, destination-source filters in 8
memory accesses worst case, and all other filters in 11 memory
accesses in the typical case.

Issues and Trends in router design

S. Keshav and R. Sharma
{skeshav, sharma}@cs.cornell.edu
Department of Computer Science
Cornell University

IEEE Communications Magazine, May 1998.
http://www.cs.cornell.edu/skeshav/papers/routertrends.pdf
This paper describes the advances in router design.

An extensible probe architecture for network protocol performance measurement

G. Robert Malan and Farnam Jahanian

Department of Electrical Engineering and Computer Science
University of Michigan
1301 Beat Avenue
Ann Arbor, Michigan 48109-2122
+l 734 647 8086
{rmalan, farnam}@eecs.umich.edu

Proceedings of the ACM SIGCOMM '98, August 31 - September 4, 1998, Vancouver Canada Pages 215 - 227
http://www.acm.org/pubs/articles/proceedings/comm/285237/p215-malan/p215-malan.pdf
(require acm digital library access permission)

Abstract--This paper describes the architecture and
implementation of Windmill, a passive network
protocol performance measurement tool. Windmill
enables experimenters to measure a broad
range of protocol performance metrics by both
reconstructing application-level network protocols
and exposing the underlying protocol layers’
events. Windmill is split into three
functional components: a dynamically compiled
Windmill Protocol Filter (WPF), a set of
abstract protocol modules, and an extensible
experiment engine. To demonstrate Windmill’s
utility, the results from several experiments are
presented. The first set of experiments suggests
a possible cause for the correlation between
Internet routing instability and network utilization.
The second set of experiments highlights:
Windmill’s ability to act as a driver for a complementary
active Internet measurement apparatus,
its ability to perform online data
reduction, and the non-intrusive measurement
of a closed system.

Router plugins: a software architecture for next generation routers

Dan Decasper, Zubin Dittia, Guru Parulkar and
[danlplattner] @ tik.ee.ethz.ch, [zubinlguru] @ arl.wustl.edu
I Computer Eng ineering and Networks Laboratory, ETH Zurich, Switzerland
Phone: +41-l -632 7019 Fax: +41-l -632 1035

Bernhard Plattner
Applied Research Laboratory, Washington University, St. Louis, USA
Phone: +l -314-935 4586 Fax: +l -314-935 7302

Proceedings of the ACM SIGCOMM '98, August 31 - September 4, 1998, Vancouver Canada Pages 229 - 240
http://www.acm.org/pubs/articles/proceedings/comm/285237/p229-decasper/p229-decasper.pdf
(require acm digital library access permission)

ABSTRACT--Present day routers typically employ monolithic
operating systems which are not easily upgradahle
and extensible. With the rapid rate of protocol
development it is becoming increasingly important
to dynamically upgrade router software in an incremental fashion.
We have designed and implemented
a high performance, modular, extended integrated
services router software architecture in the NetBSD
operating system kernel. This architecture allows
code modules, called plugins, to be dynamically
added and configured at run time. One of the novel
features of our design is the ability to bind different
plugins to individual flows; this allows for distinct
plugin implementations to seamlessly coexist in the
same runtime environment. High performance is
achieved through a carefully designed modular
architecture; an innovative packet classification
algorithm that is both powerful and highly efficient;
and by caching that exploits the flow-like characteristics
of Internet traffic. Compared to a monolithic
best-effort kernel, our implementation requires an
average increase in packet processing overhead of
only 8 % , or 500 cycles/2.lms per packet when running
on a P6/233.

Packet Classification, Route Lookup/Classification:

Fast Firewall Implementations for Softwarebased Routers

Lili Qiu
lqiu@cs.cornell.edu
Cornell University

George Varghese
varghese@cs.ucsd.edu
University of California, San Diego

Subhash Suri
suri@cs.wustl.edu
Washington University in St. Louis

Univ. of Cornell/TR 2000-1805, July 14, 2000
http://cs-tr.cs.cornell.edu:80/Dienst/UI/1.0/Display/ncstrl.cornell/TR2000-1805

Abstract: Routers must perform packet classification at high speeds to efficiently implement functions such as
Firewalls. The classification can be based on an arbitrary number of prefix and range fields in the packet
header. The classification required for Firewalls is beyond the capabilities offered by standard Operating
System classifiers such as BPF~\cite{MJ93}, DPF~\cite{EK96}, PathFinder~\cite{BGS94} and others. In
fact, there are theoretical results that show the general Firewall classification problem has poor worst case cost:
for searching over N arbitrary filters using k packet fields, either the worst-case search time is \Omega((\log
N)^{k-1}) or the worst-case storage is O(N^{k}). In this paper, we re-examine two basic mechanisms that
have been dismissed in the literature as being too inefficient: backtracking search and set pruning trees. We find
using real databases that the time for backtracking search is much better than the worst case bound; instead of
\Omega((logN)^{k-1}), the search time is only roughly twice the optimal search time\footnote{\scriptsize The
height of the multiplane trie is regarded as optimal search time throughout the paper, unless otherwise
specified.}. Similarly, we find that set pruning trees (using a DAG optimization) have much better storage costs
than the worst case bound; it has memory requirements similar to the RFC scheme of Gupta and
McKeown~\cite{GM99}. We also propose several new techniques to further improve the two basic
mechanisms. Our major ideas are a novel compression algorithm, the ability to trade smoothly between
backtracking and set pruning, and algorithms to effectively make use of hardware if hardware is available. We
quantify the performance gain of each technique using real databases. We show that on real Firewall databases
our schemes, with the accompanying optimizations, are close to optimal in time and storage.

Dynamic Algorithms with Worst-case
Performance for Packet Classification

Pankaj Gupta 1 and Nick McKeown
fpankaj, nickmg@stanford.edu

Computer Systems Laboratory, Stanford University
Stanford, CA 94305-9030

Proc. IFIP Networking, May 2000, Paris, France.
http://www-cs-students.Stanford.edu/~pankaj/paps/ifip00.pdf

Abstract. Packet classi cation involves --given a set of rules -- finding the
highest priority rule matching an incoming packet. When designing
packet classi cation algorithms, three metrics need to be considered:
query time, update time and storage requirements. The algorithms proposed
to-date have been heuristics that exploit structure inherent in
the classification rules, and/or trade o one or more metrics for others.
In this paper, we describe two new simple dynamic classi cation algorithms,
Heap-on-Trie or HoT and Binarysearchtree-on-Trie or BoT for
general classifiers. The performance of these algorithms is considered in
the worst-case, i.e., without assumptions about structure in the classification
rules. They are also designed to perform well (though not necessarily
the "best") in each of the metrics simultaneously.

Tradeoffs for Packet Classification

Anja Feldmann S. Muthukrishnan
AT&T Labs–Research
Florham Park, NJ
fanja,muthug@research.att.com

Proceedings of Gigabit Networking Workshop GBN 2000, 26 March 2000 - Tel Aviv, Israel
http://www.comsoc.org/socstr/techcom/tcgn/conference/gbn2000/anja-paper.pdf

Abstract—We present an algorithmic framework for solving the packet
classification problem that allows various access time vs. memory tradeoffs.
It reduces the multi-dimensional packet classification problem to solving a
few instances of the one-dimensional IP lookup problem. It gives the best
known lookup performance with moderately large memory space. Furthermore,
it efficiently supports a reasonable number of additions and deletions
to the rulesets without degrading the lookup performance. We perform
a thorough experimental study of the tradeoffs for the two-dimensional
packet classification problem on rulesets derived from datasets collected
from AT&T WorldNet, an Internet Service Provider.

A Modular Approach to Packet Classification: Algorithms and Results

Thomas Woo (Bell Laboratories, Lucent Technologies)

Proc. Infocom2000, Tel Aviv, March 26 - 30, 2000,
http://www.ieee-infocom.org/2000/papers/609.ps

Abstract—The ability to classify packets according to pre-defined rules
is critical to providing many sophisticated value-added services, such as se-curity,
QoS, load balancing, traffic accounting, etc. Various approaches to
packet classification have been studied in the literature with accompanying
theoretical bounds. Practical studies with results applying to large number
of filters (from 8K to 1 million) are rare.
In this paper, we take a practical approach to the problem of packet clas-sification.
Specifically, we propose and study a novel approach to packet
classification which combines heuristic tree search with the use of filter
buckets. Besides high performance and reasonable storage requirement,
our algorithm is unique in the sense that it can adapt to the input packet
distribution by taking into account the relative filter usage.
To evaluate our algorithms, we have developed realistic models of large
scale filter tables, and used them to drive extensive experimentation. The
results demonstrate practicality of our algorithms for even up to 1 million
filters.

Near-Optimal Routing Lookups with Bounded Worst Case Performance

Pankaj Gupta Balaji Prabhakar Stephen Boyd
Departments of Electrical Engineering and Computer Science
Stanford University, CA 94305.
pankaj@stanford.edu, balaji@isl.stanford.edu, boyd@stanford.edu

Proc. Infocom, March 2000, Tel Aviv, Israel.
http://www-cs-students.Stanford.edu/~pankaj/paps/infocom00.pdf

Abstract—The problem of route address lookup has received much attention
recently and several algorithms and data structures for performing
address lookups at high speeds have been proposed. In this paper we consider
one such data structure – a binary search tree built on the intervals
created by the routing table prefixes. We wish to exploit the difference in
the probabilities with which the various leaves of the tree (where the intervals
are stored) are accessed by incoming packets in order to speedup the
lookup process. More precisely, we seek an answer to the question “How
can the search tree be drawn so as to minimize the average packet lookup
time while keeping the worst-case lookup time within a fixed bound?” We
use ideas from information theory to derive efficient algorithms for computing
near-optimal routing lookup trees. Finally, we consider the practicality
of our algorithms through analysis and simulation.

Packet Classification on Multiple Fields

Pankaj Gupta and Nick McKcown
Computer Systems Laboratory, Stanford University
Stanford. CA 943059030
(pankaj, nickm} @stanford.edu

Proc. Sigcomm, September 1999, Harvard University.
http://www-cs-students.Stanford.edu/~pankaj/paps/sig99.pdf

Abstract
Routers classify packets to determine which flow they belong to,
and to decide what service they should receive. Classification may,
in general, be based on an arbitrary number of fields in the packet
header. Performing classification quickly on an arbitrary number of
fields is known to be difficult, and has poor worst-case performance.
In this paper, we consider a number of classifiers taken from
real networks. We find that the classifiers contain considerable
structure and redundancy that can be exploited by the classification
algorithm. In particular, we find that a simple multi-stage classification
algorithm, called RFC (recursive flow classification), can classify
30 million packets per second in pipelined hardware, or one
million packets per second in software.

Packet Classification using Hierarchical Intelligent Cuttings

Pankaj Gupta and Nick McKeown
Computer Systems Laboratory, Stanford University
Stanford, CA 94305-9030
{pankaj, nickm}@stanford.edu

Proc. Hot Interconnects VII, August 99, Stanford. Also in
IEEE Micro, pp 34-41, Vol. 20, No. 1, January/February 2000.
http://www-cs-students.Stanford.edu/~pankaj/paps/hoti99.pdf

Abstract
Internet routers that operate as firewalls, or provide a variety
of service classes, perform different operations on different
flows. A flow is defined to be all the packets sharing common
header characteristics; for example a flow may be defined as
all the packets between two specific IP addresses. In order to
classify a packet, a router consults a table (or classifier) using
one or more fields from the packet header to search for the
corresponding flow. The classifier is a list of rules that identify
each flow and the actions to be performed on each. With
the increasing demands on router performance, there is a
need for algorithms that can classify packets quickly with
minimal storage requirements and allow new flows to be frequently
added and deleted. In the worst case, packet classifi-cation
is hard requiring routers to use heuristics that exploit
structure present in the classifiers. This paper presents such a
heuristic, called HiCuts, (hierarchical intelligent cuttings),
which exploits the structure found in classifiers. We describe
HiCuts and examine its performance against real classifiers in
use today. When compared with previously described algorithms
and used to classify packets based on four header
fields, the algorithm is found to classify packets quickly and
has relatively small storage requirements.

Packet Classification using Tuple Space Search

V. Srinivasan S. Suri G. Varghese
cheenu@ccrc.wustl.edu, suri@cs.wustl.edu, varghese@ccrc.wustl.edu
Department of Cornputer Science, Washington University in St. Louis

Proc. Sigcomm99, August 30 - September 3, 1999, Cambridge United States, Pages 135 - 146
http://www.acm.org/pubs/articles/proceedings/comm/316188/p135-srinivasan/p135-srinivasan.pdf
(require acm digital library access permission)

Abstract
Routers must perform packet classification at high speeds
to efficiently implement functions such as firewalls and
QoS routing. Packet classification requires matching each
packet against a database of filters (or rules), and forwarding
the packet according to the highest priority filter.
Existing filter schemes with fast lookup time do
not scale to large filter databases. Other more scalable
schemes work for Z-dimensional filters, but their
lookup times degrade quickly with each additional dimension.
While there exist good hardware solutions, our
new schemes are geared towards software implementation.

We introduce a generic packet classification algorithm,
called Tuple Space Search (TSS). Because real databases
typically use only a small number of distinct field lengths,
by mapping filters to tuples even a simple linear search
of the tuple space can provide significant speedup over
naive linear search over the filters. Each tuple is maintained
as a hash table that can be searched in one memory access.
We then introduce techniques for further
refining the search of the tuple space, and demonstrate
their effectiveness on some firewall databases. For example,
a real database of 278 filters had a tuple space of
41 which our algorithm prunes to 11 tuples. Even as we
increased the filter database size from 1K to 100 K (using
a random two-dimensional filter generation model),
the number of tuples grew from 53 to only 186, and the
pruned tuples only grew from 1 to 4. Our Pruned Tuple
Space search is also the only scheme known to US that
allows fast updates and fast search times. We also show
a lower bound on the general tuple space search problem,
and describe an optimal algorithm, called Rectangle
Search, for two-dimensional filters.

BPF+: exploiting global data-flow optimization in a
generalized packet filter architecture

Andrew Begel, Steven McCanne and Susan L. Graham
University of California, Berkeley
{ abegel, mccanne, graham} @cs.berkeley.edu

Proc. Sigcomm99, August 30 - September 3, 1999, Cambridge United States, Pages 123 - 134
http://www.acm.org/pubs/articles/proceedings/comm/316188/p123-begel/p123-begel.pdf
(require acm digital library access permission)

Abstract
A packer filter is a programmable selection criterion for classifying
or selecting packets from a packet stream in a generic, reusable
fashion. Previous work on packet filters falls roughly into two categories,
namely those efforts that investigate flexible and extensible
filter abstractions but sacrifice performance, and those that focus
on low-level, optimized filtering representations but sacrifice flexibility.
Applications like network monitoring and intrusion detection,
however, require both high-level expressiveness and raw performance.
In this paper, we propose a fully general packet filter
framework that affords both a high degree of flexibility and good
performance. In our framework, a packet filter is expressed in a
high-level language that is compiled into a highly efficient native
implementation. The optimization phase of the compiler uses a
flowgraph set relation called edge dominators and the novel application
of an optimization technique that we call “redundant predicate
elimination,” in which we interleave partial redundancy elimination,
predicate assertion propagation, and flowgraph edge elimination
to carry out the filter predicate optimization. Our resulting
packet-filtering framework, which we call BPF+, derives from the
BSD packet filter (BPF), and includes a filter program translator, a
byte code optimizer, a byte code safety verifier to allow code lo migrate
across protection boundaries, and a just-in-time assembler to
convert byte codes to efficient native code. Despite the high degree
of flexibility afforded by our generalized framework, our performance
measurements show that our system achieves performance
comparable to state-of-the-art packet filter architectures and better
than hand-coded filters written in C.

High-speed policy-based packet forwarding using efficient multi-dimensional range matching

T. V. Lakshman and D. Stiliadis
Bell Laboratories
101 Crawfords Corner Rd.
HolmdeP, NJ 07733
{lakshman, stiliadi )@bell-labs.com

Proceedings of the ACM SIGCOMM '98, August 31 - September 4, 1998, Vancouver Canada Pages 203 - 214
http://www.acm.org/pubs/articles/proceedings/comm/285237/p203-lakshman/p203-lakshman.pdf
(require acm digital library access permission)

Abstract
The ability to provide differentiated services to users with
widely varying requirements is becoming increasingly important,
and Internet Service Providers would like to provide
these differentiated services using the same shared network
infrastructure. The key mechanism, that enables differentiation
in a connectionless network, is the packet classification
function that parses the headers of the packets,
and after determining their context, classifies them based
on administrative policies or real-time reservation decisions.
Packet classification, however, is a complex operation that
can become the bottleneck in routers that try to support
gigabit link capacities. Hence, many proposals for differentiated
services only require classification at lower speed
edge routers and also avoid classification based on multiple
fields in the packet header even if it might be advantageous
to service providers. In this paper, we present new packet
classification schemes that, with a worst-case and traffic
independent performance metric, can classify packets, by
checking amongst a few thousand filtering rules, at rates of
a million packets per second using range matches on more
than 4 packet header fields. For a special case of classification
in two dimensions, we present an algorithm that can
handle more than 128K rules at these speeds in a traffic independent
manner. We emphasize worst-case performance
over average case performance because providing differentiated
services requires intelligent queueing and scheduling of
packets that precludes any significant queueing before the
differentiating step (i.e., before packet classification). The
presented filtering or classification schemes can be used to
classify packets for security policy enforcement, applying resource
management decisions, flow identification for RSVP
reservations, multicast look-ups, and for source-destination
and policy based routing. The scalability and performance
of the algorithms have been demonstrated by implementation
and testing in a prototype system.

Small forwarding tables for fast routing lookups

Mikael Degermark, Andrej Brodnik, Svante Carlsson and Stephen Pink
micke@cdt.luth.se, Andrej.Brodnik@IMFM.Uni-Lj.SI, svante@sm.luth.se, steve@sics.se
Department of Computer Science and Electrical Engineering
Luleg University of Technology
S-971 87 Luleb, Sweden

Proceedings of the ACM SIGCOMM '97 September 14 - 18, 1997, Cannes France Pages 3 - 14
http://www.acm.org/pubs/articles/proceedings/comm/263105/p3-degermark/p3-degermark.pdf
(require acm digital library access permission)

Abstract
For some time, the networking community has assumed
that it is impossible to do IP routing lookups in software
fast enough to support gigabit speeds. IP routing
lookups must find the routing entry with the longed
matching prefix, a task that has been thought to require
hardware support at lookup frequencies of millions per
second.
We present a forwarding table data structure designed
for quick routing lookups. Forwarding tables are
small enough to fit in the cache of a conventional general
purpose processor. With the table in cache, a 200 MHz
Pentium Pro or a 333 MHz Alpha 21164 can perform a
few million lookups per second. This means that it is
feasible to do a full routing lookup for each IP packet
at gigabit speeds without special hardware.
The forwarding tables are very small, a large routing
table with 40,000 routing entries can be compacted to a
forwarding table of 150-160 Kbytes. A lookup typically
requires less than 100 instructions on an Alpha, using
eight memory references accessing a total of 14 bytes.

Scalable high speed IP routing lookups

Marcel Waldvogel, George Varghese, Jon Turner and Bernhard Plattner

Computer Engineering and Networks Laboratory
ETH Ziirich, Switzerland
{ waldvogel,plattner) @tik.ee.ethz.ch

Computer and Communications Research Center
Washington University in St. Louis, USA
{varghesedst} @ccrc.wustl.edu

Proceedings of the ACM SIGCOMM '97 September 14 - 18, 1997, Cannes France Pages 25 - 36
http://www.acm.org/pubs/articles/proceedings/comm/263105/p25-waldvogel/p25-waldvogel.pdf
(require acm digital library access permission)

Abstract
Internet address lookup is a challenging problem because of increasing
routing table sizes, increased traffic, higher speed links, and the
migration to 128 bit IPv6 addresses. IP routing lookup requires
computing the best matching prefix, for which standard solutions
like hashing were believed to be inapplicable. The best existing solution
we know of, BSD radix tries, scales badly as IP moves to
128 bit addresses, Our paper describes a new algorithm for best
matching prefx using binary search on hash tables organized by
prefix lengths. Our scheme scales very well as address and routing
table sizes increase: independent of the table size, it requires a worst
case time of log~(address bits) hash lookups. Thus only 5 hash
lookups are needed for IPv4 and 7 for IPv6. We also introduce Mutating
Binary Search and other optimizations that, for a typical IPv4
backbone router with over 33,000 entries, considerably reduce the
average number of hashes to less than 2, of which one hash can be
simplilied to an indexed array access. We expect similar average
case behavior for IPv6.

Policy Related Papers:

Policy Evaluation for Network Management

Randeep Bhatia (Bell Labs, Lucent Technologies),
Jorge Lobo (Bell Labs, Lucent Technologies),
Madhur Kohli (Bell Labs, Lucent Technologies)

Proc. Infocom2000, Tel Aviv, March 26 - 30, 2000,
http://www.ieee-infocom.org/2000/papers/669.ps

Abstract—Policies are increasingly being used to manage com-plex
communication networks. In this paper we present our work
on a “policy server” which is being used to provide centralized ad-ministration
of packet voice gateways and “soft switches” in next
generation circuit and packet telephony networks. The policies running
in the policy server are specified using a domain independent
policy description language (PDL).

This paper is motivated by the problem of evaluating policies
specified in PDL. We present an algorithm for evaluating policies
and study both its theoretical and empirical behavior. We show that
the problem of evaluating policies is quite intractable. However we
note that the hard instances of the policy evaluation problem are
quite rare in real world networks. Under some very realistic assumptions
we are able to show that our policy evaluation algorithm
is quite efficient and is well suited for enforcing policies in complex
networks. These results constitute the first attempt to develop a formal
framework to study the informal concepts of policy based net-work
management.

Related Conferences:

Infocom
http://www.cse.ucsc.edu/~rom/infocom2000/program.html

Gigabit Networking Workshop GBN 2000 - Proceedings
http://www.comsoc.org/socstr/techcom/tcgn/conference/gbn2000/

sigcomm
http://www.acm.org/pubs/contents/proceedings/comm/285237/