Lili Qiu, George Varghese and Suri Subash
July 14, 2000
URN = ncstrl.cornell/TR2000-1805
URL = http://cs-tr.cs.cornell.edu:80/Dienst/UI/1.0/Display/ncstrl.cornell/TR2000-1805
Abstract: Routers must perform packet classification at high speeds
to efficiently implement functions such as
firewalls. The classification can be based on an arbitrary number of
prefix and range fields in the packet
header. The classification required for firewalls is beyond the capabilities
offered by standard Operating
System classifiers such as BPF~\cite{MJ93}, DPF~\cite{EK96}, PathFinder~\cite{BGS94}
and others. In
fact, there are theoretical results that show the general firewall
lassification problem has poor worst case cost:
for searching over N arbitrary filters using k packet fields, either
the worst-case search time is \Omega((\log
N)^{k-1}) or the worst-case storage is O(N^{k}). In this paper, we
re-examine two basic mechanisms that
have been dismissed in the literature as being too inefficient: backtracking
search and set pruning trees. We
find using real databases that the time for backtracking search is
much better than the worst case bound;
instead of \Omega((logN)^{k-1}), the search time is only roughly twice
the optimal search
time\footnote{\scriptsize The height of the multiplane trie is regarded
as optimal search time throughout the
paper, unless otherwise specified.}. Similarly, we find that set pruning
trees (using a DAG optimization) have
much better storage costs than the worst case bound; it has memory
requirements similar to the RFC scheme
of Gupta and McKeown~\cite{GM99}. We also propose several new techniques
to further improve the two
basic mechanisms. Our major ideas are a novel compression algorithm,
the ability to trade smoothly between
backtracking and set pruning, and algorithms to effectively make use
of hardware if hardware is available. We
quantify the performance gain of each technique using real databases.
We show that on real firewall databases
our schemes, with the accompanying optimizations, are close to optimal
in time and storage.