Skip to content
Operations Research Center
Seminars & Events
   

Fall 2005 Seminar Series

MASSACHUSETTS INSTITUTE OF TECHNOLOGY
OPERATIONS RESEARCH CENTER
FALL 2005 SEMINAR SERIES

DATE: Thursday, September 22, 2005
LOCATION: E40-298
TIME: 4:15pm
Reception immediately following in the Philip M. Morse Reading Room, E40-106

SPEAKER:
Murali Kodialam
Research Staff Member
Networking Research Laboratory
Bell Labs

TITLE
Estimating Traffic Rates by Counting Coincidences

ABSTRACT
A typical core router in the internet processes millions of packets per second. Studies have shown that there can be hundreds of thousands of active flows at any given instant at any router. Measuring traffic for each flow at a core router, is one of the most challenging network management problems in the internet. Per-flow network traffic measurement is needed for effective traffic management, performance assessment, and detection of anomalous network events such as incipient denial of service (DoS) attacks. Explicit measurement of per-flow traffic statistics is difficult because tracking the possibly hundreds of thousands of flows needs correspondingly large high-speed memory as well as increased per-arrival processing. To reduce the measurement overhead, many previous approaches as well as commercial routers (Cisco's NetFlow) use random sampling. The memory requirement for random sampling is quite high and can only be reduced by decreasing the estimation accuracy.

 

Our goal was to develop per-flow traffic estimation methodology that has low memory requirements and has quick convergence to within a pre-specified accuracy. We achieve this by use of a novel approach based on counting coincidences in the traffic stream. (A flow has a coincidence when two consecutive samples belong to the same flow). Sampling coincidences automatically biases the samples towards the larger flows thereby making the estimation of these sources more accurate. This biased sampling leads to significantly smaller memory requirement compared to random sampling schemes. We show that the scheme is close to optimal both in terms of memory requirement as well as estimation time. The scheme is very simple to implement and performs extremely well both on synthetic and trace data.


Back to Seminar Series schedule page