Notes on Bro 0.8a58 vs. 0.8a70
To help determine the differences in behavior between 0.8a58 and
0.8a70, I ran the two versions side by side. While the CHANGES file
gave no direct indication of significant internal changes, it seems
that something has changed which results in less thrashing in both cpu
usage and lag time (explained below).
The configuration of the two instances was identical, except for the
addition of two policy files "flag-warez.bro" and "flag-irc.bro",
neither of which makes a significant memory or cpu impact.
Also note that there were several changes to the base bro config.
See here for details.
The data was collected from a single time window, with both versions running at the same time, so it should be treated with
the skepticism that such a limited sample deserves.
Memory Footprint
As seen here, the memory allocation pattern seems similar, but the
overall use is significantly lower.

The units on the vertical axis are KB, while the horizontal axis is in
multiples of 5-second intervals.
CPU Footprint
In general, 0.8a70 took a higher cpu load than 0.8a58. What is
interesting about the graphs is the consistency of usage: in the
older version the average usage is lower (as seen by running top while
both were running), but the variation differs significantly between the
two versions.
In the new version:

compared to the older version:

Note: I retained the line representation of the data to give a
better envelope effect. Dots or hashes end up striated because of
the large number of data points relative to the 100 possible
values. What is interesting is the relative stability of the cpu
load in the new version compared with the old.
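One way to put a number on "stability" is to report the spread of the cpu samples alongside their average. The following is a minimal sketch, assuming the samples are available as a list of percentages. The sample values are synthetic and chosen only to illustrate the idea; they are not measurements from either version, and the names `summarize`, `bursty`, and `steady` are hypothetical.

```python
import statistics


def summarize(cpu_percent):
    """Return (mean, population standard deviation) of cpu samples."""
    return statistics.mean(cpu_percent), statistics.pstdev(cpu_percent)


# Synthetic samples for illustration only, NOT data from either Bro version.
bursty = [5, 60, 10, 55, 5, 45]    # lower average, large swings
steady = [38, 42, 40, 41, 39, 40]  # higher average, small swings

bursty_mean, bursty_sd = summarize(bursty)
steady_mean, steady_sd = summarize(steady)

print(f"bursty: mean={bursty_mean:.1f} sd={bursty_sd:.1f}")
print(f"steady: mean={steady_mean:.1f} sd={steady_sd:.1f}")
```

A series can have the lower mean and still the larger standard deviation, which is the pattern described above for the older version.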
Lag Time
Lag time is the difference between 'clock time' and the timestamp
recorded in the pcap data structure. The larger the value, the
more bro has backed up computationally, i.e. a large value implies that
packets are backing up somewhere in the analysis line. Such a
backlog need not show up as a large cpu value, given the
significant amount of IO and interrupt activity in the application.
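The definition above amounts to a single subtraction per packet. A minimal sketch, assuming both values are expressed in seconds since the epoch; the function name `lag_seconds` and the sample pairs are hypothetical and serve only as illustration:

```python
def lag_seconds(clock_time: float, pcap_timestamp: float) -> float:
    """Lag = wall-clock time when the packet is handled minus the
    timestamp recorded for that packet in the pcap data structure."""
    return clock_time - pcap_timestamp


# Hypothetical (clock_time, pcap_timestamp) pairs, for illustration only.
samples = [
    (1000.5, 1000.0),
    (1001.25, 1001.0),
    (1004.0, 1002.0),
]

lags = [lag_seconds(clock, ts) for clock, ts in samples]
print(lags)
```

A rising sequence of lag values would suggest that packets are queuing faster than they are being analyzed.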
Here I removed the lag data that occurred during and after the
checkpoint time. Not only is it not particularly useful, but the
values are several orders of magnitude larger than the data we want to
look at and swamp out everything else.

Here the units on the vertical axis are seconds.
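Removing the lag data from the checkpoint onward can be sketched as a simple filter. This assumes the samples are stored as (time, lag) pairs and that the checkpoint start time is known; the name `drop_checkpoint_lag` and all values are hypothetical, not taken from the actual run.

```python
def drop_checkpoint_lag(samples, checkpoint_start):
    """Keep only (time, lag) samples taken before the checkpoint began."""
    return [(t, lag) for t, lag in samples if t < checkpoint_start]


# Hypothetical samples: (seconds into the run, lag in seconds).
samples = [(0, 0.2), (5, 0.3), (10, 450.0), (15, 900.0)]

kept = drop_checkpoint_lag(samples, checkpoint_start=10)
print(kept)
```

Dropping the later samples keeps the vertical scale set by ordinary lag values rather than by the much larger checkpoint-period values.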
The difference in lag time between the two versions is quite
interesting. I suspect that whatever caused the smoothing out in
cpu usage may also be responsible for this behavior(?).