On CentOS 5.2 I found that the audit daemon will crash under heavy load with its default settings. I was using the CIS Benchmarks to help lock down some machines and made some of the changes they suggest for auditd. After making the changes I ran one of the scripts to find files that did not belong to any user or group currently on the system. Here is that line:
for PART in $(grep -v '^#' /etc/fstab | awk '( $3 ~ "ext[23]" ) { print $2 }'); do find "$PART" -xdev \( -nouser -o -nogroup \) -print; done
About halfway through the find, auditd would die with these messages in syslog:
auditd[6135]: Cannot allocate audit reply
auditd[6135]: Cannot allocate audit reply, exiting
auditd[6135]: Error receiving audit netlink packet (No buffer space available)
auditd[6135]: Error setting audit daemon pid (No buffer space available)
The netlink error appeared because the messages going to a syslog server were getting cut off. That still did not explain why auditd died in the first place. Much googling of the error turned up nothing. The only place I saw it was in the source code of the program itself. You know it's bad when that's the only spot the error shows up in.
I read the man page for auditctl a few times and bumped the buffers (-b) up to a massive number, but nothing kept it from crashing on the find. Finally I saw the -r setting, which rate limits the messages. The -r setting in the audit.rules file sets the limit in messages/sec. If the rate is non-zero and is exceeded, the kernel consults the failure flag (-f) to decide what to do. The failure action can be to do nothing, print a message, or panic; the default is to print a message. The default rate limit is 0, which means no limit. With no limit, the find must have been killing auditd. Setting the limit to 21000 (-r 21000) in /etc/audit/audit.rules fixed it.
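For reference, the relevant lines in /etc/audit/audit.rules might look something like the sketch below. The -r value is simply the one that worked on my system, and -f 1 only spells out the default failure action; I have left -b out because raising it made no difference here.

```
# Failure flag: 0 = do nothing, 1 = print a message (default), 2 = panic
-f 1
# Rate limit in messages/sec; 0 (the default) means no limit
-r 21000
```

The rules file is read when auditd starts, so the daemon needs a restart before the new limit takes effect.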
Just a warning: I believe this will keep some things from being logged when auditd hits its rate limit. So if your environment must have all audit logs, this might not be for you. The rate of 21000 is what I found to be high enough on my system that auditd would not crash. You should experiment with bumping the limit higher if you can.
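One way to check whether events really are being dropped is to watch the lost counter that auditctl -s reports. The helper below is only a sketch: the name audit_lost is made up for illustration, and it assumes the status output shows the counter as either lost=N or lost N, which may not match every audit version.

```shell
# Hypothetical helper: read `auditctl -s` output on stdin and print the
# value of the "lost" counter. Accepts both "lost=N" and "lost N" layouts
# (an assumption about the status format, not a guarantee).
audit_lost() {
    sed -n 's/.*lost[= ]\([0-9][0-9]*\).*/\1/p'
}

# Usage (as root): auditctl -s | audit_lost
```

Run it before and after a heavy job such as the find above. If the number goes up, some audit messages were dropped along the way.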