That old chestnut, the weird ACE log issue, now has a bug ID (CSCso15332)! It only appears to happen if the rservers are offline in the original context. Bizarre.
Here is what Cisco has said on the matter:
The customer was using the same syslog host in two contexts, had configured http and https probes in both contexts, and was logging probe failures in both contexts. The rservers in the second context were unavailable when the issue began; later, those rservers became available and the inappropriate logging ceased.
Further Problem Description:
The customer experienced this problem sporadically for approximately ten days on one rserver. Even during that period there were at most a moderate number of messages (20 to 30) a day.
Had some minor drama with the core switch clock being out by well over an hour. Not causing any issue except frustration when viewing logs and trying to compare timestamps across devices.
All the network devices get their time sync from the WAN router (which in turn contacts our NTP server for its time).
To fix it up: remove all the NTP-related statements from the Cisco 6509, manually set the clock to within 5 minutes of the actual time as reported by the NTP server, then re-apply the ‘ntp server’ statement(s). After a short while the clock will re-sync correctly with the NTP server specified.
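The remove / set clock / re-apply sequence above can be sketched as a small Python helper that just builds the IOS command lines in order. This is a rough sketch, not something pulled from the actual 6509: the function names and the 192.0.2.1 server address are placeholders I made up, and it assumes the only NTP statements present are ‘ntp server’ and ‘ntp clock-period’.

```python
from datetime import datetime

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def within_tolerance(device_time, reference_time, max_seconds=300):
    """True if the device clock is within max_seconds (5 minutes) of the reference."""
    return abs((device_time - reference_time).total_seconds()) <= max_seconds


def ntp_resync_commands(servers, now):
    """Build the IOS command lines for the sequence described above.

    servers: NTP server addresses (hypothetical values, supply your own)
    now:     the actual time as reported by the NTP server
    """
    lines = ["configure terminal", "no ntp clock-period"]
    # Step 1: remove the existing NTP server statements.
    lines += [f"no ntp server {s}" for s in servers]
    lines.append("end")
    # Step 2: set the clock by hand (exec mode: hh:mm:ss day month year).
    lines.append(
        f"clock set {now:%H:%M:%S} {now.day} {MONTHS[now.month - 1]} {now.year}"
    )
    # Step 3: re-apply the server statements and check the result.
    lines.append("configure terminal")
    lines += [f"ntp server {s}" for s in servers]
    lines += ["end", "show ntp status"]
    return lines


# Placeholder address and time, for illustration only.
for line in ntp_resync_commands(["192.0.2.1"], datetime(2008, 3, 5, 14, 30, 0)):
    print(line)
```

The commands are only printed here; pasting them into the switch (and removing any other NTP statements that are not covered) is still a manual job.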
Never apply the ‘ntp clock-period’ statement by hand; the device generates it itself. I think someone manually applied it, and that was why the time started to drift way out.
In other words, make sure you never copy the ‘ntp clock-period’ statement from one device to another. The clock-period is maintained by NTP to regulate the internal clock on the device. As far as I can tell from the Cisco documentation, it is the amount of time added to the software clock on each hardware tick, in units of 2^-32 seconds. So NTP doesn’t simply keep resetting the time on the switch; it adjusts the clock-period so that the system clock stays synchronized with the NTP server.
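To put rough numbers on why a copied clock-period hurts, here is a small Python sketch of the arithmetic. It assumes (from my reading of the IOS command reference, so treat it as an assumption) that the value is expressed in units of 2^-32 seconds per hardware tick and that 17179869 is the documented default; the "correct" value in the example is invented purely for illustration.

```python
TICK_UNIT = 2 ** -32             # seconds; assumed unit of 'ntp clock-period'
DEFAULT_CLOCK_PERIOD = 17179869  # assumed documented default, roughly 4 ms


def tick_seconds(clock_period):
    """Time added to the software clock on each hardware tick, in seconds."""
    return clock_period * TICK_UNIT


def drift_seconds_per_day(copied, correct):
    """Approximate free-running drift per day when a device uses `copied`
    although its own hardware really needs `correct`.
    Positive means the clock runs fast, negative means it runs slow."""
    return (copied / correct - 1.0) * 86400


print(tick_seconds(DEFAULT_CLOCK_PERIOD))

# Hypothetical: this box's hardware needs 17180000, but 17179869 was pasted in.
print(drift_seconds_per_day(DEFAULT_CLOCK_PERIOD, 17180000))
```

This only models a clock left to run with the wrong value; with NTP active the device keeps correcting the figure, which is exactly why it should be left alone.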
I’ve sent more info to the TAC about that ACE logging issue, which has not shown up again for a few weeks. It stopped suddenly after the production rservers in the ‘correct’ context were finally started up (the context was prepared and active prior to server deployment). In any case, the TAC has escalated it to the engineering team for analysis. I agreed with one of Cisco’s engineers, who said “this shouldn’t be possible”. Love it :)
We deployed a Cisco WAAS appliance to the Brisbane office. All seemed OK; however, the voice team said that ‘at some stage after the WAAS was turned on’ they started getting messages-on-hold errors. Strange, given that WAN optimisers ignore UDP. Because the deployment uses WCCP, it is likely something has gone wrong with the router, or it could be unrelated altogether.
Also, I learned a valuable lesson the other night: never gloss over the ‘clear xlate’ command on a FWSM. When messing around with static NATs, hours of troubleshooting can be saved by using that command … bah.