After its first 72 hours running against delinquent accounts, the new routing engine has performed remarkably well. In about an hour we’ll begin a wide deployment across the entire network. The upgrades should take approximately 3 hours.
The good news:
- Performance was off the charts. Always good when rolling out new software.
- We were never the bottleneck. This was a significant area of doubt: even though we had all the numbers in the world, we never had a model proven over time. Here is essentially how it works: messages are flushed to a queue and delivered sequentially from each processing node. As the mail load increases throughout the day and delivery slows, is it due to multiple ExchangeDefender connections or to saturation of the link? The good news is, it’s not us. In testing against 300+ IPs worldwide, when certain links showed slowdown, others went through just fine.
- Network congestion or server overload? This is something we are generally not alerted to, and something VARs rarely know how to access or have permission to view. Exchange 2007 does issue performance-based errors, but your weakling consumer firewalls do not – they just defer the connection or drop it outright.
- DNS issues? This one was fun... we pretty much DDoS’ed people 🙂 We found hosts that took forever to issue a banner, so we flooded them with SMTP connections. Then we started transferring 256 KB attachments, then 1 MB attachments. Guess what? They flew!!! We are narrowing this down to two causes: 1) Problems with DNS servers. 2) Excessive RDNS or RBL lookups.
Problems with DNS servers are more difficult to isolate because they may be sporadic, depending on server load. As most sites are not likely to run their own name servers or their own caching name servers, external lookups may take longer. Sites that ALWAYS had a terrible initial greeting are very likely just using dead RBLs or way too many antispam measures – ALL of which need to be shut off.
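If you want to check your own server for the slow-greeting symptom described above, here is a rough sketch in Python. It is not the tooling we ran on the network; the function names and the 5-second threshold are made up for illustration. It times the TCP connect separately from the wait for the SMTP greeting, since a fast connect followed by a long wait for the banner is the pattern that points at lookups on connect rather than at the link.

```python
import socket
import time


def time_smtp_banner(host, port=25, timeout=30.0):
    """Return (connect_seconds, banner_seconds, first_banner_line).

    connect_seconds: time to complete the TCP connection.
    banner_seconds:  time from connect until the first greeting line arrives.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        connected = time.monotonic()
        data = b""
        # Read until the first line of the greeting is complete.
        while b"\n" not in data:
            chunk = sock.recv(1024)
            if not chunk:  # server closed the connection early
                break
            data += chunk
        greeted = time.monotonic()
    first_line = data.split(b"\n", 1)[0].decode("ascii", "replace").strip()
    return connected - start, greeted - connected, first_line


def likely_cause(connect_s, banner_s, slow_banner_s=5.0):
    """Very rough classification; slow_banner_s is an arbitrary threshold."""
    if banner_s >= slow_banner_s and connect_s < banner_s:
        return "slow greeting: check DNS, RDNS and RBL lookups"
    return "greeting looks healthy"
```

For example, `likely_cause(*time_smtp_banner("mail.example.com")[:2])` (with your own host in place of the placeholder name) gives a first hint. Run it a few times across the day, since DNS trouble can be sporadic.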
So far we’re looking at the healthiest week on the network, despite DDoS and attacks as usual. Let’s see what we can pull up when the entire network is actively managing connectivity to target servers.