Date: | Thu, 21 Jul 2005 14:18:53 -0400 |
Content-Type: | text/plain |
So while we are on the subject of failover: what are your thoughts on
failover for the manager? We have not implemented this; rather, we have a
"warm" spare ready to take over if needed.
Thanks for the input....
Brad
PS: Perfigo v3.2.13 (Yeah I know we are WAY behind, but it works....)
On 7/21/05 2:10 PM, "Aaron Havens" <[log in to unmask]> wrote:
> Eric Weakland wrote:
>>
>> All,
>>
>> First of all - thanks to all of you on this list, it is a great resource
>> for us here at American.
>>
>> I wanted to let anyone out there who is going to be implementing
>> failover bundles know of a rather alarming series of events that
>> happened to us yesterday. We have Clean Access high availability
>> bundles implemented doing VLAN retagging on both Ethernet
>> interfaces. In the documentation it says that you can "optionally" use
>> serial cables to also send the heartbeat information for failover, but
>> that the heartbeats will always be sent by eth1 from one server to
>> another in a pair. Here is a quote: "The serial connection essentially
>> provides an additional method of heartbeat exchange that must fail
>> before the standby system can take over. Note however that only the eth1
>> connection between the peers is mandatory." from
>> http://www.cisco.com/application/pdf/en/us/guest/products/ps6128/c1616/ccmigration_09186a00803e0969.pdf
>>
>>
>> The short story is that in our experience you should not depend on
>> eth1 alone. Our team implemented failover on Monday, and on Wednesday
>> the CPU utilization on our core routers (three Cat6500 series MSFC2s)
>> suddenly went to 99%, when it is usually at 15% max. Clients couldn't get
>> DHCP addresses, couldn't get to the internet, etc. It looked exactly
>> like a DDoS attack; that was what TAC told us to look for, and they made
>> sure we had the commands to track the DoSers down. We realized soon
>> after that, however, that all of our full input queues were on segments
>> that had been migrated to CCA. A little more empirical testing revealed
>> that if we shut down just the CCA standby servers, all was well. So
>> essentially the servers were fighting for the primary role and hammering
>> the routers with ARP storms.
>>
>> We will be implementing Serial Failover cables before turning the
>> failover boxes back on.
>>
>> Cheers!
>>
>> Eric Weakland
>> CNE, CISSP
>> Director, Network Security
>> Office of Information Technology (IT)
>> American University
>> eric(at)american.edu
>
> Thank you for that. I am about to finally get my failover servers set up
> today. I will make sure I use a serial cable also.
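
For anyone following the thread, here is a rough sketch of why the second
heartbeat path matters. It is a generic model of the rule quoted from the
documentation above (every heartbeat path must fail before the standby takes
over), not Clean Access's actual implementation. The names HeartbeatLink and
should_take_over and the 15-second dead time are assumptions made up for
illustration.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatLink:
    """One heartbeat path between the two peers (hypothetical model)."""
    name: str
    last_seen: float  # time, in seconds, at which the last heartbeat arrived


def should_take_over(links, now, dead_time=15.0):
    """Return True only if every heartbeat link has been silent for longer
    than dead_time. The 15 s default is an illustrative value, not a
    vendor setting."""
    if not links:
        raise ValueError("at least one heartbeat link is required")
    return all(now - link.last_seen > dead_time for link in links)


# Scenario: heartbeats over eth1 stopped arriving at t=0, but the primary
# itself is still up and still sending heartbeats over the serial cable.
now = 60.0
eth1_only = [HeartbeatLink("eth1", last_seen=0.0)]
eth1_and_serial = [
    HeartbeatLink("eth1", last_seen=0.0),
    HeartbeatLink("serial", last_seen=59.0),
]

# With eth1 as the only path, the standby promotes itself while the primary
# is still active, so both boxes claim the primary role.
print(should_take_over(eth1_only, now))

# With the serial link still delivering heartbeats, the standby stays put.
print(should_take_over(eth1_and_serial, now))
```

With a single path, any loss of heartbeats on eth1 is indistinguishable from
a dead primary, which matches the two servers fighting for the primary role
described above.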
-----------------------------------
Bradford B. Saul
Lead Network Engineer
IT - Network Engineering
Hoffman Hall Room 10
MSC 0601
James Madison University
Harrisonburg, VA 22807
V: (540) 568-2379
F: (540) 568-1696
M: (540) 435-3079