CLEANACCESS Archives

March 2006

CLEANACCESS@LISTSERV.MIAMIOH.EDU

Subject:
From:
Jason Richardson <[log in to unmask]>
Reply To:
Perfigo SecureSmart and CleanMachines Discussion List <[log in to unmask]>
Date:
Mon, 20 Mar 2006 17:02:31 -0600
Hi Raj, I really appreciate the quick reply.  Unfortunately, we spent
two hours on the phone with TAC and this never came up.  What they had
us do was a firmware upgrade of the BCOM NICs in the servers (we're
running MCS-7825-H1's with the BCOM 5702X NIC) which we just completed. 
We have not upgraded the firmware in the other two CASes or the CAMs yet
although they all have the same BCOM NIC.  When you say disable "OS
fingerprinting," do you mean uncheck both of the boxes - "Set client OS
to WINDOWS_ALL when Win32 platform is detected" and "Set Client OS to
WINDOWS_ALL when Windows TCP/IP stack is detected (Best Effort Match)"?
This is a real bummer since this is the feature that we were looking
forward to implementing the most.

Thanks,

Jason

---
Jason Richardson
Manager, Security Systems
Enterprise Systems Support
Northern Illinois University

>>> [log in to unmask] 3/20/2006 4:21:38 PM >>>
Jason,

We have recently discovered an issue with the OS fingerprinting feature
that can cause a kernel panic (machine hanging).  This issue is fixed in
3.6.2, which should be released late tonight.

To see if this is the issue affecting your machines, please turn off the
OS detection feature on the machine that is crashing and see if that
"fixes" the problem.  If that is the case, then I would recommend that
the OS fingerprinting feature be turned off until 3.6.2 is applied.
Note that this only happens in certain situations where there is a
client that deliberately sends certain null headers/mismatched TCP
headers.  Of course, when 3.6.2 is applied, you can turn the feature
back on.

Jason, could you send the messages (you can send them to me offline)
that appear on the console at the time of kernel panic?  That will help
better identify the root cause.

-Rajesh.

-----Original Message-----
From: Perfigo SecureSmart and CleanMachines Discussion List
[mailto:[log in to unmask]] On Behalf Of Jason Richardson
Sent: Monday, March 20, 2006 1:22 PM
To: [log in to unmask] 
Subject: CASes going down after upgrade to 3.6.1.1

Hi all, has anyone else had problems with their CASes after upgrading to
3.6.x?  Our 2 CAMs and 4 CASes were running fine after our upgrade last
week, but the students weren't back from break yet.  We came in this
morning to a trouble ticket from students reporting that they could not
log in.  Upon investigation we found one CAS totally unresponsive -
disconnected from the CAM and wouldn't respond to a ping or SSH.  We
literally had to power cycle it to get it back and that seemed to
resolve the problem.  This afternoon we got another trouble ticket
reporting the same problem and found another CAS in the same state with
a message on the console of "kernel panic - not syncing, fatal exception
in interrupt."  The interesting thing about the second one is that when
we did the upgrade it took almost 2x as long to install the OS as the
others and 2x as long to reboot after we applied the 3.6.1.1 patch.  The
first one that went down installed just fine, but also took a long time
to come back after applying the .1 patch.

We're on the phone with TAC now, but we were just wondering whether
anyone else had had similar problems.

Simon, to answer your question, we completed the entire upgrade of 2
CAMs and 4 CASes in under three hours and we considered it a total
success until today.

Thanks,

---
Jason Richardson
Manager, Security Systems
Enterprise Systems Support
Northern Illinois University
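
The symptom described in the original message (a CAS that stops answering ping and SSH) could in principle be watched for with a small reachability check. The following is only a rough sketch, not anything from the thread or from the product: the hostnames are hypothetical placeholders, and it assumes that an SSH port that no longer accepts TCP connections is a reasonable proxy for a hung server.

```python
import socket
from typing import Callable, Iterable, List


def ssh_port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def unresponsive(hosts: Iterable[str],
                 probe: Callable[[str], bool] = ssh_port_open) -> List[str]:
    """Return the hosts for which the probe fails, in input order."""
    return [h for h in hosts if not probe(h)]


if __name__ == "__main__":
    # Hypothetical CAS hostnames (assumption); replace with real addresses.
    cas_hosts = ["cas1.example.edu", "cas2.example.edu"]
    for host in unresponsive(cas_hosts):
        print(f"{host}: no response on TCP/22")
```

Run from cron or a monitoring host, a check like this would only flag a server after it has already hung; it does not address the kernel panic itself.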
