[ale] Samba: file corruption on write to share followed by hang

Michael H. Warfield mhw at WittsEnd.com
Tue Dec 15 10:01:33 EST 2009


On Tue, 2009-12-15 at 00:00 -0500, Jeff Hubbs wrote:
> OK, but being ECC RAM, wouldn't something have shown up in 
> /var/log/kernel?  How could I tell other than using FSM-style faith?

	I don't believe there's a specific interrupt or error dedicated to
memory parity or ECC failures.  I think such a failure generates an NMI
(Non-Maskable Interrupt), but a lot of things can generate one (the
kernel just logs something like "Unexpected NMI. Dazed and confused, but
trying to continue").  I don't know whether the memory controller
records an indication of it anywhere; that might depend on your
hardware.  Obviously, once you take a non-recoverable memory hit,
everything becomes suspect.
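
	One place that may be worth a look, assuming the kernel has an EDAC
driver loaded for your chipset (not a given on every board), is the EDAC
sysfs tree, which keeps per-controller counts of corrected and
uncorrected errors.  A minimal sketch that reads them; the function name
read_edac_counts is just made up for illustration:

```python
from pathlib import Path

# Standard Linux EDAC sysfs location; only present when an EDAC driver is loaded.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root=EDAC_ROOT):
    """Return {controller_name: (corrected, uncorrected)} for each mc* directory.

    Returns an empty dict if the EDAC tree is missing (no driver, or not Linux).
    """
    counts = {}
    root = Path(root)
    if not root.is_dir():
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            # Counter file missing or unreadable; skip this controller.
            continue
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found (driver not loaded?)")
    for name, (ce, ue) in counts.items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

	An empty result only means the kernel isn't reporting anything through
EDAC; it doesn't prove the RAM is good.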

> Jim Kinney wrote:
> > Bad ECC RAM is still bad RAM. ECC can only correct a single bit flip 
> > in register. 2 bit flips and it's all toast.
> >
> > It does sound like Samba managed to totally corrupt itself, and the
> > hang later may have been related to the system thrashing RAM around.
> > The filesystem definitions are in kernel space, so Samba has to access
> > that to function. That just restarting Samba helped is a pretty good
> > indication that it was memory associated with the Samba process. The
> > aggressive caching of the kernel will amplify a bad memory situation.
> > Restarting Samba will cause the Samba caching to also restart, and that
> > may have overwritten the bad data portion which was related to the
> > filesystem management area.
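
	To illustrate the single-bit versus two-bit point, here is a toy
sketch of a SECDED (single error correct, double error detect) code, the
general scheme typical ECC memory is based on.  Real DIMMs use a much
wider code (commonly 64 data bits plus 8 check bits); the 4-data-bit
extended Hamming code and the encode/decode names below are assumptions
chosen only to keep the example small:

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming (SECDED) codeword.

    Index 0 holds the overall parity bit; indices 1..7 form Hamming(7,4)
    with check bits at positions 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]  # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2            # makes total parity even
    return code


def decode(code):
    """Return (status, nibble); nibble is None if the word is uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                # XOR of the positions of set bits
    overall = sum(code) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd parity: treat as one flipped bit, located by the syndrome
        # (syndrome 0 means the overall parity bit itself flipped).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome with even parity: two bits flipped.
        return "uncorrectable", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


if __name__ == "__main__":
    word = encode(0b1011)
    one_flip = list(word)
    one_flip[5] ^= 1
    two_flips = list(one_flip)
    two_flips[6] ^= 1
    print(decode(word))
    print(decode(one_flip))
    print(decode(two_flips))
```

	With a code like this, a two-bit flip is detected but can't be
repaired, so the data is lost even though the error is normally flagged
rather than passed along silently.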

	Mike

-- 
Michael H. Warfield (AI4NB) | (770) 985-6132 |  mhw at WittsEnd.com
   /\/\|=mhw=|\/\/          | (678) 463-0932 |  http://www.wittsend.com/mhw/
   NIC whois: MHW9          | An optimist believes we live in the best of all
 PGP Key: 0x674627FF        | possible worlds.  A pessimist is sure of it!