[ale] Little OT: Bad Linux Sysadmin Practices

Jay Lozier jslozier at gmail.com
Fri Oct 12 10:03:15 EDT 2012


On 10/12/2012 09:30 AM, Lightner, Jeff wrote:
> I worked for a company that had a large data center in Little Rock which had both the telecom systems and the banking/mortgage systems (mainframe and open systems) for multiple telecom, banking and finance companies (at the time 20% of mortgages in the US went through these systems).
>
> On two separate occasions while doing supposedly "planned" maintenance they took down the whole data center accidentally. In one of these events they had poorly wired one of the breaker cabinets and caused a short which made it fail. No problem because they had brilliantly set up the power so that failures from one cabinet would shunt the load to the next one. Oops - there's still a short so now we've fried that next cabinet. No problem because it fails over to another one etc... - OH WAIT....!
>
> It is truly gratifying when there is a major production outage that you as a sysadmin could NOT have prevented and do NOT have to be involved in resolving (at least until it comes time to power everything back up).
>
> At that same company a co-worker put together a very good backup policy on our first implementation of NetBackup, but when management saw how much all the tapes required for the various retention levels would cost, they balked. They nixed his plan, said they'd never need a backup more than 6 months old, and "saved money" by buying fewer tapes and shortening the retention policies. 8 months later, when they asked for a backup from the first month due to a critical issue, they asked why we didn't have it. When we told them (and even showed them in writing) that they had said they would never need more than 6 months, did they accept responsibility for the poor decision? They did not. They instead said that the admins were not forceful enough in trying to convince them of the need for the original plan. Sometimes you just can't win.
>
Did management have pointy hair? It sounds like they let the bean
counters determine all policy, regardless of whether there was an
overriding technical/commercial issue involved. I wonder what the
contractual obligations to the end users were? The contracts probably
specified the minimum backups required, and that at a minimum should
have driven the backup policy. From your comments it sounds like the
PHBs ignored the contractual obligations (grounds for a
breach-of-contract lawsuit).

<snip>


-- 
Jay Lozier
jslozier at gmail.com


