[Dirvish] backup + nagios

Paul Slootman paul at debian.org
Sun Aug 21 21:04:48 UTC 2005

On Sun 21 Aug 2005, Jason Boxman wrote:
> On Sunday 21 August 2005 16:11, Mateusz Pospieszny wrote:

> > backup partition as a software RAID 1, I've written the attached 2 scripts
> > to get nagios http://www.nagios.org/ to monitor the completion of the
> > daily backups.
> Excellent idea.  I have had the same thing happen to me once.  Fortunately I 
> discovered why I had no backups before I needed them and started the process 
> again.

I'll have a look at applying those scripts at work, could be useful.

There are also other things to monitor, which aren't as obvious.
Currently there's also a Tivoli Storage Manager tape backup running off
the backup systems, and that eats a lot of memory (I suspect a bad
memory leak in the client). Once, the Linux kernel ran out of memory and
picked cron as the sacrificial goat to the memory gods. Unfortunately we
only noticed after we needed a backup and the last one was 3 weeks
old... Since then we let nagios also check that cron is running :-)
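
A minimal sketch of such a cron check, written as a Nagios-style plugin in
Python. It is not the check used on the systems described above (the stock
Nagios plugins include check_procs, which is the more usual choice). The
assumptions here are a Linux /proc that provides /proc/<pid>/comm, and that
the daemon is named "cron" or "crond"; the function names are made up for
illustration. Only the exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN)
follow the standard plugin convention.

```python
"""Nagios-style check that a cron daemon is running (illustrative sketch)."""
import os
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def find_pids(names, proc_root="/proc"):
    """Return sorted PIDs whose command name (<proc_root>/<pid>/comm) is in names."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                comm = fh.read().strip()
        except OSError:
            continue  # process exited while we were scanning
        if comm in names:
            pids.append(int(entry))
    return sorted(pids)


def check_cron(names=("cron", "crond"), proc_root="/proc"):
    """Return (exit_code, message) in Nagios plugin style."""
    try:
        pids = find_pids(names, proc_root)
    except OSError as exc:
        return UNKNOWN, f"CRON UNKNOWN - cannot read {proc_root}: {exc}"
    if pids:
        return OK, f"CRON OK - running with pid(s) {', '.join(map(str, pids))}"
    return CRITICAL, "CRON CRITICAL - no cron process found"


def main():
    """Print the status line and return the exit code for Nagios."""
    code, message = check_cron()
    print(message)
    return code

# When installed as a plugin script, finish with: sys.exit(main())
```

Nagios reads the first line of output as the status text and the exit code
as the state, so a killed cron shows up as CRITICAL on the next check
instead of three weeks later.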

> <snip>
> > P.S. I have about 52GB of combined data in dirvish, keeping the last 2
> > weeks worth of daily backups, then keeping one backup per month after
> > that. This is from 4 servers, two of them storing a large amount of
> > small files (Maildir format Inboxes). This seems to generate a lot of
> > usage on the backup drives. I've already had two 200GB IDE drives die on
> > me. Thankfully my backup partition is on a mirrored software raid now.
> Yeah, I lost a Maxtor 120GB myself and then its RMA'd replacement.  None of my 
> WDs or Seagates have failed, though.

I've sworn off Maxtors as well. I'm currently a WD fan, although Hitachi
(ex-IBM) is looking good as well. I'm on the fence with Seagate :)
We now have a nice storage system to play with at work, with 24 hotswap
SATA disks (400GB WDs), connected to 2 3ware raid controllers. That's to
replace the tape backup, which is currently our long-term backup.

I'm looking at hacking into dirvish-expire, to move the to-be-expired
images to this system instead of removing them outright. Ideally a
dirvish-like structure would appear on the destination as well, to
minimize storage requirements. Having a month or two of backups would be
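
The "dirvish-like structure on the destination" idea can be sketched in
Python as follows. This is not a patch to dirvish-expire and not how dirvish
itself works internally; it only imitates, for locally mounted paths, what
rsync's --link-dest option does: files unchanged since a previously archived
image are hard-linked instead of copied. The function names and the
size-plus-mtime comparison are assumptions made for illustration, and
symlinks, ownership and special files are not handled.

```python
import os
import shutil


def same_file(src, ref):
    """Quick check: treat files as identical if size and mtime (seconds) match."""
    try:
        s, r = os.stat(src), os.stat(ref)
    except OSError:
        return False
    return s.st_size == r.st_size and int(s.st_mtime) == int(r.st_mtime)


def archive_image(src_tree, dest_tree, ref_tree=None):
    """Copy src_tree to dest_tree, hard-linking files unchanged since ref_tree.

    ref_tree (an earlier archived image) must be on the same filesystem as
    dest_tree, because hard links cannot cross filesystems.
    Returns a (linked, copied) tuple of file counts.
    """
    linked = copied = 0
    for dirpath, _dirnames, filenames in os.walk(src_tree):
        rel = os.path.relpath(dirpath, src_tree)
        dest_dir = os.path.normpath(os.path.join(dest_tree, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dest = os.path.join(dest_dir, name)
            ref = os.path.join(ref_tree, rel, name) if ref_tree else None
            reusable = (
                ref is not None
                and os.path.isfile(ref)
                and not os.path.islink(ref)
                and same_file(src, ref)
            )
            if reusable:
                os.link(ref, dest)  # share the data blocks with the older image
                linked += 1
            else:
                shutil.copy2(src, dest)  # copy2 keeps mtime for later comparisons
                copied += 1
    return linked, copied
```

An expire hook would call something like archive_image() on the image's tree
with the most recently archived image as ref_tree, and remove the original
only after the copy succeeded. Between two separate machines, rsync with
--link-dest would be the more realistic tool.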

Paul Slootman
