[Dirvish] Dirvish related boot slowdowns (was: My recent Dirvish list post attempt)
keithl at kl-ic.com
Tue Feb 21 23:49:40 UTC 2006
On Tue, Feb 21, 2006 at 02:11:35PM -0800, Eric Hall wrote:
> I have a broken Debian Sarge box whose troubleshooting attempts are
> crippled by what appears to be a consistency check on boot to the
> dirvish_vault. The check takes about 45 minutes. I would like to
> disable so I can get some quicker boot cycles to try and identify the
> problem. I cannot for the life of me find where this routine is
> starting. There is nothing I can find in the rcx.d folders, the
> /etc/dirvish folder is empty, and I have moved all the dirvish related
> perl scripts from the /sbin folder. I have also looked in the
> /etc/fstab to see if there was something there that dirvish installed to
> be mounted at boot. Nothing. I am stumped. How do I disable this?
I can guess what is happening. You may be seeing an fsck - a file
system check - of a disk partition used for storing a dirvish bank.
If you are debugging, you probably have some unclean shutdowns which
trigger fsck's for all mounted partitions on reboot.
The dirvish partition is INCREDIBLY crosslinked - each backup image
shares its unchanged files with earlier images through hard links - and
fsck tools are not well optimized for checking such things, so they take
a LONG time. An fsck of my dirvish partition had not finished after three
days. I've never let it run longer than that, so I don't know how much
longer it would need; probably weeks.
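
To get a rough feel for how crosslinked a bank is, one can count the
file names that share an inode with at least one other name. This is
only a sketch: count_hardlinked is a made-up helper name, and the count
is off for file names that contain newlines.

```shell
#!/bin/sh
# Print how many regular-file names under a directory have a link
# count greater than one, i.e. are hard-linked to some other name.
# Usage: count_hardlinked DIRECTORY
count_hardlinked() {
    # -links +1 matches files whose inode has two or more names.
    # tr strips the padding that some wc implementations add.
    find "$1" -type f -links +1 | wc -l | tr -d ' '
}
```

Run against a single vault it gives an idea of how many directory
entries fsck has to reconcile against shared inodes.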
I hope you have built your dirvish bank in its own exclusive partition.
That allows you to modify the fstab so that it does not automatically
mount at boot time. The easiest thing is just to comment that partition
out of the /etc/fstab table. It is better to leave the dirvish drive
unmounted anyway, and only mount it during the runscript you have
wrapped around dirvish-runall.
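
As an alternative to commenting the line out, the fstab entry can stay
in place with the "noauto" option and an fsck pass number (sixth field)
of 0, which the boot-time "fsck -A" skips. The sketch below assumes the
bank is mounted at /backup from /dev/sdb1; both names are made up for
illustration, as are the helper names fstab_noauto and run_backups.

```shell
#!/bin/sh
# Print a copy of an fstab file in which the entry for one mount point
# gets the "noauto" option and an fsck pass number of 0.
# Nothing is written back; review the output before installing it.
# Usage: fstab_noauto FSTAB_FILE MOUNT_POINT
fstab_noauto() {
    awk -v mp="$2" '
        # Leave comments and blank lines untouched.
        /^[ \t]*(#|$)/ { print; next }
        $2 == mp {
            # Add noauto unless it is already in the option list.
            if ($4 !~ /(^|,)noauto(,|$)/) $4 = $4 ",noauto"
            # Fill in the dump field if the entry omitted it.
            if (NF < 5) $5 = 0
            # Pass number 0 means "never fsck at boot".
            $6 = 0
            # Note: the edited line is re-joined with single spaces.
            print; next
        }
        { print }
    ' "$1"
}

# Mount the bank, run the backups, and unmount again afterwards.
# Set DRY_RUN=echo to print the commands instead of running them.
# Usage: run_backups MOUNT_POINT
run_backups() {
    ${DRY_RUN:-} mount "$1" || return 1
    ${DRY_RUN:-} dirvish-runall
    status=$?
    ${DRY_RUN:-} umount "$1"
    return "$status"
}
```

For example, "fstab_noauto /etc/fstab /backup > /tmp/fstab.new" writes
a candidate file to inspect, and "run_backups /backup" could be the body
of the nightly cron job.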
If the dirvish bank got put on a partition with other essential stuff,
I suggest that you find some way to transfer that essential stuff onto
some other partition with a "cp -a". Then rename the dirvish partition,
and then let dirvish own it after that. This is a lot easier if the
drive being worked on is mounted as a second drive on a system booted
from a different drive (like a Knoppix CD).
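
A minimal sketch of that kind of transfer, assuming both partitions are
already mounted somewhere (for example under /mnt from a rescue CD).
copy_tree is a made-up helper name, and the paths in the usage note are
placeholders.

```shell
#!/bin/sh
# Copy the contents of one directory tree into another, preserving
# ownership, permissions, timestamps and symlinks (cp -a).
# Usage: copy_tree SOURCE_DIR DEST_DIR
copy_tree() {
    # The trailing "/." copies the contents of SOURCE_DIR, including
    # dot files, rather than creating SOURCE_DIR inside DEST_DIR.
    cp -a "$1"/. "$2"/
}
```

Something like "copy_tree /mnt/old/home /mnt/new/home" would move the
essential stuff; check the copy before deleting anything from the
original partition.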
Tangentially, I had to do something very much like that with my
wife's laptop drive a few days ago. Some weird ext3 journal corruption
was causing errors on the root file system "/" that would make it drop
into read-only mode. I eventually ended up copying
all of / onto another partition, blowing away the / partition on that
drive, then doing some badblocks scans followed by a mkfs to build
a virgin file system. I then copied everything back with a similar
cp -a, then re-initialized tripwire (old system, don't ask) so it would
not complain about all the zillions of symlinks that got new dates. That
made all those system errors go away, and the machine is doing fine now.
Up in Anacortes, you probably got all the same wind that we got here
in Portland, with brief power interruptions causing RAM corruption and
some junky stuff written to the disk. The good news is that you can
probably do a pretty good wipe and rebuild using your dirvish backups,
though I would suggest doing it to a different target drive.
If that doesn't help, I hope it at least causes some fruitful lateral
thinking. Good luck, and see you at Linuxfest NW!
Keith Lofstrom keithl at keithl.com Voice (503)-520-1993
KLIC --- Keith Lofstrom Integrated Circuits --- "Your Ideas in Silicon"
Design Contracting in Bipolar and CMOS - Analog, Digital, and Scan ICs