[mythtv-users] NFS suggestions under heavy load?
f-myth-users at media.mit.edu
Mon Jan 9 07:14:43 UTC 2006
I suspect that the answer to this question is, "Don't do that, then,"
but in case anyone has any suggestions:
I was doing a load test. Configuration is MythTV 0.18.1, with a master
backend (MBE) with five PVR-250s and a slave backend (SBE), with one
PVR-350, that NFS-mounts the MBE's recording directory.
CPUs are AMD 2800+s, 512MB RAM, 100baseT ethernet through a hub;
disks are Seagate 7200.7 200GB PATA (one per CPU); filesystem for the
non-recordings is ext3fs. (Recordings use JFS.) I set up test
recordings on all 6 tuners simultaneously, then started a dirvish
backup of the entire MBE's filesystem, -excluding- the actual video
directory and things like /dev and /proc and so forth; I was doing
this in part to set up the initial dirvish vault for keeping the
machine backed up. (For those who've never heard of dirvish, this
amounts to running rsync on the entire filesystem and schlepping the
results to a third machine.)
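
For anyone unfamiliar with dirvish, the kind of rsync invocation it wraps
can be sketched roughly as below. This is an illustration only, not
dirvish's actual command line: the host name `backuphost`, the vault path,
and the recordings path are hypothetical placeholders, and the choice of
flags is an assumption.

```python
def build_rsync_argv(source_root, dest, excludes):
    """Return an rsync argument list that copies source_root to dest,
    skipping every path listed in excludes."""
    # -a: archive mode; --numeric-ids: keep uid/gid numbers as-is;
    # --delete: remove files from dest that no longer exist in the source.
    argv = ["rsync", "-a", "--numeric-ids", "--delete"]
    # One --exclude per path; a leading slash anchors the pattern
    # at the root of the transfer.
    argv += [f"--exclude={path}" for path in excludes]
    argv += [source_root, dest]
    return argv


# Hypothetical values: the real recordings path and backup host differ.
EXCLUDES = ["/dev", "/proc", "/path/to/recordings"]
ARGV = build_rsync_argv("/", "backuphost:/vault/mbe/tree/", EXCLUDES)

if __name__ == "__main__":
    print(" ".join(ARGV))
```

The function only builds the argument list; nothing is executed, so it is
safe to run anywhere to inspect the resulting command.
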
About 1 minute into the transfer (and about 6 minutes into the
recording), the recording on the 350 (i.e., the SBE machine) was
corrupted for a few seconds, which threw off its audio sync for
the rest of the recording and then crashed the 350 itself when I
played back that particular stream a few hours later (more about that
in a message to ivtv-users). The slave logged three complaints across
300ms in its kernel log that the master's NFS server timed out while
it was making that recording, so I'm hardly surprised that the
recording had problems. Interestingly, it only glitched that one
time; I'm not sure why.
I have real-time commflagging turned on, and the test recordings
started about 5 minutes before the rsync, and most of the commflaggers
run on the slave, so that added to both disk and network contention.
I suppose in retrospect it's not surprising that something hiccuped,
and that -was- the point of the test... (Though OTOH the commflagger
waits several minutes [exactly 5? more?] before starting, so maybe it
wasn't that. OTGH, I have a 10s job-queue check interval, so 5 of
them could have started within a 50-second window once enough
recording had accumulated.)
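
The arithmetic behind that 50-second window can be sketched as follows.
It assumes, hypothetically, that the job queue starts at most one pending
commflag job per check; whether MythTV actually behaves that way is not
confirmed here.

```python
def job_start_offsets(num_jobs, check_interval_s):
    """Seconds (relative to the first check) at which each queued job
    starts, assuming exactly one job is started per queue check."""
    return [i * check_interval_s for i in range(num_jobs)]


# Five commflag jobs with a 10 s job-queue check interval.
OFFSETS = job_start_offsets(5, 10)

if __name__ == "__main__":
    print(OFFSETS)
```

Under that assumption the fifth job starts 40 seconds after the first, so
all five are running within a 50-second window.
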
The question is, can I do better? I don't imagine I'll often be
rsyncing the entire disk, but I will certainly be rsyncing the delta
since the last rsync, and I'd rather not have to worry about that
disrupting recordings. (That's a bit harder to test under load, since
the deltas will vary, but I'll attempt it to ensure that rsync's
scanning phase isn't enough to cause disruption.) Current NFS mount
options are rsize and wsize of 8192; would increasing these help on a
100baseT hub, or just lead to more fragment reassembly? (Actually,
the hub itself is an SMC 8508T gigabit hub with jumbo packet support,
but the NICs on the CPUs are only 100 megabit.) Other NFS options are
soft & nfsvers=3, which are pretty standard.
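
To put rough numbers on the fragment-reassembly worry, here is a small
sketch of how many IP fragments one NFS read or write would need. It
assumes the mount runs over UDP (not stated above) and uses a guessed
figure of 128 bytes for RPC/NFS header overhead; the function and constant
names are made up for illustration.

```python
import math

IP_HEADER = 20      # bytes, IPv4 header without options
UDP_HEADER = 8      # bytes
RPC_OVERHEAD = 128  # bytes, ASSUMED rough size of the RPC/NFS headers


def fragments_per_rpc(payload_bytes, mtu=1500):
    """Number of IP fragments needed to carry one NFS-over-UDP request
    or reply holding payload_bytes of file data."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    frag_payload = (mtu - IP_HEADER) // 8 * 8
    # The UDP header travels inside the fragmented IP payload.
    datagram = payload_bytes + UDP_HEADER + RPC_OVERHEAD
    return math.ceil(datagram / frag_payload)


if __name__ == "__main__":
    for size in (8192, 16384, 32768):
        print(size, fragments_per_rpc(size), fragments_per_rpc(size, mtu=9000))
```

With a standard 1500-byte MTU this gives 6 fragments for an 8192-byte
transfer and 23 for 32768 bytes; with 9000-byte jumbo frames the 8192-byte
case fits in a single packet. Over UDP, losing any one fragment means the
whole request has to be retransmitted, so larger sizes raise the cost of
each lost packet.
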
Would increasing ivtv's buffer sizes on the slave help? Or would I be
screwed anyway 'cause the buffers are already flushed by the time the
NFS timeout manifests?
P.S. Is it feasible to let the slave record on its own filesystem
instead of on the NFS-mounted master's? I suspect that there's no way
to seamlessly play, commflag, or transcode if recordings
might be split across multiple filesystems, but if somebody has a good
idea, I'm willing to listen...