[mythtv] Storage Groups functionality

f-myth-users@media.mit.edu f-myth-users at media.mit.edu
Mon Feb 6 09:38:48 UTC 2006


    > Date: Sun, 5 Feb 2006 23:33:24 -0500 (EST)
    > From: "Chris Pinkham" <cpinkham at bc2va.org>
		      
    > Do you have any specific ideas on how archiving would
    > integrate with Storage Groups specifically?

I do, since this is something I'm looking at right this second; maybe
my solution will give people some ideas.  I'm (going to be) archiving
large amounts of data off Myth for later use in what amounts to a tape
robot implemented with DVDs.  The data's just being written as data
files, not mastered in some way that a standalone DVD player could
play, since we're not really "watching" the data in a conventional way.

One unsolved problem:  It'd be really, really nice to be able to take
an archived program (which has presumably been deleted from "recorded"
because it's been deleted from the UI) and reinsert all the relevant
table data again once the video bits have been restored to disk.  Is
it sufficient to simply save the relevant row from the "recorded"
table and reinsert it?  (I'm assuming that mythcommflag --rebuild will
also have to be rerun on the video to make seeking work again.)  Is
there anywhere else (e.g., other tables) that the info would need to
be reinserted?  I'm poking around trying to establish that right now,
but if somebody knows, it could save me some time.  [The idea is to be
able to run the commflagger & refind all the commercials, and
otherwise treat it as if it had just been recorded, rather than
relegating the file to second-class status in the "external videos"
part of Myth.]
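
A minimal sketch of the save-and-reinsert mechanics, in Python against
an in-memory SQLite database rather than Myth's real MySQL one.  The
cut-down "recorded" schema, the sample row, and the helper names
save_row/restore_row are all hypothetical.  It only shows how one row
could be round-tripped; it says nothing about whether other tables
would need the same treatment.

```python
import json
import sqlite3


def save_row(conn, table, where_sql, params):
    """Return one matching row of `table` as a JSON string keyed by column name."""
    # `table` and `where_sql` are interpolated directly, so they must come
    # from trusted code, never from user input.
    cur = conn.execute(f"SELECT * FROM {table} WHERE {where_sql}", params)
    row = cur.fetchone()
    if row is None:
        raise LookupError("no matching row to save")
    columns = [d[0] for d in cur.description]
    return json.dumps(dict(zip(columns, row)))


def restore_row(conn, table, saved_json):
    """Reinsert a row previously produced by save_row()."""
    data = json.loads(saved_json)
    columns = ", ".join(data)
    marks = ", ".join("?" for _ in data)
    conn.execute(
        f"INSERT INTO {table} ({columns}) VALUES ({marks})",
        list(data.values()),
    )


# Hypothetical stand-in for the real table: only four columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recorded "
    "(chanid INTEGER, starttime TEXT, title TEXT, basename TEXT)"
)
conn.execute(
    "INSERT INTO recorded VALUES "
    "(1004, '2006-02-05 20:00:00', 'Example Show', 'example.nuv')"
)

key = (1004, "2006-02-05 20:00:00")
saved = save_row(conn, "recorded", "chanid = ? AND starttime = ?", key)

# Simulate deletion from the UI, then restore from the saved copy.
conn.execute("DELETE FROM recorded WHERE chanid = ? AND starttime = ?", key)
restore_row(conn, "recorded", saved)
restored = conn.execute("SELECT * FROM recorded").fetchall()
```

The saved JSON string could be written to the DVD next to the video
file, so that the row travels with the bits it describes.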

Anyway:

My strategy is to add a column to the oldrecorded table, which stores
a sequence of DVD numbers corresponding to which DVD(s) each recording
got written to.  (Generally, this is only one, but large recordings
might get split across DVDs, and in general it might make sense to
split across DVDs so as not to waste some amount of space at the end
of each one.)  Since this isn't part of Myth's actual code, I'm
calling the column "_archived" on the assumption that no actual column
name will ever begin with an underscore.  The column's field type is
VARCHAR(128) NOT NULL and it holds a set of small integers,
corresponding to unique DVD numbers.  (Right now, that sequence of
numbers is generated externally from Myth, and the data is inserted
into the table by the script that burns the DVDs; if at some point I
might need to backtranslate from DVD number to what's supposed to be
on it, I'll make another table indexed by DVD number instead [but then
I'd have to have some unambiguous way of referring to individual
programs, hence my earlier question of whether Myth had such a way;
I'll probably wind up using channum + starttime to do so].  Such a
table would use autoincrement so that the mysql DB is responsible for
ensuring that we don't duplicate a DVD sequence number, but for the
moment I can just treat the contents of _archived as a set of ints
that will be stored and parsed by a very simple external script.  [If
there were some plausible way to tell mysql "set of ints" I'd do that
instead of storing a bunch of ints as a single character string in the
varchar, but this'll work for now.])
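
For illustration, the "very simple external script" side of this might
look something like the following Python sketch.  The function names
are hypothetical, and the space-separated format (with commas also
tolerated on input) is an assumption, since nothing above pins down a
separator.  An empty string is taken to mean "not archived yet", which
fits the NOT NULL column.

```python
MAX_LEN = 128  # matches the VARCHAR(128) column width


def parse_archived(value):
    """Turn the _archived string (e.g. "12 13") into a set of DVD numbers."""
    return {int(tok) for tok in value.replace(",", " ").split()}


def format_archived(dvd_numbers):
    """Turn a set of DVD numbers back into a sorted, space-separated string."""
    return " ".join(str(n) for n in sorted(dvd_numbers))


def add_dvd(value, dvd_number):
    """Return the new _archived string after recording one more DVD number."""
    numbers = parse_archived(value)
    numbers.add(dvd_number)
    text = format_archived(numbers)
    if len(text) > MAX_LEN:
        raise ValueError("too many DVD numbers for VARCHAR(128)")
    return text
```

Because the value is treated as a set, recording the same DVD number
twice leaves the string unchanged.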

People who actually -author- DVDs and try to make sure content
actually fits on a single DVD might use such a column with a real
int (probably smallint) datatype instead, but OTOH such people might
be calling their DVDs by titles ("My Vacation") instead, so a varchar
might work for them, too.

(It'd also be nice if "oldrecorded" had the same "syndicatedepisodenumber"
column that "recordedprogram" did!  Right now I'm having to kluge that sort
of thing after the fact...)

P.S.  A random detail:

I've also added a second column called "_par".  Every set of 5 DVDs we
burn has an accompanying PAR2 archive consisting of enough recovery blocks
to completely recover any one DVD's contents.  This gives us some
resiliency against a scratch taking out a file or some worse insult
taking out an entire DVD (including, of course, simply -losing- the
entire DVD... :)  Note that calculating a PAR2 archive across the
~23gig of data involved is fairly slow; the sweet spot on filesystem
buffering vs PAR2 memory-handling turns out to be "par2 -m64" on this
hardware (AMD 2800+) and generating the data takes about six hours.
This inflates the number of burned DVDs by 20%, but eliminates a
large number of possible problems with archiving this data for perhaps
years, and the cost of using DVDs this way is around 10-20% of what
using spinning hard disks would be for this enormous amount of data.
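
The arithmetic behind the 20% figure, as a small Python sketch.  It
assumes that each group of five data DVDs gets exactly one extra DVD
for its PAR2 archive, that "~23gig" is the combined size of the five
data DVDs, and that an incomplete group gets no parity DVD yet; the
names are hypothetical.

```python
DVDS_PER_SET = 5         # data DVDs covered by one PAR2 archive
PARITY_DVDS_PER_SET = 1  # assumed: the PAR2 archive fills one extra DVD
SET_SIZE_GB = 23.0       # the "~23gig" figure, read as the size of one set

# Extra discs burned, as a fraction of the data discs: 1/5 = 20%.
overhead = PARITY_DVDS_PER_SET / DVDS_PER_SET

# Data per DVD, and hence the recovery data needed to rebuild any one DVD.
gb_per_dvd = SET_SIZE_GB / DVDS_PER_SET


def discs_burned(data_dvds):
    """Total discs burned for a given number of data DVDs.

    Only complete groups of DVDS_PER_SET get a parity disc here.
    """
    full_sets = data_dvds // DVDS_PER_SET
    return data_dvds + full_sets * PARITY_DVDS_PER_SET
```

So 100 data DVDs would mean 120 discs burned in total under these
assumptions.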

[Soon there will be another column called "_cc" (type text) which will
hold the closed-captioning data for the program, if any, to enable
searching it for programs mentioning certain terms.  Obviously, this
entire scheme depends on -nothing- in Myth itself ever deleting
something from "oldrecorded"; if such a thing can happen, I'll have
to make an entirely separate table so Myth keeps its mitts off it.]

