[mythtv] [patch] duplicate checking using programid

Mon May 10 15:17:10 EDT 2004

David Engel wrote:
> On Sun, May 09, 2004 at 06:12:03PM -0700, Bruce Markey wrote:
> 
>>I had a brainstorm this morning (or was it a light drizzle?) that
>>SH%0000 are not dups but all other matching programids are dups.
>>This will do the right thing for generics. Specials should be
> 
> 
> What programid format will specials and other non-series programs
> have?

SHXXXXXX0000

>  In my program table, the only things that don't have a seriesid
> are movies 'MV%' and sports 'SP%'.

Correct.

> If only series programs use the 'SH%0000' format,

Unfortunately, no. That is the problem.

An episode of a series:

seriesid=SH123456 programid=EP1234560001

An undeclared episode of the same series:

seriesid=SH123456 programid=SH1234560000

A single episode special:

seriesid=SH654321 programid=SH6543210000

Notice that the generic episode of the series and the special
are in the same format. In fact, the only reason I can infer
that the second item is part of a series is that the same series
has an "EP%". I believe this is the reason they even have seriesid:
so that you can know that "EP123456%" and "SH123456%" are the
same title and can be easily matched up. Even here, I think they
still got it wrong. The series ID should be "EPXXXXXX" so we could
distinguish an EPisodic series from a "SHXXXXXX" single episode SHow.
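
To make the matching concrete, here is a rough sketch in plain
standard C++ (not MythTV code). The helper names seriesDigits,
sameSeriesDigits, isGenericShow and checkThreadExamples are made up
for illustration, and it assumes programids are always 12 characters
with the series digits in positions 3 through 8.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: the six characters that follow the two-letter
// prefix. Returns an empty string if the id is not 12 characters long.
std::string seriesDigits(const std::string &programid)
{
    if (programid.size() != 12)
        return "";
    return programid.substr(2, 6);
}

// Hypothetical helper: true when both ids carry the same series digits,
// e.g. an "EP" episode and the "SH" generic entry of the same series.
bool sameSeriesDigits(const std::string &a, const std::string &b)
{
    const std::string digits = seriesDigits(a);
    return !digits.empty() && digits == seriesDigits(b);
}

// Hypothetical helper: true for "SH" + six characters + "0000". This
// matches a generic episode and a one-off special alike, which is
// exactly the ambiguity described above.
bool isGenericShow(const std::string &programid)
{
    return programid.size() == 12 &&
           programid.compare(0, 2, "SH") == 0 &&
           programid.compare(8, 4, "0000") == 0;
}

// Checks against the example ids used in this message.
void checkThreadExamples()
{
    assert(sameSeriesDigits("EP1234560001", "SH1234560000"));
    assert(isGenericShow("SH1234560000"));
    assert(!isGenericShow("EP1234560001"));
}
```

Nothing in isGenericShow can tell a generic episode from a special.
That would need a lookup for some "EP" entry with the same series
digits, which is left out here.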

> I can simplify the duplicate check by tightening the
> programid check to 'SH%0000' and dropping the seriesid check.

That's correct. For the purposes of duplicate checking, the
seriesid does not need to be used.

I think you can drop the kDupCheckIdOnly also. If both items
have programids then the descriptive fields were filled in from
records of the exact same database table. Either the programids
match or they don't and any further string matching can only give
false results.

If one or both of the items does not have programid then the
string matching methods must be used.

Here's an untested example of what I think IsSameProgram should
be. The same logic would need to apply to the SQL for checking
recorded and oldrecorded.

bool ProgramInfo::IsSameProgram(const ProgramInfo& other) const
{
    if (title != other.title)
        return false;

    if (rectype == kFindOneRecord)
        return true;

    if (dupmethod & kDupCheckNone)
        return false;

    // Generic episodes and specials (SH......0000) are never duplicates.
    if (programid.contains(QRegExp("^SH.*0000$")))
        return false;

    // When both items have a programid, it alone decides the match.
    if (programid != "" && other.programid != "")
        return programid == other.programid;

    // if (dupmethod & kDupCheckIdOnly)
    //    return false;

    if ((dupmethod & kDupCheckSub) &&
        ((subtitle == "") ||
         (subtitle != other.subtitle)))
        return false;

    if ((dupmethod & kDupCheckDesc) &&
        ((description == "") ||
         (description != other.description)))
        return false;

    return true;
}

--  bjm

XTVDSchemaDefinition.pdf says:

Data Location: xtvd/schedules/schedule/@id

Value: Field length minimum 12, maximum 12 (e.g. MV1234560000,
SH0123450000)

Definition: Unique description identifier necessary to reference
movies, shows, episodes, sports from programs data. First two
digits are alphanumeric and correspond to movies (MV), shows (SH),
episodes (EP) and sports (SP).

For shows beginning with EP, the next 6 digits represent the series
ID, with the last 4 digits representing the episode id. If episode
information is not available, the program will appear as type SH,
the next 6 digits as the series id and the last 4 digits as zeros.
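
Read literally, that definition could be turned into a small parser
along the following lines. This is only a sketch in plain standard
C++. ProgramKind, ParsedProgramId, parseProgramId and
checkSchemaExamples are invented names, and the split into a
six-character middle and a four-character tail is only documented
above for EP and SH ids; applying it to MV and SP is an assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical type: the four prefixes named in the schema text.
enum class ProgramKind { Movie, Show, Episode, Sports, Unknown };

// Hypothetical result of splitting a 12-character programid.
struct ParsedProgramId
{
    ProgramKind kind = ProgramKind::Unknown;
    std::string middle;   // characters 3-8 (series id for EP and SH)
    std::string tail;     // characters 9-12 (episode id for EP)
    bool valid = false;   // false for a wrong length or unknown prefix
};

ParsedProgramId parseProgramId(const std::string &id)
{
    ParsedProgramId out;
    if (id.size() != 12)          // schema: minimum 12, maximum 12
        return out;

    const std::string prefix = id.substr(0, 2);
    if (prefix == "MV")
        out.kind = ProgramKind::Movie;
    else if (prefix == "SH")
        out.kind = ProgramKind::Show;
    else if (prefix == "EP")
        out.kind = ProgramKind::Episode;
    else if (prefix == "SP")
        out.kind = ProgramKind::Sports;
    else
        return out;               // unknown prefix, leave invalid

    out.middle = id.substr(2, 6);
    out.tail = id.substr(8, 4);
    out.valid = true;
    return out;
}

// Checks against the two example ids quoted from the schema.
void checkSchemaExamples()
{
    const ParsedProgramId movie = parseProgramId("MV1234560000");
    assert(movie.valid && movie.kind == ProgramKind::Movie);

    const ParsedProgramId show = parseProgramId("SH0123450000");
    assert(show.valid && show.kind == ProgramKind::Show);
    assert(show.middle == "012345" && show.tail == "0000");
}
```

Keeping the middle and tail as strings rather than integers preserves
leading zeros, as in the SH0123450000 example.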