Thanks for the detailed response.<div><br><div class="gmail_quote">On Sat, Oct 1, 2011 at 8:16 AM, Raymond Wagner <span dir="ltr">&lt;<a href="mailto:raymond@wagnerrp.com">raymond@wagnerrp.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div class="im">On 10/1/2011 01:12, Daniel Osborne wrote:<br>
> Is there a way to ignore punctuation in the metadata grabber?<br>
><br>
> For example, I have my movies organized as such:<br>
> Star Trek - First Contact.mkv<br>
> Star Trek - Generations.mkv<br>
<br>
</div>Correct the punctuation and use colons instead.<br>
<div class="im"><br>
> Now, the actual movie contains a colon (see:<br>
> <a href="http://www.themoviedb.org/movie/199" target="_blank" class="vt-p">http://www.themoviedb.org/movie/199</a>), but I obviously can't use that<br>
> as a character (interoperability with Windows).<br>
<br>
</div>Sure you can. Use the 'mangled map' parameter to have Samba convert<br>
between the two.<br>
<div class="im"><br></div></blockquote><div>By interoperability with Windows, I really mean that I do a lot of my file management from Windows. How would name mangling work if I wanted to rename a file from Windows? I know it won't let me type a colon, but could I reverse-map some other character to a colon?</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im">
> I know that I can manually edit the title metadata to remove the<br>
> hyphen then refetch, and it successfully works. However, I'd like the<br>
> ability to ignore punctuation in Mythvideo automatically.<br>
<br>
</div>That would be the incorrect way to do things, as many movies actually<br>
have punctuation in their titles, including hyphens. Take Wall-E for<br>
example. "Wall-E" and "Wall·E" work fine with TMDb. Meanwhile, "WallE"<br>
returns a movie called "Walled In", and "Wall E" simply faults.<br>
<br></blockquote><div>In your example, "Wall E" works for me, at least when I run the tmdb script directly; I'm not sure whether Myth itself has a problem with it, though.</div><div>/usr/share/mythtv/metadata/Movie/tmdb.py -l en -M "Wall E"</div>
<div><div>&lt;?xml version='1.0' encoding='UTF-8'?&gt;</div><div>&lt;metadata&gt;</div><div> &lt;item&gt;</div><div> &lt;language&gt;en&lt;/language&gt;</div><div> &lt;title&gt;Wall·E&lt;/title&gt;</div>
<div> &lt;inetref&gt;10681&lt;/inetref&gt;</div></div><div>blah, blah, blah...</div><div><br></div><div>I would agree that WallE would (and should) fail; my suggestion isn't as simple as removing any and all punctuation.</div>
<div>In my head (too bad you can't read minds), I was thinking it would try the title both ways, as written and with the punctuation adjusted, since you are correct that TMDb doesn't consider a hyphen close to a colon, and then pick the closest match from the results.</div>
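<div>To make that concrete, here is a rough Python sketch of the idea, not anything that exists in MythTV. It assumes the only rewrite tried is " - " to ": ", that closeness is measured by plain edit (Levenshtein) distance, and that the candidate titles have already been fetched somehow. The function names (levenshtein, title_variants, closest_match) are made up for illustration.</div>
<pre>
```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def title_variants(title):
    """Return the title as written, plus a hyphen-to-colon variant if it differs."""
    variants = [title]
    swapped = title.replace(" - ", ": ")
    if swapped != title:
        variants.append(swapped)
    return variants


def closest_match(title, candidates):
    """Pick the candidate closest to any variant of the title.

    `candidates` is a hypothetical stand-in for the titles returned by the
    grabber for all variants combined. Returns (candidate, distance), or
    None when there are no candidates.
    """
    best = None
    for candidate in candidates:
        distance = min(levenshtein(variant.lower(), candidate.lower())
                       for variant in title_variants(title))
        if best is None or best[1] > distance:
            best = (candidate, distance)
    return best
```
</pre>
<div>With this, a title like "Wall-E" is left alone because it has no " - " in it, while "Star Trek - First Contact" gets a second variant with a colon. Whatever distance cutoff is applied afterwards is a separate question.</div>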
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
MythTV filters results based on Levenshtein distance, and the distance<br>
between your title and the correct title is two. By default, MythTV<br>
filters out anything above five, so as far as MythTV is concerned, it's a<br>
valid match. The tmdb.py script just passes the string on to the API,<br>
and lets the web API deal with it as it chooses. The problem is the<br>
TMDb API does not think a ": " and " - " are sufficiently close to<br>
return a match.<br>
<div class="im"><br>
> If the devs are interested in a patch, I could write up another one.<br>
<br>
</div>Since your proposed solution would solve your specific problem, but in<br>
the process cause others, it is not something we could accept.<br>
<div><div></div><div class="h5">
</div></div></blockquote></div><div><br></div><div><br><br><div class="gmail_quote">On Sat, Oct 1, 2011 at 8:46 AM, Raymond Wagner <span dir="ltr">&lt;<a href="mailto:raymond@wagnerrp.com">raymond@wagnerrp.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
<div class="im">On 10/1/2011 05:15, Daniel Osborne wrote:<br>> A little off topic though, I'd also like it to be able to take the<br>> year out of the filename (if any), and use that to match a specific<br>> remake. For example:<br>
> "Alice in Wonderland (2010)"<br><br></div>Take a look at the code that produces movie titles.<br><br><a href="https://github.com/MythTV/mythtv/blob/master/mythtv/libs/libmythmetadata/videometadata.cpp#L1036" target="_blank" class="vt-p">https://github.com/MythTV/mythtv/blob/master/mythtv/libs/libmythmetadata/videometadata.cpp#L1036</a><br>
<br>Right now, it simply truncates anything within any form of braces.<br>While I don't have the final say on parsing formats, something that<br>handled the specific regular expression "\(([0-9]{4})\)$" to parse out<br>
four digit years to allow later filtering of the results would likely be<br>acceptable. Note that due to how that function operates, with a<br>position response of "0" telling it to stop looping back through, the<br>
interface would have to be altered to support new values.<br><div><div></div><div class="h5"><br></div></div></blockquote><div>If that's an option, then I may dig into that code and see what I can do there.</div><div>
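<div>Just to pin down the behaviour I'd be aiming for, here is a small Python sketch of the year parsing. The real change would have to be made in the C++ in videometadata.cpp, so this only illustrates the pattern; the function name split_title_and_year is my own, and stripping whitespace before the parenthesis is an assumption on my part.</div>
<pre>
```python
import re

# A four-digit year in parentheses at the very end of the name,
# together with any whitespace directly in front of it.
YEAR_SUFFIX = re.compile(r"\s*\(([0-9]{4})\)$")


def split_title_and_year(name):
    """Return (title, year); year is an int, or None if no year suffix is found."""
    match = YEAR_SUFFIX.search(name)
    if match is None:
        return name, None
    return name[:match.start()].rstrip(), int(match.group(1))
```
</pre>
<div>So "Alice in Wonderland (2010)" would give the title "Alice in Wonderland" and the year 2010, while a name with no trailing year comes back unchanged with no year.</div>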
<br></div><div>Thanks!</div></div></div></div>