Difference between revisions of "Create SRT Subtitles From MythTV Recordings"

From MythTV Official Wiki
m (Added ccextractor to subtitle tools list. No detailed instructions yet.)
(Added some ccextractor notes)
Revision as of 18:12, 2 July 2007

Note: the following steps involve Windows-only programs, except ccextractor, which is available for Linux, Mac and Windows (see below).

If you receive digital DVB broadcasts (such as with a UK Freeview type TV card) and are able to view subtitles in MythTV for your recordings, you can also create many different types of standalone subtitle files from them (most notably .srt files, which can be read by many DVD and PC players). This is useful if you wish to transcode your recordings and don't keep the original MythTV recorded file. Unfortunately, it takes a little work and perseverance.


Tools:

*[http://www.divx-digest.com/software/subrip.html SubRip] (Win)
*[http://www.urusoft.net/download.php?lang=1&id=sw Subtitle Workshop] (Win)
*[http://ccextractor.sourceforge.net/ ccextractor] (Win / Lin / Mac)

1. First, you need to find the file for your recording. This will have been recorded to the location you chose during MythTV setup. It will have a name something like "1002_20061008220300.mpg" but with the correct date and time. The best way to find it is to just watch through some of the latest recordings until you find the correct one.
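
If you already know roughly when the show was recorded, the file name itself can narrow the search. The sketch below assumes every recording follows the pattern of the example name, a channel ID, an underscore, and a YYYYMMDDhhmmss start time; the function names are made up for illustration and the directory you pass in is whatever you chose during MythTV setup.

```python
from datetime import datetime
from pathlib import Path


def parse_recording_name(filename):
    """Split a name like '1002_20061008220300.mpg' into (channel id, start time)."""
    stem = Path(filename).stem                      # '1002_20061008220300'
    chan_id, _, stamp = stem.partition("_")
    start = datetime.strptime(stamp, "%Y%m%d%H%M%S")
    return int(chan_id), start


def find_recordings(directory, day):
    """Return the .mpg files in `directory` whose start time falls on `day` (a date)."""
    matches = []
    for path in sorted(Path(directory).glob("*.mpg")):
        try:
            _, start = parse_recording_name(path.name)
        except ValueError:
            continue                                # name does not follow the assumed pattern
        if start.date() == day:
            matches.append(path)
    return matches
```

For example, `find_recordings("/path/to/recordings", date(2006, 10, 8))` (with `date` imported from `datetime`) would list only the files that started recording on 8 October 2006.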

2. (only with ccextractor). ccextractor generally does a good job of detecting file formats and usually doesn't need much help. In most cases you just run:

ccextractor -srt inputfile

This produces an .srt file (a subtitle text file in SubRip format).

There are a lot of options you can use to modify the output (fixing case to prevent all caps, adjusting timing, etc). Run ccextractor without parameters for complete help.

Be aware that closed captions are not stored in the same format by all cards. ccextractor is known to work well with digital captures (transport streams), analog captures (this part is in fact based on MythTV's code) and a few others, but it is still a work in progress.
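
If you want to run ccextractor over several recordings, a small wrapper can help. This is only a sketch: the `-srt` option comes from the command above, while the `-o` output-file option is an assumption you should check against the help text of your ccextractor version. The function names are hypothetical.

```python
import subprocess


def build_ccextractor_cmd(input_file, output_file=None):
    """Build the argument list for 'ccextractor -srt inputfile'."""
    cmd = ["ccextractor", "-srt", input_file]
    if output_file is not None:
        # Assumption: '-o' names the output file; verify with ccextractor's own help.
        cmd += ["-o", output_file]
    return cmd


def extract_srt(input_file, output_file=None):
    """Run ccextractor; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_ccextractor_cmd(input_file, output_file), check=True)
```

Passing the arguments as a list rather than one string avoids quoting problems with file names that contain spaces.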


2. (without ccextractor). You now need to de-mux the recording (separate it into its component streams) using a program called ProjectX. If you are on Windows, use the [http://www.videohelp.com/tools?tool=ProjectX ProjectX compiled with Java] version. Run ProjectX, choose File > Add, and add the .mpg recording of the show you want subtitles from. You should then be able to click the QuickStart button and wait for it to finish. If that doesn't work, click the "prepare >>" button, click the button with a picture of an "i", and look through the info for the code next to the Subtitles heading, something like 0x255. In the main window on the left you'll see a list of 3 or 4 similar codes. Remove every code other than the subtitle one by double-clicking it, then click the QuickStart button again.


3. The end result should be, among other things, a file called "1002_20061008220300.sup" or similar in the same directory that the original recording is stored in. You can safely delete all the other newly created files, but keep the .sup file - this contains the subtitles.


4. Now, switch over to Windows if you haven't already. You need a program called DVD Subtitle Decoder, which comes with a package called DVD Subtitle Tools. Download the latest version and save it somewhere on your PC, then copy your .sup file into that same directory. Open a command prompt, change to the directory containing DVD Subtitle Tools, and run

DVDSupDecode.exe -bitmap -pal 1002_20061008220300.sup

Change -pal to -ntsc if you need to, and use the correct name of the .sup file.
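
As a rough illustration of how the command is put together, the sketch below derives the .sup name from the recording name and builds the argument list with either -pal or -ntsc. It uses only the options shown above; the helper names are hypothetical.

```python
from pathlib import Path


def sup_name_for(recording):
    """'1002_20061008220300.mpg' -> '1002_20061008220300.sup'."""
    return Path(recording).with_suffix(".sup").name


def build_supdecode_cmd(sup_file, video_standard="pal"):
    """Build the DVDSupDecode.exe argument list for a PAL or NTSC recording."""
    if video_standard not in ("pal", "ntsc"):
        raise ValueError("video_standard must be 'pal' or 'ntsc'")
    return ["DVDSupDecode.exe", "-bitmap", "-" + video_standard, sup_file]
```

The resulting list can be handed to `subprocess.run(..., check=True)` from inside the DVD Subtitle Tools directory.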


5. You will now have a huge number of .bmp files and a 1002_20061008220300.txt (or similar) file. Put all of these in their own folder, which you can delete later. You can delete the .sup file now if you wish.
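
Moving hundreds of bitmaps by hand is tedious, so here is a minimal sketch of doing it with a script. It assumes the working directory contains no .bmp files other than the decoder's output, and the folder name `sup_images` is just an example.

```python
import shutil
from pathlib import Path


def collect_decoder_output(work_dir, stem, dest_name="sup_images"):
    """Move every .bmp file and '<stem>.txt' from work_dir into a subfolder.

    Returns the names of the files that were moved, in sorted order.
    """
    work_dir = Path(work_dir)
    dest = work_dir / dest_name
    dest.mkdir(exist_ok=True)
    moved = []
    for path in sorted(work_dir.iterdir()):         # list is built before anything moves
        is_bitmap = path.suffix.lower() == ".bmp"
        is_index = path.name == stem + ".txt"
        if path.is_file() and (is_bitmap or is_index):
            shutil.move(str(path), str(dest / path.name))
            moved.append(path.name)
    return moved
```

The .sup file is left where it is, so you can still decide separately whether to delete it.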


6. Run SubRip, click File > Open Image Sequence and choose the 1002_20061008220300.txt (or similar) file created by the DVDSupDecode program. SubRip will start asking you what the characters it encounters in the subtitles are. To start with, this is the really slow bit. If you've never used SubRip before, you'll need to define every single character it encounters, and many of them several times, because the different colouring used in DVB subtitles makes the same character look slightly different. After a while it will start recognizing characters correctly and you won't have to do anything. For some it will ask you to confirm a best guess. For some it will ask you to type in the whole sentence. Just keep with it and try not to enter anything wrongly.

Basically, it will feature a "best guess" at the top of the page. If this corresponds to the character or characters highlighted in red then click "use". If not, enter the correct character or characters in the "Fill in (these) character(s):" box exactly as they appear in the image above, and click OK. Repeat god knows how many times.


7. Once SubRip has analyzed the whole file, hit File > Save As. Hit the Save button (accepting default options), and save the file with the same name in the same folder as your transcoded MythTV recording. If it complains about non-unicode characters, choose to save it as a standard unicode file.


8. Now, as you don't wanna go through every other recording and have to type every single character in, you can save your progress of assigning characters to visual shapes in SubRip with the Character Matrix > Save Character Matrix File As option. Next time, open this and it'll make life a lot easier. You can also edit it if you find that it's done dumb stuff like set 0 as O or O as 0.


9. Possibly the most tedious part is getting the subtitles synced with the audio properly. You may wish to watch the original recording in MythTV with subs turned on first. This way you can make a note of a few key phrases that appear at a noticeable point in the recording, such as "Blah Blah line 1" appearing at the exact time the frame changes from the main titles to a shot of the main character, or whatever. Either that or you can just sync them however seems to fit best.

So, to Subtitle Workshop we go. Click File > Load Subtitle and open up the .srt subtitle you just saved. If it's in the same directory with the same name as the video file, that should open up too.

If you cut out commercials from the original recording then unfortunately these won't be cut out of the subtitles. You'll have to do that by hand.


10. First job, sync to start. Watch the movie in Subtitle Workshop and note the first line spoken, and the time it's spoken (to the left of the playback controls). Find that line in the subtitles below and delete everything before it by highlighting it and hitting delete. Now hit Edit > Timings > Set Delay. You'll likely want to choose the "-" option and "for all subtitles". Now type in the amount of time you wish to deduct from the first subtitle to get it to appear around the time the first line was spoken. You'll have to do this a number of times until it is to your liking. Watch the movie a little and see how / when the subs appear. If it seems OK then move on to the next bit:
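
To make it clearer what a delay applied to all subtitles actually does to the file, here is a minimal sketch that shifts every timestamp in .srt text by a fixed number of milliseconds. It is not how Subtitle Workshop is implemented, just the same arithmetic; it assumes standard hh:mm:ss,mmm timing lines, and the function names are made up.

```python
import re

TIME_RE = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def format_ms(total_ms):
    """Format milliseconds as an SRT timestamp, clamping negative values to zero."""
    total_ms = max(total_ms, 0)
    hours, rest = divmod(total_ms, 3600000)
    minutes, rest = divmod(rest, 60000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def shift_srt(text, delay_ms):
    """Add delay_ms (negative to make subtitles appear earlier) to every timing line."""
    def shift(match):
        h, m, s, ms = (int(group) for group in match.groups())
        return format_ms(((h * 60 + m) * 60 + s) * 1000 + ms + delay_ms)

    lines = []
    for line in text.splitlines():
        if "-->" in line:                           # only touch timing lines, not dialogue
            line = TIME_RE.sub(shift, line)
        lines.append(line)
    return "\n".join(lines)
```

Choosing "-" with a delay of 1 minute 5 seconds corresponds to `shift_srt(text, -65000)`.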


11. Cut out any commercial subtitles (find them by watching through the movie and seeing when the subtitles go out of sync again) and then re-sync everything after the cut-out part to bring it forward in time so that it lines up properly. Do this by highlighting everything after the text you removed and going to Edit > Timings > Set Delay, but choosing "for selected subtitles" this time.
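
The cut-and-resync logic can be sketched in a few lines. This assumes each subtitle is held as a (start_ms, end_ms, text) tuple and that you know where the commercial break started and ended in the original recording; the representation and function name are illustrative only.

```python
def cut_break(cues, break_start_ms, break_end_ms):
    """Drop subtitles that start inside the break and pull later ones forward.

    cues is a list of (start_ms, end_ms, text) tuples in playback order.
    """
    break_length = break_end_ms - break_start_ms
    result = []
    for start, end, text in cues:
        if break_start_ms <= start < break_end_ms:
            continue                                # subtitle belongs to the commercials
        if start >= break_end_ms:
            start -= break_length                   # same shift as a "-" delay
            end -= break_length                     # on the selected subtitles
        result.append((start, end, text))
    return result
```

With several breaks, process the last break first so that the times of earlier breaks still refer to the original recording.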


12. Once everything is synced correctly, you only have a couple more jobs to do. One is to edit any lines with any spelling mistakes or anything SubRip didn't recognize correctly. Recording O as 0 and ! as I are pretty common ones. The Search > Search tool will be of help with this.
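
A quick scan can point you at likely recognition errors before you start searching by hand. The patterns below are only heuristics based on the mix-ups mentioned above (0 for O, I for !); they will miss some errors and flag some correct text, so treat the result as a list of lines to look at.

```python
import re

SUSPECT_PATTERNS = [
    ("0 next to a letter", re.compile(r"[A-Za-z]0|0[A-Za-z]")),
    ("O next to a digit", re.compile(r"\dO|O\d")),
    ("I after a lowercase letter", re.compile(r"[a-z]I\b")),
]


def find_suspects(lines):
    """Return (line number, reason, line) for each line matching a suspect pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for reason, pattern in SUSPECT_PATTERNS:
            if pattern.search(line):
                hits.append((number, reason, line))
    return hits
```

Timing lines contain no letters, so they are never flagged by these patterns.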


13. You'll probably want to have a final look over the text and tidy up things like lines that are shown twice. In that case, delete the second line and extend the first so that it stays on screen until the second would have ended. In the UK, we also get lines like the following: "Hey what's your name?" followed by "Hey what's your name?|Charlie." That makes the "Charlie." part pop up under the first line at the time the second line is timed to appear. This can be OK at times, but personally I find it annoying when there's a lot of text doing it, and I try to remove it as much as possible by deleting the contents of the first line from the second.
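
Both clean-ups follow simple rules, so they can be sketched as code. This assumes subtitles are held as (start_ms, end_ms, text) tuples with "|" separating the lines of one subtitle, as in the example above; it is an illustration of the rules, not a feature of Subtitle Workshop.

```python
def tidy_cues(cues):
    """Merge repeated subtitles and strip text carried over from the previous one.

    cues is a list of (start_ms, end_ms, text) tuples in playback order.
    """
    result = []
    for start, end, text in cues:
        if result:
            prev_start, _prev_end, prev_text = result[-1]
            if text == prev_text:
                # Shown twice: keep the first and let it last as long as the second.
                result[-1] = (prev_start, end, prev_text)
                continue
            carried_over = prev_text + "|"
            if text.startswith(carried_over):
                # "Hey what's your name?|Charlie." becomes "Charlie."
                text = text[len(carried_over):]
        result.append((start, end, text))
    return result
```

Whether you want the second rule at all is a matter of taste, so it may be worth reviewing the result rather than applying it blindly.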


14. When you're finally done, hit File > Save and you're good to go.


15. Final task - you may want to watch through the whole show (burn it to CD or whatever and watch it on tv) with the subs on and note any places where the subs get messed up, any misspellings, and anything you missed while editing in Subtitle Workshop. You can usually fix this in Subtitle Workshop by either editing the text or changing the timings a little. After this, you should have a perfect recording.


Ok, ok, that takes forever I know. But if it's your all time favourite movie and your Dad no longer hears too well, then it's worth it.

If you don't use SRT subtitles then SubRip or Subtitle Workshop should allow you to create subtitles in a variety of different formats.