Why not register?


Post new topic Reply to topic  [ 7 posts ] 

All times are UTC [ DST ]

Author Message
PostPosted: Sat Jan 10, 2009 1:07 pm  Post subject: Perl script to fix SRT subtitles
Reply with quote
User avatar
Offline

The Devil, Probably
Joined: Fri Oct 26, 2007 3:40 pm
Posts: 2250
Location: Inside my body.
Perl script to fix SRT subtitles

I would like to share my Perl script, which fixes some common errors in SRT subtitles:
  • Removes empty subtitle entries
  • Removes leading and trailing spaces from the subtitle entries

Examples
Example input:
150
00:23:52,493 --> 00:23:55,493
One or the other, motherfucker.
One or the other!

151
00:24:42,943 --> 00:24:45,411

152
00:24:45,412 --> 00:24:47,246
I've already sent for backup.

Example result:
150
00:23:52,493 --> 00:23:55,493
One or the other, motherfucker.
One or the other!

151
00:24:45,412 --> 00:24:47,246
I've already sent for backup.

The script itself
Here's the Perl script:
#!/usr/bin/perl -w
# For Windows and ActivePerl change the line above to: #!perl

# -------1---------2---------3---------4---------5---------6---------7---------8
use strict; # force variable declarations
use Socket qw(:DEFAULT :crlf); # for portable newline processing

# Define string constants so you could translate them to your own language.
use constant USAGE_TEXT => "Usage: strchk [INPUT-FILE] [OUTPUT-FILE]";
use constant ERR_FILE_FORMAT => "Input file is not a text file!";
use constant ERR_WRITE_DENY => "You don't have privileges to write to: ";
use constant ERR_READ_DENY => "You don't have privileges to read from: ";
use constant ERR_DIRECTORY => "Output file is a directory!";
use constant ASK_OVERWRITE => "Output file exists!\nOverwrite? [y/N] ";
use constant MSG_EXIT => "Exiting.";
use constant CHAR_LOWER_YES => 'y';

# Retrieve input and output file name from the command line arguments.
my $input_file = shift @ARGV;
my $output_file = shift @ARGV;

# Exit and print usage info if one of the command line arguments is missing.
die USAGE_TEXT, "\n" if (not defined $input_file or not defined $output_file);
# Check whether the user has privileges to read the input file.
die ERR_READ_DENY, $input_file, "\n" if (not -r $input_file);
# Check whether the input file is a text file.
die ERR_FILE_FORMAT, "\n" if (not -T $input_file);
# Check whether the ouput file is a directory.
die ERR_DIRECTORY, "\n" if (-d $output_file);
# Check if output file is already present. If yes: ask whether to overwrite it.
if (-e $output_file)
{
print ASK_OVERWRITE;
my $key = getc; # get user input

# Exit if user doesn't want to overwrite the output file.
die MSG_EXIT, "\n" if($key ne CHAR_LOWER_YES and $key ne uc(CHAR_LOWER_YES));
# Check whether the user has privileges to create the output file.
die ERR_WRITE_DENY, $output_file, "\n" if (not -w $output_file);
}

# Processing of subtitle file starts here ...
open INPUT_FILE_HANDLE, "<", $input_file; # open input file for read access
open OUTPUT_FILE_HANDLE, ">", $output_file # open output file for write access
or die ERR_WRITE_DENY, $output_file, "\n";

my $output_id = 1; # ID of the current output subtitle entry
my $time_string = ""; # time string of the current input subtitle entry

my $entry_active = 0; # flag, whether a subtitle entry is being processed

# Read through all lines of the input file
while( my $line = <INPUT_FILE_HANDLE> )
{
# replace CRLF and LF with a logical newline character
$line =~ s/$CR?$LF/\n/;

# Trim leading and trailing spaces and the new-line character from the line
$line =~ s/^ *(.*?) *\n$/$1/;

# Check if the current line is a subtitle ID line
if ( not $entry_active and $line =~ /^\d+$/ )
{
$entry_active = 1; # set flag that an entry is being processed
}
# Check if the current line is empty
elsif ( $entry_active and not $line )
{
# Print a new-line as separator when the entry was NOT empty.
if ( $entry_active > 1 ) { print OUTPUT_FILE_HANDLE "\n"; }
$entry_active = 0;
$time_string = "";
}
# Check if the line is supposed to be the time string
elsif ( $entry_active and not $time_string )
{
$time_string = $line; # store the current time string
}
# Check if the line is the first NOT empty line of the subtitle entry
elsif ( $entry_active == 1 )
{
print OUTPUT_FILE_HANDLE $output_id++ , "\n"; # write subtitle ID
print OUTPUT_FILE_HANDLE $time_string, "\n"; # and time string
print OUTPUT_FILE_HANDLE $line, "\n"; # and first line of text
$entry_active = 2;
}
# Check if the line still belons to the current subtitle entry
elsif ( $entry_active > 1 )
{
print OUTPUT_FILE_HANDLE $line, "\n"; # write line of text
}
}# while

How to use it
To run this on windows:
  • Download and install ActivePerl: http://downloads.activestate.com/ActivePerl/Windows/
  • Copy the posted script into a file and save it
  • Replace the first line as described in the script!
  • Call the ActivePerl interpreter from the command line with the script file as parameter

To run this on linux:
  • Make sure that the Perl interpreter is installed
  • Copy the posted script into a file and save it
  • Run perl with the file as parameter.

Example:
perl srtchk.pl Castle.of.the.Walking.Dead.1967.English.Subs.srt Castle.of.the.Walking.Dead.1967.English.Result.Subs.srt


I hope it comes handy. 8)

_________________
Keep downloaded files shared as long as possible! You wouldn't be able to download without people sharing the stuff.

Image Image Image Image Image Image
Image

Next release will be: La Cité Des Entfants Perdu (1995) + Making Of (Status: working on the subtitles)
Last release was: Svengali (1931)


Last edited by RedVeil on Sun Dec 27, 2009 2:25 pm, edited 3 times in total.

Top
 Profile  
PostPosted: Sat Jan 10, 2009 2:27 pm  Post subject: Re: Perl script to fix SRT subtitles
Reply with quote
User avatar
Offline

Site Admin
Joined: Sat Nov 02, 2002 1:35 am
Posts: 19753
Location: En España
Image

:lol:

_________________
Mouse nipple for the win! Trackpoint or death!


Top
 Profile  
PostPosted: Sat Jan 10, 2009 6:28 pm  Post subject: Re: Perl script to fix SRT subtitles
Reply with quote
User avatar
Offline

The Devil, Probably
Joined: Fri Oct 26, 2007 3:40 pm
Posts: 2250
Location: Inside my body.
This is not a new skill of mine. I actually ordered the O'Reilly book some years ago and decided to learn Perl.
Yesterday the VobSub to SubRip OCR generated an SRT file with some missing entries at the beginning, so Perl seemed a fast way to renumber the entries instead of doing it by hand - after all I am a software developer. :wink:

So if anyone needs help with SRT files, feel free to use the script above.

_________________
Keep downloaded files shared as long as possible! You wouldn't be able to download without people sharing the stuff.

Image Image Image Image Image Image
Image

Next release will be: La Cité Des Entfants Perdu (1995) + Making Of (Status: working on the subtitles)
Last release was: Svengali (1931)


Top
 Profile  
PostPosted: Sat Jan 10, 2009 6:35 pm  Post subject: Re: Perl script to fix SRT subtitles
Reply with quote
User avatar
Offline

Site Admin
Joined: Sat Nov 02, 2002 1:35 am
Posts: 19753
Location: En España
Not making comment, just when ever I or anyone at work mentions perl that comic just springs to mind and I burst out laughing.

I have a t-shirt with it on the back :D

_________________
Mouse nipple for the win! Trackpoint or death!


Top
 Profile  
PostPosted: Sat Jan 10, 2009 7:01 pm  Post subject: Re: Perl script to fix SRT subtitles
Reply with quote
User avatar
Offline

The Devil, Probably
Joined: Fri Oct 26, 2007 3:40 pm
Posts: 2250
Location: Inside my body.
OK that explains something to me. :wink:

_________________
Keep downloaded files shared as long as possible! You wouldn't be able to download without people sharing the stuff.

Image Image Image Image Image Image
Image

Next release will be: La Cité Des Entfants Perdu (1995) + Making Of (Status: working on the subtitles)
Last release was: Svengali (1931)


Top
 Profile  
PostPosted: Sat Jan 10, 2009 9:57 pm  Post subject: Re: Perl script to fix SRT subtitles
Reply with quote
User avatar
Offline

Demon Of The Abyss
Joined: Wed Mar 03, 2004 2:58 am
Posts: 1103
spudthedestroyer wrote:
Not making comment, just when ever I or anyone at work mentions perl that comic just springs to mind and I burst out laughing.

I have a t-shirt with it on the back :D


you are such a nerd

_________________
Image


Top
 Profile  
PostPosted: Sun Dec 27, 2009 2:25 pm  Post subject: Re: Perl script to fix SRT subtitles
Reply with quote
User avatar
Offline

The Devil, Probably
Joined: Fri Oct 26, 2007 3:40 pm
Posts: 2250
Location: Inside my body.
Applied a fix to make the script work with CRLF and LF newline characters.

_________________
Keep downloaded files shared as long as possible! You wouldn't be able to download without people sharing the stuff.

Image Image Image Image Image Image
Image

Next release will be: La Cité Des Entfants Perdu (1995) + Making Of (Status: working on the subtitles)
Last release was: Svengali (1931)


Top
 Profile  
Display posts from previous:  Sort by  

All times are UTC [ DST ]

Post new topic Reply to topic  [ 7 posts ] 


Who is online

Users browsing this forum: No registered users and 6 guests


Moderator: Help Mods

You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
cron
Frontpage / Forums / Scifi


What's blood for, if not for shedding?