Closed Caption Decoding In Python

I’ve been backing up my Laserdiscs after the problems I had with my players a month or so back, and one thing I frequently noticed was the flicker of closed captions at the very top of the captured video file. This intrigued me, as it would be useful to have subtitles for these films. I began investigating software options for decoding the captions embedded in the video image area, but was unable to find anything that both worked with captions embedded in a video file and was cheap enough to justify spending the money on. Since I had been intending to do some video-processing work in Python, this seemed like an ideal warm-up exercise.

Closed Caption History

Work began during the 1970s on sending subtitles inside a broadcast NTSC signal. However, development took some time: it wasn’t until 16 March 1980 that the first broadcasts of closed captions were made, and then only 15 hours of television a week were captioned.

Enhancements were made to the format over time, and a law was passed in 1990 mandating the inclusion of closed caption decoders in larger televisions in an effort to improve the adoption of the decoders, which up to that point had only sold 400,000 units.

Despite all of this, it took further steps to increase the penetration of closed captioning. The final piece of the puzzle was legislation in 1996 that mandated captioning on television programming.

Environment Setup

Getting Python running acceptably on my home workstation proved a lot tougher than I thought it would. 64-bit Python 2.7 caused all sorts of problems, since it expects a 64-bit version of Visual Studio 2008, which isn’t included in the Visual Studio Express package. Various FFmpeg wrappers were a washout, either failing to compile or grumbling about missing .h files. In the end my recipe was:

This at least gave me Python with the ability to open image files, and FFmpeg can decode pretty much any video stream to images.

Closed Captions Technical Details

Closed captions in action.

There are two closed caption symbols in each frame of video. Together these form either a control code, a single special character, or two ‘normal’ alphanumeric characters. Each symbol is seven bits plus a single odd parity bit. A start bit also provides synchronization.
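As a small illustration of the odd parity check, here is a minimal sketch. It assumes each recovered symbol has already been packed into an 8-bit integer with the parity bit in the top position; the function names are made up for this example.

```python
def has_odd_parity(byte):
    """Return True if the 8-bit value contains an odd number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2 == 1


def strip_parity(byte):
    """Return the 7-bit payload, or None if the parity check fails.

    Assumption: the parity bit is the most significant bit of the byte.
    """
    if not has_odd_parity(byte):
        return None
    return byte & 0x7F
```

For example, the letter A (0x41) has two bits set, so it should arrive as 0xC1 with the parity bit set to make the total odd. A bare 0x41 would be rejected.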

Extracting single characters is relatively easy, as the character set is pretty close to ASCII. Understanding the control codes, however, is key to transforming the stream into a usable SRT format file.

A closed caption decoder has three decoding modes:

  • Roll-up mode, generally used for live television, where each new line of text ‘rolls up’ the last one
  • Pop-on mode, used for pre-recorded television, where subtitles are built in an off-screen memory area and then ‘popped on’ to the screen.
  • Direct paint mode, used for more complex effects

Raw decoded closed captions

I only have a small sample of closed-captioned media, and all of it seems to be in pop-on mode, so I concentrated on that. The second notable feature is multiple subtitle channels. Originally there were intended to be four, but nobody seemed to use more than two, so the upper two channels (three and four) were dropped in favor of adding some of the more exotic features in pop-on and direct paint mode. Again, all the media I have uses only channel CC1, so I haven’t distinguished between CC1 and CC2 subtitles and commands, but it would be trivial to do so.

Closed captions actually have a fairly complicated error handling scheme: a parity bit is set on each symbol, and all command codes are sent twice to cope with any errors in the first one.
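Because every command is sent twice, a decoder has to avoid acting on both copies. The sketch below shows one way to do that, not necessarily how my script does it. It assumes the input is a sequence of parity-stripped (first, second) tuples, with None standing in for a frame that failed the parity check, and that control codes are the pairs whose first value is in the range 0x10 to 0x1F.

```python
def drop_repeated_commands(pairs):
    """Yield byte pairs, skipping a control code that immediately repeats.

    pairs: iterable of (b1, b2) tuples of 7-bit values, or None for a
    frame that failed the parity check (an assumption of this sketch).
    """
    previous = None
    for pair in pairs:
        if pair is None:
            # A damaged frame breaks the "immediately repeated" chain.
            previous = None
            continue
        is_control = 0x10 <= pair[0] <= 0x1F
        if is_control and pair == previous:
            # This is the redundant second copy: swallow it.
            previous = None
            continue
        previous = pair if is_control else None
        yield pair
```

If the first copy is damaged, the second copy still gets through, which is the whole point of the repetition.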

WHY ARE CLOSED CAPTIONS SHOUTING ALL THE TIME?

Beats me. Having only seen closed captions occasionally, I assumed that the standard only supported ALL CAPS. But there is an entirely usable lowercase alphabet there.

Implementing Closed Caption Decoding In Python

SRT format subtitles

Step one was to load an image file with some closed caption bits inside it. The OpenCV library can load an image with a single call to cv.LoadImage. An RGBA pixel tuple can then be read out of the image easily, but inefficiently, using cv.Get2D(image, y, x).

Step two was establishing the position of the data bits in the display. The closed caption line includes a sine wave pre-amble to help determine the timing of the signals, but to begin with I simply measured the position of the center of each bit in Photoshop, then applied a threshold at each of those pixel locations to get a binary value for each bit.

With the locations of the bits as a baseline, I wrote a routine to ‘lock’ onto the peaks and troughs of the pre-amble sine wave. This provided an offset to be applied to the locations of the data bits, which were then determined by thresholding the average RGB value at each computed position.
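The sampling step can be sketched roughly as follows. This is not the code from my script. It assumes the caption scanline has already been reduced to a list of brightness values (for example the average of R, G and B per pixel), that the bits are evenly spaced, and that each symbol is sent least significant bit first. The parameter names and the threshold of 128 are illustrative only.

```python
def read_bits(row, first_bit_x, bit_width, offset=0.0, count=16, threshold=128):
    """Sample `count` bits from one scanline of brightness values (0-255).

    first_bit_x: measured x position of the center of the first data bit.
    bit_width:   measured spacing between bit centers, in pixels.
    offset:      correction obtained from locking onto the pre-amble.
    """
    bits = []
    for i in range(count):
        x = int(round(first_bit_x + offset + i * bit_width))
        bits.append(1 if row[x] >= threshold else 0)
    return bits


def bits_to_bytes(bits):
    """Pack 16 sampled bits into two 8-bit values.

    Assumption: bits arrive least significant bit first, parity bit last.
    """
    first = sum(bit << i for i, bit in enumerate(bits[:8]))
    second = sum(bit << i for i, bit in enumerate(bits[8:16]))
    return first, second
```

A fixed threshold is the simplest choice. A capture with unusual black or white levels might need the threshold derived from the pre-amble amplitude instead.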

The next step was to decode the two recovered seven-bit values in each frame. This required implementing a lookup table, based on the available specification documents. If you are about to implement your own closed caption decoder, be wary: I found several inaccurate sources on the web that gave me garbage output.
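To give a feel for the decoding step, here is a rough sketch that sorts a pair of seven-bit values into the three cases described earlier. The code ranges and the handful of non-ASCII substitutions are my recollection of the CEA-608 basic character table, so treat them as assumptions and check them against the specification. The names classify_pair, decode_char and NON_ASCII are invented for this example.

```python
# Codes where the caption character set differs from ASCII.
# Assumption: taken from the CEA-608 basic character table; verify first.
NON_ASCII = {
    0x2A: "á", 0x5C: "é", 0x5E: "í", 0x5F: "ó", 0x60: "ú",
    0x7B: "ç", 0x7C: "÷", 0x7D: "Ñ", 0x7E: "ñ", 0x7F: "█",
}


def decode_char(code):
    """Map one 7-bit code to a character; codes below 0x20 print nothing."""
    if code < 0x20:
        return ""
    return NON_ASCII.get(code, chr(code))


def classify_pair(b1, b2):
    """Return (kind, value) for one frame's pair of 7-bit values."""
    if b1 == 0 and b2 == 0:
        return ("null", "")  # padding, nothing transmitted
    if 0x10 <= b1 <= 0x1F:
        # Assumption: 0x11/0x19 followed by 0x30-0x3F is a special character.
        if b1 in (0x11, 0x19) and 0x30 <= b2 <= 0x3F:
            return ("special", (b1, b2))
        return ("control", (b1, b2))
    return ("text", decode_char(b1) + decode_char(b2))
```

A real lookup table also has to name every control code and special character. This sketch only separates the categories.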

With the lookup tables to map the control codes in place, and a trivial state machine with two buffers to emulate the behavior of the decoder in pop-on mode, I was ready to move forward.
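A two-buffer state machine of this kind might look something like the sketch below. It is a simplified illustration rather than the code in the package. The four channel 1 command values are assumed from the CEA-608 tables, positioning and styling codes are ignored (so line breaks are lost), and the class and attribute names are invented.

```python
# Channel 1 commands (assumed values from the CEA-608 tables).
RCL = (0x14, 0x20)  # Resume Caption Loading: selects pop-on mode
EDM = (0x14, 0x2C)  # Erase Displayed Memory: clear the screen
ENM = (0x14, 0x2E)  # Erase Non-displayed Memory: clear the off-screen buffer
EOC = (0x14, 0x2F)  # End Of Caption: swap off-screen and on-screen buffers


class PopOnDecoder:
    """Minimal pop-on emulation with one off-screen and one on-screen buffer."""

    def __init__(self):
        self.offscreen = ""
        self.onscreen = ""
        self.events = []  # (frame number, text now on screen)

    def text(self, chars):
        # Characters always accumulate in the off-screen buffer.
        self.offscreen += chars

    def command(self, frame, code):
        if code == ENM:
            self.offscreen = ""
        elif code == EDM:
            self.onscreen = ""
            self.events.append((frame, ""))
        elif code == EOC:
            self.onscreen, self.offscreen = self.offscreen, self.onscreen
            self.events.append((frame, self.onscreen))
        # RCL and all other codes are ignored in this sketch.
```

Each entry in events marks a frame where the visible caption changed. That is exactly the information needed later to work out caption start and stop times.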

Once it could decode individual images, and lists of images read one after the other, the next step was to integrate FFmpeg, which can decode practically any form of video. Initially I had FFmpeg writing full frames of video, but this was pretty slow. By using FFmpeg’s built-in cropping I was able to write just the top ten lines of video, and get performance of at least 20x realtime on my i7 workstation.
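For anyone wanting to reproduce the cropping step, one plausible way to drive it from Python is shown below. This is an assumption about how the command could look, using FFmpeg’s crop filter, and is not copied from my script. The file names and helper names are placeholders.

```python
import subprocess


def build_ffmpeg_command(video_path, output_pattern, lines=10):
    """Build an FFmpeg command that writes only the top `lines` rows per frame.

    crop=w:h:x:y keeps the full input width (iw) and `lines` rows,
    starting at the top-left corner of the picture.
    """
    crop = "crop=iw:{0}:0:0".format(lines)
    return ["ffmpeg", "-i", video_path, "-vf", crop, output_pattern]


def extract_caption_strip(video_path, output_pattern, lines=10):
    """Run FFmpeg; raises CalledProcessError if it exits with an error."""
    subprocess.check_call(build_ffmpeg_command(video_path, output_pattern, lines))
```

Calling extract_caption_strip("capture.avi", "cc_%06d.png") would write one small numbered PNG per frame, assuming ffmpeg is on the PATH.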

The SRT format was pretty straightforward to output. The two slightly tricky parts were ensuring that each caption didn’t have any blank lines inside it, and that the start and stop times of each caption were sensible.
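A minimal sketch of the output side is shown below, with assumptions flagged: the helper names are invented, and the default frame rate assumes an NTSC capture at 30000/1001 frames per second. Blank lines are stripped because a blank line terminates an SRT entry.

```python
def frame_to_seconds(frame, fps=30000.0 / 1001):
    """Convert a frame number to seconds (assumes NTSC frame rate)."""
    return frame / fps


def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3600000)
    minutes, millis = divmod(millis, 60000)
    secs, millis = divmod(millis, 1000)
    return "{:02d}:{:02d}:{:02d},{:03d}".format(hours, minutes, secs, millis)


def srt_entry(index, start, end, text):
    """Build one SRT entry, dropping blank lines inside the caption."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return "{}\n{} --> {}\n{}\n".format(
        index, srt_timestamp(start), srt_timestamp(end), "\n".join(lines)
    )
```

Entries are written one after another, separated by a single blank line, with index counting up from 1.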

Source Code Package

As with all free software, this has no warranty, and any damage you do with it is on you. You can find everything else you need above. Good luck.

ccDecode

25 May 2014 Update: An improved version is available; see here.