SMPTE and Video In the Electronic Music Studio


The development of inexpensive consumer video equipment and MIDI Time Code is making it possible for composers to easily link sound and images. The technical limitations of the cheap video gear make it impossible to create finished work with the picture quality of commercial broadcast, but the sound can usually be a good deal better. In any case, practice in composing to fit images will prepare us all for the day when we do get a chance to play with the high quality toys.


The basic technique needed for composing music for video is synchronization. When a system is synchronized, all of the video and audio decks, computer sequencers, hard disks and so forth will play back exactly together. This allows you to compose layers of sound, secure in the knowledge that once you get something in place, it will stay there.

In any synchronized system, there is one MASTER machine, and one or more SLAVES. When you play the master, the others will follow (chase). If you want to listen to a slave by itself, you usually have to take it "off-line", or it will keep jumping back to where the master is parked.

When audio and video machines are slaved, the master must be a video deck. (In a system with several video decks, there is a master clock, and the video decks are genlocked to that. Genlocking is a super precise synchronization, not just frame by frame but line by line.)

The procedures used to compose video scores in the studio will depend on the format and content of the video we are working with. It will probably be a VHS cassette. It may have:

Presumably, the video is a dub of a final edit made elsewhere. If it is an original, you should make a working dub, because repeated passes through the VCR will degrade the image rather quickly. If there is any audio on the cassette that you need to preserve (dialogue etc.), you must have an audio deck that can be kept in sync with video. Use this patch with a Tascam TSR-8:

You will wind up with a second generation video cassette with time code on the audio track, and an eight track tape with audio on track 1 and time code on track 8. When you sync the two with the Midizer (the TSR-8 synchronizer; other machines need a similar device) the dialogue will match the picture and you can add music to the other tracks. Later, you will put time code on the original video cassette and use the offset feature of the Midizer to resolve the image with the sound.

In the simpler situation, where you have a picture-only video to work with, you can synchronize MIDI generated music. Use this patch to add time code to the video.

Then, when you play the video, Vision or a similar program on the Mac will sync up. Later, you will dub video to a new cassette as you mix the audio tracks. (TC is a Cooper PPS-2 or similar device that converts MIDI Time Code to SMPTE and vice versa.)


SMPTE time code is a method of marking tape with a continuous readout of time. (Imagine taking a grease pencil and writing the time every fifteen inches: you would always know where you are.) This is much more accurate than any mechanical tape counter because there is no slippage and you don't have to remember to reset it. Once a tape is striped (that means the code is recorded) the exact time is always available.

SMPTE code is recorded on one of the audio tracks of a tape recorder. This has two disadvantages: you lose a track for recording (except with some new digital recorders), and the code can only be read in play mode. SMPTE code is really a modulated tone. A decoder circuit can convert this tone into data, and there is an 80 bit word for each frame. This is enough to tell the hour, minute, second, and frame number, with some bits left over for special uses (these are called user bits). When SMPTE is on the tape, a variety of gadgets can perform several useful functions:

A SMPTE GENERATOR creates the time code in the first place. You can usually start with any arbitrary time. There are several formats of time code, explained a bit later in this article.

A SMPTE READER simply displays the time. (Some readers will also add this display to a video signal. That is called a "window", and a tape made with this machine is called a window dub.)

A REGENERATOR reads the code and recreates it for recording on another deck. Directly dubbing the signal from one machine to another will not work reliably.

A MIDI TIME CODE GENERATOR converts the SMPTE code into the MTC format, which may be transmitted over MIDI cables.

An SPS CONVERTER converts SMPTE time code into song pointer and MIDI clock messages that can control sequencers. In order to do this, the device must be programmed with a TEMPO MAP indicating the time signature and tempo of the music.
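The conversion a tempo map makes possible can be sketched in a few lines of Python. This is only an illustration, not the method of any particular converter: the tempo map values and the function names are invented, and the input is assumed to be the elapsed time in seconds from the start of the music. The sketch relies on two facts from the MIDI specification: MIDI clock runs at 24 pulses per quarter note, and the song position pointer counts sixteenth notes (6 clocks each). Time signature is ignored here because it changes how bars are numbered, not how many clocks have gone by.

```python
# Hypothetical tempo map: (start time in seconds, tempo in quarter notes per minute).
TEMPO_MAP = [(0.0, 120.0), (10.0, 90.0)]

CLOCKS_PER_QUARTER = 24   # MIDI clock pulses per quarter note
CLOCKS_PER_SPP_UNIT = 6   # one song position unit is a sixteenth note


def midi_clocks_at(seconds, tempo_map=TEMPO_MAP):
    """Return how many MIDI clocks have elapsed at `seconds` from the start."""
    clocks = 0.0
    for i, (start, bpm) in enumerate(tempo_map):
        if seconds <= start:
            break
        # A tempo lasts until the next entry begins, or until the requested time.
        end = tempo_map[i + 1][0] if i + 1 < len(tempo_map) else seconds
        span = min(seconds, end) - start
        clocks += span * bpm / 60.0 * CLOCKS_PER_QUARTER
    return clocks


def song_position(seconds, tempo_map=TEMPO_MAP):
    """Return the song position pointer value (in sixteenth notes)."""
    return int(midi_clocks_at(seconds, tempo_map) // CLOCKS_PER_SPP_UNIT)


print(song_position(10.0))  # 80: ten seconds at 120 is 20 quarter notes
print(song_position(12.0))  # 92: plus two seconds at the slower tempo
```

A real converter does this continuously, sending a song pointer when the tape starts and clocks from then on.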

FSK is a system for recording MIDI song pointer and clocks directly on tape. This is simpler than SMPTE, but lacks the flexibility. This is an option on most converters. The different brands of FSK converters won't read each other's codes.

A CONTROLLER is an intelligent remote control for a tape deck. It uses SMPTE code to keep track of where the tape is, and can cue the tape up to any desired location. Since the code is not available in fast wind, the controller also gets information from the mechanical tape locator on the deck (these are called tach and direction signals) to roughly spot the tape; it will go into play briefly to confirm the location. A controller has to be specifically designed to work with a particular tape deck. A device that adapts a controller to work with a different tape deck is called an EMULATOR.

A SYNCHRONIZER controls two or more tape decks. One of the decks is designated the MASTER, and the others are SLAVES. (All decks must have SMPTE code on their tape.) If the synchronizer is placed in CHASE mode, it will keep the slave tapes lined up with the master. To do this the synchronizer must control the speed of the slave decks as well as transport operation; a group of decks playing this way are said to be locked. There are some variations on locking:

A deck may have an OFFSET-- instead of matching times, the synchronizer keeps the slave a specified distance ahead of or behind the master. This is very handy for copying licks into more than one place on a master. (The related process of copying a section of a tape onto another tape and dubbing it back on at another point is called "flying in" tracks.)

The decks may be in PHASE LOCK, meaning the speeds are kept the same, but the absolute times are ignored. This is useful for matching tapes that have been edited without being restriped with time code.

If the master or slave code disappears briefly, the synchronizer will use the tach pulses to attempt synchronization; this is called "flywheeling". When the code comes back it may have an unexpected value. If the synchronizer can do SLOW LOCK, it will gradually bring the decks together.
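The arithmetic behind an OFFSET is simple once time code is turned into a frame count. The following Python sketch assumes non-drop code counted at 30 frames per second; the function names are made up for illustration and do not correspond to any real synchronizer.

```python
FPS = 30  # assumption: non-drop code counted at 30 frames per second


def to_frames(tc, fps=FPS):
    """Convert 'HH:MM:SS:FF' time code to a frame count."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff


def to_timecode(frames, fps=FPS):
    """Convert a frame count back to 'HH:MM:SS:FF', wrapping at 24 hours."""
    frames %= 24 * 3600 * fps
    ss, ff = divmod(frames, fps)
    mm, ss = divmod(ss, 60)
    hh, mm = divmod(mm, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def slave_target(master_tc, offset_tc, slave_ahead=True):
    """Where the slave should be parked for a given master position and offset."""
    master = to_frames(master_tc)
    offset = to_frames(offset_tc)
    return to_timecode(master + offset if slave_ahead else master - offset)


print(slave_target("01:00:10:00", "00:00:30:15"))                     # 01:00:40:15
print(slave_target("01:00:10:00", "00:00:30:15", slave_ahead=False))  # 00:59:39:15
```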

Video decks must be synchronized even more closely than SMPTE allows in order to give a stable picture when editing. This is called GENLOCK, and is a direct connection between machines.


There is a lot of confusion about SMPTE types and styles. This has developed out of the general chaos that reigns worldwide in video and film formats. Here is a brief description of what is out there:


Film and video are a series of still pictures (called frames) that flash past our eyes so fast the images seem to move. Since the hand cranked silent days (when 16 frames/sec was a sort of average) four rates have been standardized.

24 fps
Film. Some interesting shenanigans are required to get this shown on TV.
25 fps
European video (PAL, SECAM).
30 fps
American and Japanese black and white video. Video actually runs at 60 fields/sec, but each field is only half the frame (every other line). Very little black and white video is produced any more, but this frame rate lives on in the SMPTE spec. Actually, most 30 fps SMPTE code is generated for audio in large recording studios that like to sync multitrack decks together.
29.97 fps
American and Japanese color video (NTSC). The frame rate got changed slightly during the conversion from black and white to color. Almost all video derived SMPTE runs at this rate. You can lock an audio deck striped with 30 fps code to this, but the pitch will drop slightly (from A440 to about A439.6).
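The size of that pitch drop follows directly from the ratio of the two frame rates. Here is a small Python check; the function name is invented, and the color rate is taken as 30000/1001 frames per second, the exact figure that is normally rounded to 29.97.

```python
import math
from fractions import Fraction

NOMINAL_RATE = Fraction(30)          # rate the audio tape was striped at
COLOR_RATE = Fraction(30000, 1001)   # NTSC color rate, usually quoted as 29.97


def locked_pitch(original_hz, striped=NOMINAL_RATE, reference=COLOR_RATE):
    """Pitch heard when a tape striped at `striped` fps is slowed to follow `reference` fps."""
    return float(original_hz * reference / striped)


pitch = locked_pitch(440)
cents = 1200 * math.log2(pitch / 440)

print(round(pitch, 2))   # 439.56
print(round(cents, 2))   # a little under two cents flat
```

The change is far too small to hear as a wrong note, but it is enough to put the tape out of tune with instruments that were not slowed down along with it.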


Each frame message contains the hour, minute, second and frame number of the program. There are two schools of thought about how to deal with the odd division of frames into seconds created by the color rate. One set of engineers simply slowed down the code generators and paid no attention to real time clocks (this works fine as long as you remember to make each one hour program 59 minutes and 56.4 seconds long). For consistency with the real world, a new numbering scheme for frames was developed. The DROP FRAME numbering system skips two frame numbers at the start of each minute except those ending in 0. This keeps things nicely aligned. Unfortunately, drop frame has not caught on very well, so you have to be able to deal with both systems. To keep life interesting, the old system is called Non Drop Frame or NDF.
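The two numbering schemes are easier to compare side by side. The Python sketch below labels the same running frame count both ways; the function names are invented for the example. Only the numbers are skipped in drop frame, never the pictures.

```python
FRAMES_PER_DROPPING_MINUTE = 60 * 30 - 2        # 1798 numbers used in a minute that skips two
FRAMES_PER_TEN_MINUTES = 10 * 60 * 30 - 9 * 2   # 17982: nine of every ten minutes skip


def _label(count):
    """Format a 30-per-second count as HH:MM:SS:FF."""
    ff = count % 30
    ss = count // 30 % 60
    mm = count // 1800 % 60
    hh = count // 108000
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def non_drop(frame_index):
    """NDF label: every number is used, so the code runs slow against the clock."""
    return _label(frame_index)


def drop_frame(frame_index):
    """DF label: add back the numbers skipped so far, then format as usual."""
    tens, rest = divmod(frame_index, FRAMES_PER_TEN_MINUTES)
    skipped = 18 * tens
    if rest >= 1800:
        # The first minute of each ten is complete; each later minute skips two numbers.
        skipped += 2 * ((rest - 1800) // FRAMES_PER_DROPPING_MINUTE + 1)
    return _label(frame_index + skipped)


print(drop_frame(1799))      # 00:00:59:29
print(drop_frame(1800))      # 00:01:00:02  (numbers :00 and :01 were skipped)
print(drop_frame(17982))     # 00:10:00:00  (nothing skipped at minute 10)

one_clock_hour = 107892      # frames shown in an hour at the color rate, to the nearest frame
print(drop_frame(one_clock_hour))   # 01:00:00:00
print(non_drop(one_clock_hour))     # 00:59:56:12
```

The last line is the 59 minutes and 56.4 seconds mentioned above: twelve frames is four tenths of a second.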

The 29.97 frame rate seems to cause a lot of confusion, whether it's DF or NDF. Just remember color is slower, and you'll be all right.


MIDI Time Code (MTC) is a special version of SMPTE that can be transmitted over MIDI cables. Most modern sequencing and hard disk recording programs can sync to it, so it is commonly used in composing studios.
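To give an idea of what actually travels down the cable, here is a Python sketch of MTC "quarter frame" messages. It assumes the layout given in the MIDI specification: a status byte of F1 hex followed by one data byte whose upper three bits say which piece of the time is being sent and whose lower four bits carry the value. Eight of these messages spell out one complete time. The function name and the rate labels are invented for the example, and the timing of when each message is sent is left out entirely.

```python
# Rate codes carried in the last piece; the dictionary keys are labels made up for this sketch.
RATE_CODES = {"24": 0, "25": 1, "30df": 2, "30": 3}


def quarter_frames(hh, mm, ss, ff, rate="30"):
    """Return the eight two-byte quarter frame messages for one time code value."""
    rate_bits = RATE_CODES[rate] << 1
    nibbles = [
        ff & 0x0F, ff >> 4,                 # pieces 0-1: frames, low then high
        ss & 0x0F, ss >> 4,                 # pieces 2-3: seconds
        mm & 0x0F, mm >> 4,                 # pieces 4-5: minutes
        hh & 0x0F, (hh >> 4) | rate_bits,   # pieces 6-7: hours, with the rate code
    ]
    return [bytes([0xF1, (piece << 4) | nibble]) for piece, nibble in enumerate(nibbles)]


for message in quarter_frames(1, 2, 3, 4):
    print(message.hex(" "))
```

For 01:02:03:04 at 30 fps this prints f1 04, f1 10, f1 23, f1 30, f1 42, f1 50, f1 61 and f1 76, one message per line.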


MIDI Machine Control is a system of controlling tape decks and the like via MIDI cables. With such a system, the computer replaces the synchronizer and remote controls previously needed at a fraction of the cost.


There are three audio formats found on VHS cassette decks. The original format was a single separate audio track at the edge of the tape. Since the tape in a VCR moves quite slowly, this is low quality audio. It is called the longitudinal audio track.

"Hi-Fi Stereo" is a consumer product in which two channels of sound are frequency modulated and mixed in with the video tracks. This is the same way sound gets to your home television set and works very well. In fact, depending on the quality of the other audio circuits in the machine, performance can rival good analog reel to reel recorders. Tapes made this way still have the longitudinal audio track. Some decks allow you to record different material on the LAT and mix it with the stereo tracks for playback. This is useful if you want to add narration, but never combine the same signal from the stereo and LATs.

Industrial VHS decks have two audio tracks stuffed into the same space as the normal LAT. A consumer deck will play both tracks, but there will be extra noise and phase errors if there is common signal. This format usually has SMPTE on one track and audio on the other. Because SMPTE readers are pretty fussy about the quality of the time code, you will not have much luck syncing to this kind of tape on a consumer VCR.

There are two common types of problem encountered in reading time code. If the signal off the tape is too weak or lacks high frequency, most readers will not follow it. So:

It is not unusual for the time code to be read incorrectly. This happens when there are short dropouts in the tape (more likely on the video tape than the audio tape) and some bits are misinterpreted. In that case, the slave deck may be sent shooting off to some point hours away. To avoid this:

In any case, remember to leave a lot of code before and after your tracks on the audio tape so the machine will never try to lock to leader.

SMPTE in the Digital World

Now that high quality digital audio systems are available, they are becoming the format of choice for dealing with audio for video. Computer based editing is particularly useful for the arduous task of post production audio sweetening. However, most digital systems have their own timing constraints and don't work well as slaves to video decks. DAT machines that will lock to SMPTE are available, but really expensive. The same story applies to computers. If you want true synchronization with a Digidesign system you need a $3000 interface card. The brightest star in the sky is a $700 box from JL Cooper that synchronizes ADAT systems to SMPTE. Some of the new stand alone hard disk recorders offer SMPTE options for a similar price.

You can use computer based editors with video without a sync card. The trick is to organize your sound in short clips or "regions". Your editing software (such as Studio Vision or Logic Audio) will read MIDI Time Code and begin playing the clips at the precise time desired. They will drift out of sync if they are longer than a couple of minutes, but this is not a limitation on most material. (Studio Vision has a "Lock Audio to Tape" option that will keep the audio tightly synchronized at the expense of some distortion.)
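A rough feel for how long a clip can run free comes from dividing the error you can tolerate by the speed mismatch between the video deck and the computer's audio clock. The Python sketch below is only an illustration: the 0.02% mismatch is an invented figure, not a measurement of any real equipment, and the function name is made up.

```python
ONE_FRAME = 1001 / 30000       # length of one NTSC color frame, in seconds
HYPOTHETICAL_ERROR = 0.0002    # invented 0.02% speed mismatch, for illustration only


def seconds_until_drift(speed_error, tolerance_seconds):
    """How long a free-running clip plays before it is off by `tolerance_seconds`."""
    if speed_error == 0:
        return float("inf")    # perfectly matched speeds never drift
    return tolerance_seconds / abs(speed_error)


limit = seconds_until_drift(HYPOTHETICAL_ERROR, ONE_FRAME)
print(round(limit, 1), "seconds before the clip is a full frame out")
```

With these assumed numbers the clip is a frame late after a bit under three minutes, and because each new clip is started fresh from time code, the error never gets a chance to accumulate across clips.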

The technology to watch is QuickTime Video. This was a toy when introduced a couple of years ago, but if it keeps improving as fast as it has been, it'll be indistinguishable from tape based video by the turn of the century. Useful QuickTime audio editors are already available and giving hardware based systems a run for the money.