Automatic detection of signal within an audio file.

Christophe Pallier

19 May 1998

During the preparation of an audio experiment, it is often necessary to record a series of stimuli, separated by pauses, in a single audio file. Afterwards, one must extract each stimulus into an individual audio file. The programs available on this page help you automate this somewhat tedious task.

Note: In the following, a "sig file" refers to a headerless 16-bit signed linear PCM audio file with Intel (little-endian) byte order.
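For readers who want to inspect a sig file outside of these programs, here is a minimal Python sketch of reading and writing that format. The function names read_sig and write_sig are hypothetical and are not part of the wspot distribution; the only thing taken from this page is the format itself (headerless, 16-bit signed samples, little-endian).

```python
import struct


def read_sig(path):
    """Read a headerless 16-bit signed little-endian PCM file as a list of ints."""
    with open(path, "rb") as f:
        data = f.read()
    n = len(data) // 2  # two bytes per sample; a trailing odd byte is ignored
    return list(struct.unpack("<%dh" % n, data[: 2 * n]))


def write_sig(path, samples):
    """Write ints in the range -32768..32767 as a headerless sig file."""
    with open(path, "wb") as f:
        f.write(struct.pack("<%dh" % len(samples), *samples))
```

Because the file has no header, the sampling rate and the number of channels are not stored anywhere and must be known from the recording setup.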

Wspot reads a sig file and tries to find the portions containing signal. It produces a text output consisting of a series of lines, one for each signal portion detected. Each line contains three columns: the first is the offset of the starting sample, the second is the offset of the ending sample, and the third is a string of the form "sxxx.sig", where xxx is a number that increases on each line. Wspot's command-line options let you set several parameters: the signal-to-noise (S/N) ratio detection threshold, the minimal duration of a signal, and the maximum blank allowed inside a signal. Depending on the type of stimuli you are trying to segment (e.g. isolated syllables or whole sentences), you may need to modify these parameters. Also, if the master recording is noisy, you should consider applying a noise filter to increase the S/N ratio: this can greatly improve wspot's performance (wspot is based on a simple threshold mechanism, but it is efficient enough to segment high-quality recordings).
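To make the role of the three parameters concrete, here is a minimal Python sketch of a threshold-based detector. It is not wspot's actual algorithm, which is only described above as "a simple threshold mechanism". The frame size, the noise-floor estimate, the parameter names, the whitespace column separator and the zero-padded numbering starting at 1 are all assumptions made for illustration.

```python
def detect_segments(samples, frame=100, snr=4.0, min_len=200, max_gap=300):
    """Return (start, end) sample offsets, end inclusive, of detected signal.

    frame, min_len and max_gap are counted in samples; snr is a plain
    amplitude ratio relative to the estimated noise floor (assumption).
    """
    # Mean absolute amplitude of each complete frame.
    levels = [
        sum(abs(s) for s in samples[i:i + frame]) / frame
        for i in range(0, len(samples) - frame + 1, frame)
    ]
    if not levels:
        return []
    noise = max(min(levels), 1.0)  # crude noise floor: the quietest frame
    threshold = noise * snr

    segments = []
    for k, level in enumerate(levels):
        if level <= threshold:
            continue
        start, end = k * frame, (k + 1) * frame - 1
        if segments and start - segments[-1][1] - 1 <= max_gap:
            segments[-1][1] = end  # bridge a short blank inside a signal
        else:
            segments.append([start, end])
    # Discard portions that are too short to be a stimulus.
    return [(a, b) for a, b in segments if b - a + 1 >= min_len]


def format_labels(segments):
    """Format segments as three-column lines: start, end, output file name."""
    return ["%d %d s%03d.sig" % (a, b, n) for n, (a, b) in enumerate(segments, 1)]
```

In this sketch, raising max_gap merges neighbouring bursts into one stimulus (useful for whole sentences), while lowering it splits them (useful for isolated syllables).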

Wspot only outputs numbers: to actually extract the signal portions, you need the program "splice". Splice takes a sig file and a three-column text file such as the one produced by wspot. For each line, it creates a file with the name given in the third column and writes to it all the samples between the offsets given in the first and second columns. No more, no less. It does not modify the original sig file.
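The extraction step can be sketched in a few lines of Python. This is an illustration, not the source of splice: the function name splice_sig and the out_dir parameter are hypothetical, and the sketch assumes that columns are separated by whitespace and that the ending offset is inclusive, which this page does not state explicitly.

```python
import os


def splice_sig(sig_path, label_path, out_dir="."):
    """Cut a sig file into pieces described by a three-column label file.

    Returns the list of file names written. The input file is only read.
    """
    with open(sig_path, "rb") as f:
        data = f.read()
    written = []
    with open(label_path) as labels:
        for line in labels:
            fields = line.split()
            if len(fields) != 3:
                continue  # skip blank or malformed lines
            start, end, name = int(fields[0]), int(fields[1]), fields[2]
            # Offsets are in samples and each sample is 2 bytes;
            # the ending sample is assumed to be included.
            with open(os.path.join(out_dir, name), "wb") as out:
                out.write(data[2 * start: 2 * (end + 1)])
            written.append(name)
    return written
```

Under these assumptions, a line "2 4 s001.sig" yields a file s001.sig containing exactly three samples.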

Wspot and splice are distributed under the GNU license. The source code of wspot and splice is available as wspotsrc.tar.gz. It compiles cleanly with gcc on GNU/Linux systems. DOS binaries are also available: wspot.exe and splice.exe.

If you use these programs extensively, it would be nice of you to cite the author and refer to this page in your written reports.

We separated the signal detection (wspot) and the extraction process (splice) in order to allow the user to modify the decisions of wspot by simply editing the intermediate text file.
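The intermediate file can of course be edited by hand in any text editor. When the same correction has to be applied to many lines, a small script can help. The Python sketch below is one hypothetical way to do it: the function edit_labels and its parameters drop and pad are not part of the distribution, and it assumes whitespace-separated columns.

```python
def edit_labels(lines, drop=(), pad=0):
    """Return label lines with some entries removed and boundaries widened.

    drop: output file names (third column) to discard, e.g. false detections.
    pad:  number of samples to add before the start and after the end.
    """
    edited = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3 or fields[2] in drop:
            continue  # skip malformed lines and unwanted detections
        start = max(0, int(fields[0]) - pad)  # never go before the first sample
        end = int(fields[1]) + pad  # caller must keep this within the recording
        edited.append("%d %d %s" % (start, end, fields[2]))
    return edited
```

The edited lines can then be written back to a text file and passed to splice in place of wspot's original output.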

One may want to check the job done by wspot, that is, take a look at the boundaries it detected, and move or delete them if necessary. I am working on a simple sound editor in Java that would do this job. Meanwhile, you may use the shareware CoolEdit and the programs sig2wav and lab2wav written by Michel Dutat: they let you import or export wav cues (labels in a wav audio file) to and from text files in wspot's format. The complete cycle is:

  1. record a signal and save it as a headerless, 16-bit signed PCM file (let's call it "aga.sig")
  2. run wspot to produce the labels:
    wspot3 aga.sig >aga.lbn
  3. run sig2wav to create a wav file from the sig file and the text file containing the labels:
    sig2wav -l aga.lbn aga.sig
  4. use CoolEdit's cue list feature to modify the label positions and/or their names if necessary
  5. run wav2labs to export the labels to a text file:
    wav2labs aga.wav >aga2.lbn
  6. run splice:
    splice aga.sig aga2.lbn

(This procedure is useful only if you want to check precisely what wspot did. In some instances, one can simply call wspot and splice successively and listen to the resulting files.)


Last update: 19 May 1998. Christophe Pallier