Audio Processing

Octave provides a few functions for dealing with audio data. An audio `sample' is a single output value from an A/D converter, i.e., a small integer number (usually 8 or 16 bits), and audio data is just a series of such samples. It can be characterized by three parameters: the sampling rate (measured in samples per second or Hz, e.g. 8000 or 44100), the number of bits per sample (e.g. 8 or 16), and the number of channels (1 for mono, 2 for stereo, etc.).
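As a rough illustration of how these three parameters combine, the following Python sketch computes the storage needed for uncompressed audio. The helper name `audio_bytes` is made up for this example and is not an Octave function.

```python
def audio_bytes(seconds, sampling_rate, bits_per_sample, channels=1):
    """Bytes needed to store uncompressed audio (hypothetical helper)."""
    # Total number of samples across all channels
    samples = seconds * sampling_rate * channels
    # Each sample occupies bits_per_sample / 8 bytes
    return samples * bits_per_sample // 8

# One second of 8-bit mono audio at 8000 Hz
print(audio_bytes(1, 8000, 8))        # 8000
# One second of 16-bit stereo audio at 44100 Hz
print(audio_bytes(1, 44100, 16, 2))   # 176400
```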

There are many different formats for representing such data. Currently, only the two most popular, linear encoding and mu-law encoding, are supported by Octave. There is an excellent FAQ on audio formats by Guido van Rossum <guido@cwi.nl> which can be found at any FAQ ftp site, in particular in the directory `/pub/usenet/news.answers/audio-fmts' of the archive site rtfm.mit.edu.

Octave simply treats audio data as vectors of samples (non-mono data are not supported yet). It is assumed that audio files using linear encoding have one of the extensions `lin' or `raw', and that files holding data in mu-law encoding end in `au', `mu', or `snd'.

Function File: lin2mu (x): If the vector x represents mono audio data in 8- or 16-bit linear encoding, lin2mu (x) is the corresponding mu-law encoding.

Function File: mu2lin (x, bps): If the vector x represents mono audio data in mu-law encoding, mu2lin converts it to linear encoding. The optional argument bps specifies whether the input data uses 8 bits per sample (default) or 16 bits.
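To give a feel for what mu-law encoding does, here is a minimal Python sketch of the textbook continuous companding curve with mu = 255. This is an assumption made for illustration: it works on samples scaled to the range [-1, 1] and does not reproduce the integer codes that lin2mu and mu2lin actually produce. The names `mu_compress` and `mu_expand` are hypothetical.

```python
import math

MU = 255.0  # assumption: the value commonly used for 8-bit mu-law

def mu_compress(x):
    """Map a linear sample x in [-1, 1] onto the companded range [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Inverse of mu_compress: recover the linear sample from y."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# Quiet samples are spread over a larger share of the output range,
# and expanding a compressed sample returns (nearly) the original value.
quiet = 0.01
print(mu_compress(quiet) > quiet)
print(abs(mu_expand(mu_compress(0.3)) - 0.3) < 1e-12)
```

The logarithmic curve is the reason mu-law data can use few bits per sample while still resolving quiet passages reasonably well.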

Function File: loadaudio (name, ext, bps)

Loads audio data from the file `name.ext' into the vector x.

The extension ext determines how the data in the audio file is interpreted: the extensions `lin' (default) and `raw' correspond to linear encoding, and the extensions `au', `mu', and `snd' to mu-law encoding.

The argument bps can be either 8 (default) or 16, and specifies the number of bits per sample used in the audio file.
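The extension and bps rules described above can be sketched in Python as follows. This only mirrors the documented conventions; the function name `audio_file_format` is hypothetical, and the error handling is an assumption, since the text does not say how loadaudio reacts to other values.

```python
LINEAR_EXTS = {"lin", "raw"}
MULAW_EXTS = {"au", "mu", "snd"}

def audio_file_format(ext="lin", bps=8):
    """Return (encoding, bits per sample) for an extension (hypothetical helper)."""
    if ext in LINEAR_EXTS:
        encoding = "linear"
    elif ext in MULAW_EXTS:
        encoding = "mu-law"
    else:
        # Assumption: reject extensions the documentation does not list
        raise ValueError("unknown audio extension: " + ext)
    if bps not in (8, 16):
        # Assumption: reject sample sizes other than the documented 8 and 16
        raise ValueError("bps must be 8 or 16")
    return encoding, bps

print(audio_file_format())          # ('linear', 8)
print(audio_file_format("au", 8))   # ('mu-law', 8)
```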

Function File: saveaudio (name, x, ext, bps): Saves a vector x of audio data to the file `name.ext'. The optional parameters ext and bps determine the encoding and the number of bits per sample used in the audio file (see loadaudio); defaults are `lin' and 8, respectively.

The following functions for audio I/O require special A/D hardware and operating system support. It is assumed that audio data in linear encoding can be played and recorded by reading from and writing to `/dev/dsp', and that similarly `/dev/audio' is used for mu-law encoding. These file names are system-dependent. Improvements so that these functions will work without modification on a wide variety of hardware are welcome.

Function File: playaudio (name, ext)
Function File: playaudio (x): Plays the audio file `name.ext' or the audio data stored in the vector x.

Function File: record (sec, sampling_rate): Records sec seconds of audio input into the vector x. The default value for sampling_rate is 8000 samples per second (8 kHz). The program waits until the user types RET and then immediately starts to record.

Function File: setaudio (type)
Function File: setaudio (type, value)

Set or display various properties of your mixer hardware.

For example, if vol corresponds to the volume property, you can set it to 50 (percent) by setaudio ("vol", 50).

This is a simple experimental program to control the audio hardware settings. It assumes that there is a program named mixer that can be invoked as `mixer type value', and simply executes system ("mixer type value"). Future releases might remove this assumption by using the fcntl interface.
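The command string that setaudio passes to system can be pictured with the short Python sketch below. It only builds the string and does not run it. The helper name `mixer_command` is hypothetical, and the one-argument form (`mixer type' with no value) is an assumption about how the display case is handled.

```python
def mixer_command(prop, value=None):
    """Build the shell command for a mixer property (hypothetical helper)."""
    parts = ["mixer", prop]
    if value is not None:
        # Two-argument form: set the property to the given value
        parts.append(str(value))
    return " ".join(parts)

print(mixer_command("vol", 50))  # mixer vol 50
print(mixer_command("vol"))      # mixer vol
```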
