.....Voice Programming.....

Recording with handset

The modem begins recording, using the locally attached handset as the audio input device. The sampled audio is compressed with the GSM 6.10 codec, and the resulting data stream is presented across the serial port as 8-bit binary data. The data is formatted in 38-byte frames as follows:

or...

Frames beginning with 0xB6B6 carry voice data that falls below an arbitrary silence threshold set by the DSP. The 0xB6B6 header is only a marker for what the modem considers silence. Frames that begin with 0xB6B6 still contain valid GSM-encoded voice data and may be played back using either the 0xB6B6 or the 0xFEFE frame header.

Keep these frame headers and footers if the modem alone is used for both recording and playback. Strip them if GSM decoding is to be performed by a process within the DTE.

The 34th byte of the GSM frame data is always 0 and is the LSB. Discarding it leaves the standard 33-byte GSM frame format.
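The frame handling described above can be sketched as follows for a DTE-side decoder. This is a minimal illustration, not the modem vendor's code. It assumes the stream has already had its <DLE> sequences processed, that the header is 2 bytes, and that the bytes left over after the 34 data bytes (38 - 2 - 34 = 2) form the footer. The footer's value is not stated in this section, so the sketch skips it without checking. The name split_frames is hypothetical.

```python
FRAME_LEN = 38           # total frame size given in the text
HEADER_LEN = 2           # 0xFEFE (voice) or 0xB6B6 (silence marker)
GSM_LEN = 34             # 33 GSM bytes plus the trailing zero byte
VOICE = b"\xfe\xfe"
SILENCE = b"\xb6\xb6"


def split_frames(stream: bytes):
    """Yield (is_silence, gsm33) for every complete 38-byte frame.

    Assumption: the footer is whatever remains after header and data
    (2 bytes); its content is ignored because the text does not give it.
    """
    usable = len(stream) - len(stream) % FRAME_LEN  # drop a partial tail
    for off in range(0, usable, FRAME_LEN):
        frame = stream[off:off + FRAME_LEN]
        header = frame[:HEADER_LEN]
        if header not in (VOICE, SILENCE):
            raise ValueError(f"unexpected header {header.hex()} at offset {off}")
        payload = frame[HEADER_LEN:HEADER_LEN + GSM_LEN]
        if payload[-1] != 0:
            raise ValueError(f"34th data byte is not 0 at offset {off}")
        # Dropping the zero byte leaves the standard 33-byte GSM frame.
        yield header == SILENCE, payload[:33]
```

The silence flag is returned rather than used to drop frames, because frames marked 0xB6B6 still hold valid voice data.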

To stop the data stream, the modem requires a "keypress abort". This can be any single character sent by the DTE. For consistency, use 0x10. Any frames sent by the modem after the keypress abort contain encoded voice data that was still in the modem's buffers at the time of the abort. Keep them to obtain a complete voice sample. The data stream ends with a <DLE><ETX> character pair.
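A DTE might stop a recording along the lines below. This is only a sketch: stop_recording is a hypothetical helper, and port stands for any object with read(n) and write(data) methods (a pyserial Serial object would fit that shape, but nothing here depends on it). It assumes the caller hands over at a byte-pair boundary, that is, not immediately after reading an unpaired <DLE>.

```python
DLE = 0x10
ETX = 0x03


def stop_recording(port, abort_byte: bytes = b"\x10") -> bytes:
    """Send the keypress abort, then return every byte the modem still
    sends, up to and including the terminating <DLE><ETX> pair."""
    port.write(abort_byte)
    tail = bytearray()
    pending_dle = False          # True when the previous byte opened a <DLE> pair
    while True:
        chunk = port.read(1)
        if not chunk:            # timeout or closed port before the terminator
            raise EOFError("stream ended before <DLE><ETX>")
        byte = chunk[0]
        tail.append(byte)
        if pending_dle:
            if byte == ETX:
                return bytes(tail)
            pending_dle = False  # <DLE><DLE> or <DLE><event>: pair consumed
        elif byte == DLE:
            pending_dle = True
```

The bytes returned are the buffered frames that belong to the end of the sample, so they should be appended to what was captured before the abort.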

If the GSM-compressed voice data happens to contain the <DLE> character, the modem inserts an extra <DLE> immediately before it in the data stream. Strip the first <DLE> of each such pair from the data stream for proper playback.

While recording, the modem detects DTMF events, dial tone, silence, and fax calling tones as defined in the #V command reference. Each event is reported as a <DLE><[character]> pair. Strip both the <DLE> and the event character that follows it from the data stream for proper playback.
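The <DLE> rules above (doubled <DLE> for data, <DLE><[character]> for events, <DLE><ETX> for end of stream) can be handled in a single pass. The following is a minimal sketch under those rules only. The function name unshield and its return shape are illustrative choices, and the meaning of each event character is left to the #V command reference.

```python
DLE = 0x10
ETX = 0x03


def unshield(raw: bytes):
    """Return (voice_bytes, events, ended) for a recorded byte stream.

    voice_bytes: data with <DLE> pairs resolved and event pairs removed
    events:      the characters that followed <DLE> in event pairs
    ended:       True if the <DLE><ETX> terminator was seen
    """
    voice = bytearray()
    events = []
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte != DLE:
            voice.append(byte)
            i += 1
            continue
        if i + 1 >= len(raw):
            break                    # lone trailing <DLE>: pair is incomplete
        code = raw[i + 1]
        i += 2
        if code == DLE:
            voice.append(DLE)        # doubled <DLE> stands for one data byte
        elif code == ETX:
            return bytes(voice), events, True
        else:
            events.append(chr(code))  # event notification, not voice data
    return bytes(voice), events, False
```

For example, unshield(b"\xfe\x10\x10\x41\x10\x35\x10\x03") returns the voice bytes FE 10 41, the event list ['5'], and True.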

Playback with handset

The transmission of properly encapsulated GSM voice data should begin immediately after the modem issues the CONNECT response.

The voice data to be played must be in the format sent by the modem at the time of recording. If the voice data comes from a third-party recording device (such as a sound card), it must first be encoded using GSM compression and properly framed as described above. Use 0xFEFE as the frame header for all frames.
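Framing third-party GSM data might look like the sketch below. The helper build_frame is hypothetical. The footer bytes are not given in this section, so the sketch takes them as a required argument instead of guessing a value; the 2-byte footer length is inferred from the 38-byte frame size (38 - 2 header - 34 data).

```python
VOICE_HEADER = b"\xfe\xfe"


def build_frame(gsm33: bytes, footer: bytes) -> bytes:
    """Wrap one standard 33-byte GSM frame for playback.

    footer must be supplied by the caller (for instance, copied from a
    frame the modem itself recorded); its value is not defined here.
    """
    if len(gsm33) != 33:
        raise ValueError("expected a 33-byte GSM frame")
    if len(footer) != 2:
        raise ValueError("expected a 2-byte footer (inferred length)")
    # 33 GSM bytes are padded with the zero 34th byte noted in the text.
    return VOICE_HEADER + gsm33 + b"\x00" + footer
```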

To signify the end of the voice data stream, the DTE must send the character pair <DLE><ETX> to the modem. When playback of the voice data stream is complete, the modem returns to voice command state and signals this by issuing the VCON response to the DTE.
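Preparing the byte stream for playback can be sketched as follows. It assumes that any <DLE> byte inside the framed data must be doubled on the way to the modem, mirroring what the modem does when recording; the text implies this by requiring the data to be in the format sent at recording time, but does not state it outright. The name playback_payload is illustrative.

```python
DLE = b"\x10"
ETX = b"\x03"


def playback_payload(frames: bytes) -> bytes:
    """Return the bytes to send after CONNECT: the framed voice data with
    each <DLE> doubled (assumption), followed by the <DLE><ETX> terminator."""
    shielded = frames.replace(DLE, DLE + DLE)
    return shielded + DLE + ETX
```

After writing this payload, the DTE would wait for the VCON response before issuing further voice commands.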