For some reason we have recently encountered a cluster of issues where DTMF tones are not successfully traversing the network. So what better time for a quick educational article explaining how DTMF is supposed to work in a VoIP network?
What is DTMF?
In case anyone is not familiar, DTMF stands for dual-tone multi-frequency signaling. DTMF tones are the sounds you hear when you press the keys on a touch-tone keypad. Each digit (plus * and #, and potentially the letters A-D) has a different combination of two tones, uniquely identifying that particular symbol.
The primary use for DTMF tones is when interacting with some kind of automated system – e.g. an auto-attendant, conference bridge or voicemail.
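To make the "combination of two tones" idea concrete, here is a minimal Python sketch of the standard DTMF keypad grid. The frequencies are the standard DTMF values; the function names, the 8000 Hz sample rate and the 100 ms default duration are just my own choices for illustration.

```python
import math

# Standard DTMF grid: every key pairs one low (row) frequency with one
# high (column) frequency. All values are in Hz.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")

def dtmf_frequencies(symbol):
    """Return the (low_hz, high_hz) pair for a single DTMF symbol."""
    if len(symbol) == 1:
        for row, keys in enumerate(KEYPAD):
            col = keys.find(symbol)
            if col != -1:
                return ROW_HZ[row], COL_HZ[col]
    raise ValueError(f"not a DTMF symbol: {symbol!r}")

def dtmf_samples(symbol, duration_s=0.1, sample_rate=8000):
    """Synthesize the tone as floats in [-1.0, 1.0] (illustrative defaults)."""
    low, high = dtmf_frequencies(symbol)
    n = round(duration_s * sample_rate)
    return [
        0.5 * math.sin(2 * math.pi * low * t / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * t / sample_rate)
        for t in range(n)
    ]

print(dtmf_frequencies("5"))  # row 770 Hz, column 1336 Hz
```

The letters A-D simply occupy the fourth column (1633 Hz), which is why most handsets never generate them.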
How does it work over TDM?
In analog and TDM networks, these sounds are generated by the endpoint (the phone) and then sent as in-band audio in the call path.
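Because the tones are just audio, the receiving system has to listen for them. A common way to do that is the Goertzel algorithm, which measures signal power at a handful of known frequencies. The sketch below is a simplified illustration under my own assumptions (8000 Hz sampling, a 50 ms window, hypothetical function names); a real detector would also apply power thresholds and timing checks, whereas this one always returns its best guess, even for silence.

```python
import math

ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")

def goertzel_power(samples, freq_hz, sample_rate):
    """Relative signal power at freq_hz, using the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def detect_dtmf(samples, sample_rate=8000):
    """Pick the strongest row and column frequency (no thresholds applied)."""
    row = max(range(4), key=lambda i: goertzel_power(samples, ROW_HZ[i], sample_rate))
    col = max(range(4), key=lambda i: goertzel_power(samples, COL_HZ[i], sample_rate))
    return KEYPAD[row][col]

def make_tone(low_hz, high_hz, n=400, sample_rate=8000):
    """Helper for the demo: n samples of two summed sine waves."""
    return [
        math.sin(2 * math.pi * low_hz * t / sample_rate)
        + math.sin(2 * math.pi * high_hz * t / sample_rate)
        for t in range(n)
    ]

# 852 Hz + 1209 Hz is the "7" key on the standard grid.
print(detect_dtmf(make_tone(852, 1209)))
```

This also hints at why in-band DTMF is fragile over VoIP: anything that distorts those two frequencies (lossy codecs, packet loss) makes the detector's job harder.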
How does it work with VoIP?
This is where things get tricky. Since the information being sent is fundamentally just numbers and symbols, various standards have been introduced over the years to “simplify” the signaling of DTMF, and to avoid any issues caused by poor audio quality (packet loss, distortion due to audio compression, etc). The main options I’m aware of are:
- In-band signaling – this is the same as for analog calls: the DTMF tones are simply part of the audio stream.
- RFC 2833 signaling – with this standard (since superseded by RFC 4733, although the older name has stuck), the DTMF digits are sent as a special type of RTP packet. So the digits are still part of the RTP stream, but each digit is encoded as data in the packet, not as audio. See the screenshot below for how this appears in Wireshark: each key press actually appears repeatedly in the RTP stream to indicate the duration of the tone.

- SIP INFO – another out-of-band option is to signal the DTMF digits in separate SIP INFO requests, using the INFO method defined in RFC 6086.
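As a rough illustration of the RFC 2833 option in the list above, the telephone-event payload is only four bytes: an event code, an end bit plus volume, and a duration in RTP timestamp units. The sketch below packs and unpacks that payload. The field layout and the event codes 0-15 follow the RFC; the function names, the volume value and the 20 ms packet interval are assumptions of mine, and the surrounding RTP header is left out entirely.

```python
import struct

# Event codes 0-15 map to these symbols: digits, then *, #, A-D.
EVENT_SYMBOLS = "0123456789*#ABCD"

def pack_event(symbol, end, volume, duration):
    """Build the 4-byte telephone-event payload (RTP header not included)."""
    if len(symbol) != 1 or symbol not in EVENT_SYMBOLS:
        raise ValueError(f"not a DTMF symbol: {symbol!r}")
    event = EVENT_SYMBOLS.index(symbol)
    # Second byte: E (end) bit, R (reserved) bit left at 0, 6-bit volume.
    flags = (0x80 if end else 0x00) | (volume & 0x3F)
    return struct.pack("!BBH", event, flags, duration)

def unpack_event(payload):
    """Decode the first 4 bytes of a telephone-event payload."""
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    return {
        "symbol": EVENT_SYMBOLS[event] if event < len(EVENT_SYMBOLS) else None,
        "end": bool(flags & 0x80),
        "volume": flags & 0x3F,
        "duration": duration,
    }

# One press of "5" held for 60 ms, sent every 20 ms at an 8000 Hz clock:
# the duration grows by 160 units per packet and the last packet sets the end bit.
for d in (160, 320, 480):
    print(unpack_event(pack_event("5", end=(d == 480), volume=10, duration=d)))
```

This is why a single key press shows up as several packets in a capture: the same event is reported with a growing duration until the end bit is set. Real senders typically also repeat the final packet a few times for robustness.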
Which is best?
RFC 2833 is widely used and works fine, so personally I wish everyone would just stick with that. Maybe there are some technical reasons (bandwidth efficiency?) why some would prefer SIP INFO messages, but mostly the proliferation of options just causes interoperability problems – so it would be better if we could all stick with one approach throughout the PSTN.
How do voice networks decide which to use?
It’s complicated – which is why we often see issues.
In theory, a lot of this can be controlled by dynamic media negotiation. For example, take a look at these lines from the Session Description Protocol (SDP) body of a SIP INVITE:
m=audio 19708 RTP/AVP 8 0 18 101
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
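SDP isn’t exactly easy to read, but the above lines are advertising that this SIP endpoint supports RFC 2833 telephone-events. That’s the 101 at the end of the first line: a dynamic RTP payload type that the rtpmap line maps to telephone-event (with an 8000 Hz clock), while the fmtp line lists the supported events, 0-15, which cover the digits 0-9, *, # and A-D.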
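If you want to check this programmatically rather than by eye, a few lines of Python are enough to pull the telephone-event payload type out of an SDP body. This is only a sketch under simplifying assumptions: the function name is hypothetical, and it treats the SDP as having a single audio media line rather than handling multiple m= sections properly.

```python
def telephone_event_payload_types(sdp):
    """Return {payload_type: fmtp_params} for offered telephone-event entries.

    Simplification: all m=audio, rtpmap and fmtp lines are pooled together,
    so this is only reliable for SDP with a single audio stream.
    """
    offered, rtpmap, fmtp = set(), {}, {}
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m=audio "):
            # m=audio <port> <proto> <payload types...>
            offered.update(line.split()[3:])
        elif line.startswith("a=rtpmap:"):
            pt, _, encoding = line[len("a=rtpmap:"):].partition(" ")
            rtpmap[pt] = encoding
        elif line.startswith("a=fmtp:"):
            pt, _, params = line[len("a=fmtp:"):].partition(" ")
            fmtp[pt] = params
    return {
        int(pt): fmtp.get(pt)
        for pt, encoding in rtpmap.items()
        if pt in offered and encoding.lower().startswith("telephone-event/")
    }

sdp = """m=audio 19708 RTP/AVP 8 0 18 101
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15"""
print(telephone_event_payload_types(sdp))  # {101: '0-15'}
```

An empty result means the endpoint did not offer RFC 2833 at all, which is a useful first thing to rule out when comparing the INVITE with the 200 OK.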
However, SIP endpoints tend to be inconsistent about which options they support, and some may not follow the requests provided by the other side in a negotiation. So many SIP devices (including Metaswitch CFS and Perimeta products) also offer configuration options to control what kind of DTMF signaling should be sent to a particular endpoint.
For example, on the CFS you’ll see the following fixbits on your remote media gateway models (among others):
- Does not support out of band DTMF (i.e. you must send the tones in the audio)
- Only supports sending out of band DTMF (i.e. the far-end can send, but not receive)
- Requires out-of-band DTMF for all codecs (including non-compressed codecs like G.711)
- Send DTMF in SIP INFO
And on the Perimeta SBC, there are a variety of DTMF options that can be configured in the interop section of an adjacency, or in an interop profile.
What should I do if I’m having problems?
This is a complex topic, but to get you started I would suggest the following steps.
- Verify what methods are supported by each endpoint in your network (e.g. ONT, CFS, Perimeta, carrier SIP trunk, SIP PBX, application servers) – and ideally see if they will all support RFC 2833.
- Configure them all (in their own settings) to use RFC 2833, if possible.
- Test calls with DTMF, in both directions, and pray that everything works.
- If it doesn’t, use Wireshark captures to get a better idea of what is actually being sent and received. (Check out this helpful article for how to decode a WAV file to confirm whether the tones themselves are accurate.)
- If different parts of your network support different options, see if one of the mid-points of your network can act as a DTMF interworking hub, converting from one method to another.
- If all else fails, set up a support engagement with your friendly neighborhood VoIP consultant and ask for help. 🙂