Award Consulting

Metaswitch consultants

Dammit! Solve problems faster with the DAMT Troubleshooting Framework

January 2, 2018 By Andrew

 

Do you ever encounter problems in your voice network? Issues with stability or functional defects? Are you responsible for tracking down these issues and resolving them? If so, you should try the DAMT (pronounced ‘dammit’) troubleshooting framework. This is a powerful framework we’ve developed at Award Consulting based on our decades of experience troubleshooting issues in both TDM and VoIP networks.

Whereas most troubleshooting strategies focus on gathering diagnostics and digging through reams of data to find a problem, the DAMT framework approaches the problem from a different angle. 

Let’s start by defining your problem as Scenario A. Scenario A includes all the circumstances (phone, IP network, configuration settings, carrier, time of day, and so on) that are in place when your problem reliably occurs.

Now we compare that to Scenario B. Scenario B describes a set of circumstances where the problem does not occur.

Most engineers will dive into diagnostics as soon as they get a brief description of Scenario A, but in doing so they make their job harder in two important ways:

  • They skip the comparison of Scenario A with Scenario B, which tells you which factors matter and which do not.
  • They work from the initial version of Scenario A, which may be very complex and produce a large quantity of diagnostic data, when it is often possible to simplify Scenario A significantly while still reproducing the problem.

So in the DAMT framework we begin the troubleshooting process by taking two initial steps that will end up saving time later.

  1. Simplify the problem (Scenario A) as much as possible.
  2. Eliminate as many differences as you can between Scenario B and Scenario A (without changing the fact that the problem occurs (A) or does not occur (B)).
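One way to picture the comparison is to write each scenario down as a set of named factors and list the ones that differ. The sketch below is only an illustration: the factor names and values are made up, and `scenario_differences` is a hypothetical helper, not part of any tool.

```python
def scenario_differences(scenario_a, scenario_b):
    """Return {factor: (value_in_a, value_in_b)} for every factor that differs."""
    factors = sorted(set(scenario_a) | set(scenario_b))
    return {
        factor: (scenario_a.get(factor), scenario_b.get(factor))
        for factor in factors
        if scenario_a.get(factor) != scenario_b.get(factor)
    }


# Hypothetical scenarios: the problem occurs in A and does not occur in B.
scenario_a = {
    "phone": "analog phone",
    "ip_network": "customer LAN",
    "carrier": "carrier X",
    "time_of_day": "business hours",
}
scenario_b = {
    "phone": "SIP phone",
    "ip_network": "customer LAN",
    "carrier": "carrier X",
    "time_of_day": "evening",
}

differences = scenario_differences(scenario_a, scenario_b)
for factor, (value_a, value_b) in differences.items():
    print(f"{factor}: A={value_a!r}  B={value_b!r}")
```

The shorter this list of differences becomes, the fewer places are left for the cause to hide.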

Let me give you an example. 

Imagine you have reports of poor audio quality on phone calls from a particular analog phone. The phone is connected to a particular IAD that’s sitting behind a firewall on a customer LAN. Calls pass through an enterprise SBC, over the public internet to your carrier SBC, and then out over a SIP trunk to a cell phone. This is your initial definition of Scenario A.

If you were to start collecting diagnostics right away, you would have a wide range of possible data to collect from many different devices. Once you’d gathered it, you’d have a lot to process, and it would take a long time to work out where to look for the problem.

But let’s see what happens if you seek to simplify Scenario A and create a definition for Scenario B that has minimal differences compared to Scenario A.

  • You try switching out the cell phone for an on-switch line: no change.
  • You try bringing the problem phone and IAD into your own lab (eliminating the firewall and both SBCs): no change.
  • You try replacing the analog phone + IAD with a SIP phone: problem disappears. This gives you a ‘similar-but-different’ Scenario B.
  • You try connecting a different model of analog phone to the IAD: problem still occurs.
  • You try using a different model of IAD: problem disappears.

So at this point you have:

  • Scenario A: analog phone + IAD-model-1 to switch to on-switch subscriber.
  • Scenario B: analog phone + IAD-model-2 to switch to on-switch subscriber.

You seem to have done a great job of eliminating differences between the two scenarios, but there’s one more test you can try.

  • You try using a different IAD-model-1 (fresh out of the box): the problem disappears.

So now you’re able to define the difference that causes the problem – it’s one specific IAD. Not one model of IAD, but one specific instance of that model of IAD that has the problem, as compared to a “fresh out of the box” IAD of the same type.
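The elimination steps in this example can be sketched as a simple loop: change one factor at a time from its Scenario A value to its Scenario B value, and keep the change only if the problem still occurs. This is a minimal sketch under some assumptions: both scenarios list the same factors, the factors act independently, and `problem_occurs` is a stand-in for a real test (in practice, a person making a test call). All names and values are hypothetical.

```python
def eliminate_differences(scenario_a, scenario_b, problem_occurs):
    """Move Scenario A toward Scenario B one factor at a time.

    A change is kept only if the problem still reproduces afterwards.
    Returns the simplified Scenario A and the factors that had to keep
    their original value for the problem to occur.
    """
    current = dict(scenario_a)
    significant = []
    for factor in sorted(scenario_a):
        if scenario_a[factor] == scenario_b[factor]:
            continue  # no difference to eliminate
        trial = {**current, factor: scenario_b[factor]}
        if problem_occurs(trial):
            current = trial  # difference eliminated, problem still reproduces
        else:
            significant.append(factor)  # this difference matters
    return current, significant


# Hypothetical encoding of the example above.
scenario_a = {
    "far_end": "cell phone",
    "network_path": "firewall + enterprise SBC + carrier SBC",
    "phone_model": "analog model 1",
    "iad_unit": "suspect IAD-model-1 unit",
}
scenario_b = {
    "far_end": "on-switch line",
    "network_path": "lab",
    "phone_model": "analog model 2",
    "iad_unit": "new IAD-model-1 unit",
}


def problem_occurs(scenario):
    # Simulated test result: in this made-up case only the suspect IAD is at fault.
    return scenario["iad_unit"] == "suspect IAD-model-1 unit"


simplified, significant = eliminate_differences(scenario_a, scenario_b, problem_occurs)
print(significant)
print(simplified)
```

In this simulated run the only factor left in `significant` is the IAD unit, which mirrors the conclusion reached by hand.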

But you can go further. You can compare the config files on the two IADs, check whether the IADs still behave differently with identical configurations, and so on, until you find the one specific configuration setting, firmware version, or damaged piece of hardware that’s causing your problem.
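Comparing the two configurations follows the same pattern. The sketch below assumes the configs can be exported as simple `key = value` text; real IAD config formats vary by vendor, and the setting names shown here are invented purely for illustration.

```python
def parse_settings(text):
    """Parse 'key = value' lines into a dict, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def differing_settings(config_a, config_b):
    """Return {setting: (value_in_a, value_in_b)} for settings that differ."""
    a, b = parse_settings(config_a), parse_settings(config_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Hypothetical exports from the problem IAD and the fresh-out-of-the-box IAD.
problem_iad_config = """
# problem unit
codec = G.711u
echo_cancel = off
jitter_buffer_ms = 20
"""
new_iad_config = """
# new unit
codec = G.711u
echo_cancel = on
jitter_buffer_ms = 20
"""

config_differences = differing_settings(problem_iad_config, new_iad_config)
print(config_differences)
```

Each setting that shows up in the result is a candidate to change on its own and retest, exactly as with the larger scenario differences.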

Did you notice anything unusual about this strategy? At no point did we examine any diagnostics. We were able to solve the entire problem solely by eliminating differences between Scenario A and Scenario B.

Now of course real life can sometimes be more complex than this. Sometimes diagnostics can be helpful to speed up the process, and sometimes diagnostics are necessary to understand why a difference is leading to the problem – but the DAMT framework allows you to make much faster progress towards a solution by focusing on the problem description, rather than wasting time wading through complex technical data.

Not only can this save you time, it also means you can solve problems in areas where your technical expertise is limited. You don’t always need an expert in technology X to troubleshoot a problem. Sometimes you can figure out the solution by following this process of elimination, removing differences until you find the one key difference that’s the root cause of all your woes.

The diagram below shows the whole DAMT process, from Describing the problem (what we’ve focused on) to Analyzing diagnostics, Modifying your network and finally Testing your new configuration. 

[Diagram: the DAMT process: Describe, Analyze, Modify, Test]




About Andrew

Award Consulting is focused on helping ILECs and CLECs who use Metaswitch products to thrive as they improve their networks through migrations, strategic projects and improved service offerings.

Our goal is to create highly specific, highly valuable content targeted specifically at US regional service providers, and especially those who are running Metaswitch equipment. Join our email list to be notified of new content.


Copyright © Award Consulting Services 2023