Whereas most troubleshooting strategies focus on gathering diagnostics and digging through reams of data to find a problem, the DAMT framework approaches the problem from a different angle.
We start with Scenario A, which describes the circumstances in which the problem occurs, and compare it to Scenario B, which describes a set of circumstances in which the problem does not occur.
Most engineers will dive into diagnostics as soon as they get a brief description of Scenario A, but in doing so they complicate matters in two important ways:
- They skip the comparison between Scenario A and Scenario B, which tells you which factors matter and which do not.
- They work from the initial version of Scenario A, which may be very complex and produce a large quantity of diagnostic data, even though it may be possible to simplify Scenario A significantly while still reproducing the problem.
So in the DAMT framework we begin the troubleshooting process by taking two initial steps that will end up saving time later.
- Simplify the problem (Scenario A) as much as possible.
- Eliminate as many differences as you can between Scenario B and Scenario A (without changing the fact that the problem occurs (A) or does not occur (B)).
Let me give you an example.
Imagine you have reports of poor audio quality on phone calls involving a particular analog phone. The phone is connected to a particular IAD that sits behind a firewall on a customer LAN, which connects through an enterprise SBC over the public internet to your carrier SBC, and then out over a SIP trunk to a cell phone. This is your initial definition of Scenario A.
If you were to start collecting diagnostics right away, you would have a wide range of possible data to collect from many different devices. Once you had gathered it, you would have a lot to process, and it would take a long time to work out where to look for the problem.
But let’s see what happens if you seek to simplify Scenario A and create a definition for Scenario B that has minimal differences compared to Scenario A.
- You try switching out the cell phone for an on-switch line: no change.
- You try bringing the problem phone and IAD into your own lab (eliminating the firewall and both SBCs): no change.
- You try replacing the analog phone + IAD with a SIP phone: problem disappears. This gives you a ‘similar-but-different’ Scenario B.
- You try connecting a different model of analog phone to the IAD: problem still occurs.
- You try using a different model of IAD: problem disappears.
So at this point you have:
- Scenario A: analog phone + IAD-model-1 to switch to on-switch subscriber.
- Scenario B: analog phone + IAD-model-2 to switch to on-switch subscriber.
You seem to have done a great job of eliminating differences between the two scenarios, but there’s one more test you can try.
- You try using a different IAD-model-1 (fresh out of the box): the problem disappears.
So now you’re able to define the difference that causes the problem – it’s one specific IAD. Not one model of IAD, but one specific instance of that model of IAD that has the problem, as compared to a “fresh out of the box” IAD of the same type.
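The elimination process in this example can be sketched in code. The following is a minimal illustration only, not part of the DAMT framework itself: the factor names, their values, and the simulated_test function are all hypothetical, and in real life each "test" is a manual step such as placing a call and listening to the audio.

```python
from typing import Callable, Dict, List

Scenario = Dict[str, str]


def eliminate_differences(
    scenario_a: Scenario,
    scenario_b: Scenario,
    problem_occurs: Callable[[Scenario], bool],
) -> List[str]:
    """Move Scenario B toward Scenario A one factor at a time.

    A change to B is kept only if the problem still does NOT occur.
    Returns the factors that could not be aligned, i.e. the differences
    that appear to matter. Assumes both scenarios use the same factor names.
    """
    b = dict(scenario_b)
    remaining = []
    for factor in sorted(scenario_a):
        if b.get(factor) == scenario_a[factor]:
            continue  # already identical, nothing to eliminate
        candidate = dict(b)
        candidate[factor] = scenario_a[factor]
        if problem_occurs(candidate):
            remaining.append(factor)  # changing this factor brings the problem back
        else:
            b = candidate  # difference eliminated, B is now closer to A
    return remaining


# Hypothetical factor names and values, loosely based on the example above.
scenario_a = {
    "far_end": "cell phone via SIP trunk",
    "path": "firewall + enterprise SBC + carrier SBC",
    "phone": "analog phone, model X",
    "iad_unit": "model-1, original unit",
}
scenario_b = {
    "far_end": "on-switch subscriber",
    "path": "lab, direct to switch",
    "phone": "analog phone, model Y",
    "iad_unit": "model-1, fresh out of the box",
}


def simulated_test(scenario: Scenario) -> bool:
    # Stand-in for a real test call; here the fault is pinned on one unit.
    return scenario["iad_unit"] == "model-1, original unit"


key_differences = eliminate_differences(scenario_a, scenario_b, simulated_test)
print(key_differences)
```

With the simulated test above, the only difference left standing is iad_unit, which mirrors the conclusion of the walkthrough: every other factor could be aligned between the two scenarios without changing the outcome.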
But you can go further. You can compare the config files on the two IADs, check whether the IADs still behave differently with identical configurations, and so on, until you find the one specific configuration setting, firmware version, or damaged piece of hardware that’s causing your problem.
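As a rough illustration of the config comparison step, the sketch below lists the settings that differ between two devices. It assumes a simple key = value text format, which is an assumption made for the example; real IAD configuration formats vary by vendor, and the setting names and values shown are invented.

```python
from typing import Dict, Optional, Tuple


def parse_config(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def config_differences(
    a: Dict[str, str], b: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {setting: (value_in_a, value_in_b)} for settings that differ."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Invented example configs; not taken from any real device.
problem_iad = parse_config("""
# IAD showing the audio problem
codec = G.711u
echo_cancel = off
firmware = 1.2
""")
fresh_iad = parse_config("""
# IAD fresh out of the box
codec = G.711u
echo_cancel = on
firmware = 1.4
""")

diffs = config_differences(problem_iad, fresh_iad)
for setting, (bad, good) in diffs.items():
    print(f"{setting}: problem IAD = {bad}, fresh IAD = {good}")
```

Each setting in the output is a remaining difference between Scenario A and Scenario B. You would then align them one at a time and retest, exactly as with the larger factors earlier in the example.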
Did you notice anything unusual about this strategy? At no point did we examine any diagnostics. We were able to solve the entire problem solely by eliminating differences between Scenario A and Scenario B.
Now of course real life can sometimes be more complex than this. Sometimes diagnostics can be helpful to speed up the process, and sometimes diagnostics are necessary to understand why a difference is leading to the problem – but the DAMT framework allows you to make much faster progress towards a solution by focusing on the problem description, rather than wasting time wading through complex technical data.
Not only can this save you time, it also means you can solve problems in areas where your technical expertise is limited. You don’t always need an expert in technology X to troubleshoot a problem. Sometimes you can figure out the solution by following this process of elimination: eliminating differences until you find the one key difference that’s the root cause of all your woes.
The diagram below shows the whole DAMT process, from Describing the problem (what we’ve focused on) to Analyzing diagnostics, Modifying your network and finally Testing your new configuration.