By: Adam J. Taylor
Adam J. Taylor is PTI’s Education and Training Manager. He writes all of the training programs for PTI’s Training Portal. He has worked in various roles at PTI for over seven years.
The philosophy governing PTI Training is to give the trainee complete hardware and software knowledge. Installers and technicians must understand correct product functionality to quickly find and fix issues with your access control system. This philosophy does not mean troubleshooting skills are unnecessary; quite the opposite. Below is an updated version of a document on troubleshooting fundamentals published twenty years ago. Despite the original document’s age, the troubleshooting methods highlighted below still hold.
1. Get to the Root Cause: Ask Why Five Times
The key to genuinely solving a problem is first understanding it. Too often, people jump to solve a problem before finding its cause. Seeing a symptom and thinking that you know its origin is easy, but if you take the time to explore more deeply, you will often find that the apparent cause is just another symptom and that the problem lies much deeper. The goal is not to correct the effects of the problem but to find the root of why the problem is occurring so that you can ensure that it will not happen in the future.
A straightforward way to troubleshoot is to ask ‘why’ five times. The idea is that by the fifth time asking ‘why,’ you will be at the root cause. It isn’t always that simple, but the exercise can be surprisingly insightful in helping you figure out what is going on and avoid “quick fix” or “band-aid” solutions that don’t resolve anything. The technique is most beneficial for chronic problems that repeatedly occur in a system; it is less useful for problems that are unlikely to recur.
2. Be Observant and Look for Evidence
Sometimes subtle signs can provide information that leads to discovering a problem’s source. In particular, you want to carefully investigate anything that seems “unusual,” “wrong,” or “surprising.” These are often the clues that will get you on the path to figuring out what is causing the trouble. Being observant helps you pick up on these details; if you are in a hurry, you may be too quick to discount a finding as “unrelated to the problem,” which can skew your conclusions.
3. Don’t Forget to Write Things Down
Being observant includes documenting your findings whenever you discern a piece of evidence or figure out something about the current situation. Keeping a history of what you discover will help you anticipate problems and future difficulties. A system logbook is a good idea.
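A logbook does not need to be elaborate. The following is a minimal sketch of one way to keep timestamped notes per device; the `Logbook` and `LogEntry` names and their fields are hypothetical and chosen for illustration only, not part of any PTI product.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class LogEntry:
    when: datetime
    device: str
    observation: str
    action_taken: str = ""


@dataclass
class Logbook:
    """Hypothetical in-memory logbook; a paper notebook serves the same purpose."""
    entries: List[LogEntry] = field(default_factory=list)

    def note(self, device: str, observation: str, action_taken: str = "",
             when: Optional[datetime] = None) -> LogEntry:
        # Record what was seen and what (if anything) was done about it.
        entry = LogEntry(when or datetime.now(), device, observation, action_taken)
        self.entries.append(entry)
        return entry

    def history(self, device: str) -> List[LogEntry]:
        # Everything previously recorded about one device, oldest first.
        return [e for e in self.entries if e.device == device]


book = Logbook()
book.note("entry keypad", "manager code not accepted", "re-seated connector")
book.note("gate", "slow to open in the morning")
book.note("entry keypad", "working after connector re-seated")
for e in book.history("entry keypad"):
    print(e.when.isoformat(timespec="minutes"), e.observation, "|", e.action_taken)
```

Recording the action alongside the observation is the useful part: it lets you see later which change preceded which result.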
4. Use the Process of Elimination
Virtually all problems with an access control system involve more than one component or subsystem. The tricky part of the process is figuring out which part is responsible. The elimination process can narrow the problem by making small logical changes and observing the impact. The objective is to isolate the cause so you can correct it. The key is to make only one change at a time and see if the problem goes away; if it does, whatever you changed likely resolved the issue. Making more than one change at a time prevents you from discerning which change was responsible for fixing the problem. Start by checking the most probable sources of the problem and the things that are easiest to change.
Example — Troubleshooting Keypads
Suppose you are having a problem with unrecognized keypads. In that case, it’s easier and cheaper to explore things like double-checking connections or stripping wires than to try replacing the keypad itself. Replacing the keypad is something you’d only do after you had eliminated all the other possibilities.
As another example, suppose a keypad will not accept codes. You enter the manager code, and nothing happens. There could be many possible reasons for this problem:
- The power to the Controller could be out
- There could be a malfunction in the keypad
- The Controller could have incurred damage
To figure out what is going on, eliminate variables by making small changes and seeing what happens.
PTI recommends swapping the keypad with a known working device. If the replacement works, you have isolated the cause to the original keypad. If the problem persists, try switching keypad locations: if the ‘entry’ keypad does not work but the ‘exit’ keypad does, swap the two and retest the system. If the problem still occurs, try resetting the entire system.
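The one-change-at-a-time loop can be sketched in a few lines of Python. This is a toy model under stated assumptions: the site is represented as a dictionary of component names mapped to healthy or faulty, and `keypad_accepts_codes` stands in for physically entering the manager code. All names here are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical model of a site: True means the component is healthy.
System = Dict[str, bool]


def keypad_accepts_codes(system: System) -> bool:
    """Stand-in for the real test: enter the manager code and watch the result."""
    return all(system.values())


def isolate_fault(system: System, checks: List[str]) -> Optional[str]:
    """Swap in one known-good component at a time, in the order given.

    Returns the component whose replacement made the test pass,
    or None if no single swap fixed the problem.
    """
    for component in checks:
        original = system[component]
        system[component] = True          # make exactly one change
        if keypad_accepts_codes(system):
            return component              # this change resolved the symptom
        system[component] = original      # revert before trying the next one
    return None


site = {"wiring": True, "keypad": False, "controller_power": True, "controller": True}
# Easiest and cheapest checks first, as recommended above.
order = ["wiring", "controller_power", "keypad", "controller"]
print(isolate_fault(site, order))
```

Reverting each unsuccessful change before moving on is what keeps the result unambiguous. Note that the sketch returns `None` when two components are faulty at once, which mirrors the real difficulty of diagnosing multiple simultaneous faults.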
5. Make Use of “Known Good” Components
One valuable resource while troubleshooting is a “stockpile” of extra components. I put “stockpile” in quotes because it doesn’t have to be an actual stock of components, just a resource that will let you borrow “known good” components for the system. In practice, a functional keypad is typically adequate, but extra components can be helpful when employing the “process of elimination” to resolve problems.
Let’s put the “Known Good Components” theory to work. Suppose a site had a non-functioning keypad camera. An easy thing to try is to swap the camera with another one that is working correctly. If the second camera also doesn’t work, and your original camera works on the other device, you can feel quite confident that it isn’t the camera that is the problem. You can sometimes avoid problems when assembling a new system by testing components before beginning installation. For example, take a new keypad, loop detector, and gate and attach them to a “Known Good” Controller; if problems arise while building the new system, you know the cause is likely not the Controller but one of the connected peripherals.
6. Do Upgrades or Assembly One Step at a Time
Changes made to the system are the most frequent cause of problems; this is the nature of change. You can avoid or detect problems with upgrades or new installations by going “one step at a time.” New system installations or major upgrades often have “difficult to diagnose” problems because so many modifications come online simultaneously.
When building a system, you will assemble many components. Be methodical! For example, when constructing a new system from scratch, it is best to confirm that the primary system functions before adding anything else. Add keypads, relay boards, door alarms, and other devices separately. Similarly, do not try to do significant software upgrades at the same time that you make hardware changes. Doing this can make it very difficult to troubleshoot system problems. If you do make multiple changes at once, try retracing your steps. Undo the changes you have made one at a time and see if you can identify the change that caused the problem.
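Retracing your steps can be illustrated with a short Python sketch. It assumes you have a list of the changes in the order they were made and some way to retest the system after each undo; the `works_with` callback is a hypothetical stand-in for that retest.

```python
from typing import Callable, List, Optional


def find_breaking_change(
    applied: List[str],
    works_with: Callable[[List[str]], bool],
) -> Optional[str]:
    """Undo changes one at a time, newest first, retesting after each undo.

    `applied` lists the changes in the order they were made.
    `works_with` stands in for retesting the site with the given changes in place.
    Returns the change whose removal made the system work again, or None.
    """
    remaining = list(applied)
    while remaining:
        undone = remaining.pop()      # undo only the most recent change
        if works_with(remaining):
            return undone
    return None


changes = ["add keypad 2", "add relay board", "software upgrade", "add door alarms"]
# Assumed for illustration: the system fails whenever the software upgrade is present.
print(find_breaking_change(changes, lambda active: "software upgrade" not in active))
```

Undoing in reverse order keeps every intermediate state one you have already seen, which is easier to reason about than removing changes at random.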
7. Determine the Repeatability of the Problem
Most problems fall into one of two categories: repeatable or intermittent. A repeatable problem is one that occurs all the time or always in response to a specific user action. For example, a gate that has a problem opening will probably always fail to open, no matter how many times you reset it.
In contrast, some problems are intermittent. You may have a gate that will usually open fine, but one day a month, it will fail to operate. A keypad may work most of the time but occasionally stop accepting codes. Determining whether the problem is repeatable is helpful because it tells you how to approach it. If a problem is repeatable, and there is a specific action that causes it, you have at least some initial clues about where to find the cause. Intermittent issues are much more challenging to resolve; try to duplicate the conditions that caused the problem and see if it happens again.
8. Deal with Intermittent Problems
Intermittent problems appear to happen randomly. They seem not to be caused by anything obvious and are not repeatable. They are complicated and frustrating to diagnose. Sometimes problems that seem intermittent aren’t; it’s just that the specific set of circumstances that causes the problem is hard to notice. Spend time determining the circumstances under which the problem arises. For example, many lockup problems will occur only when the system has been used repeatedly during heavy traffic; some may occur only within the first few minutes after the facility opens. You may find that a particular behavior is associated with a specific issue.
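If you have been recording when failures occur, even a simple tally can reveal a pattern. The sketch below groups incident times by hour of day; the timestamps are made up for illustration, and the `failures_by_hour` helper is hypothetical.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List, Tuple


def failures_by_hour(timestamps: Iterable[datetime]) -> List[Tuple[int, int]]:
    """Count logged failures per hour of day, most frequent hour first."""
    return Counter(t.hour for t in timestamps).most_common()


# Made-up incident times for illustration only.
incidents = [
    datetime(2024, 3, 4, 6, 2),
    datetime(2024, 3, 9, 6, 4),
    datetime(2024, 3, 15, 17, 30),
    datetime(2024, 3, 21, 6, 1),
]
print(failures_by_hour(incidents))
```

A cluster in one hour does not prove a cause, but it tells you which conditions to try to reproduce first. The same tally could be keyed on traffic level, weather, or which keypad was used.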
9. Be Patient
Being patient when dealing with these sorts of issues is vital. Since the problem is not something you can duplicate at will, you may not be able to work your way toward the ultimate cause systematically. This situation involves trial and error and then waiting to see if the problem recurs. It can sometimes take days (or longer) since you have to wait before seeing if the problem happens again. Be patient.
10. Correlation May Not Imply Causation
The word correlation refers to two behaviors or symptoms that appear at the same time. Causation refers to two events where one is responsible for the other’s appearance or existence. So if you see two strange things happening simultaneously on your system, this does not necessarily mean that one of them has caused the other. They could be coincidental, or it could be that where you think A is causing B, in reality, B is causing A.
For example, you may find these issues in your system: frequent file system corruption and tenants receiving an “Incorrect Code” message. You may think that the “Incorrect Codes” are causing the file system errors, which could be true in many cases. However, file system errors can cause incorrect codes. And it is also possible that both are symptoms of another underlying cause.
Summarizing the Process
Successfully fixing any issue in your access control system requires troubleshooting. There are many different methods for troubleshooting, be it asking “Why” repeatedly, digging through the hardware and software for evidence, or eliminating causes one by one. No matter the method used, remember to be patient, note your findings, and never assume that correlation implies causation, even when dealing with intermittent issues. With the proper product knowledge and sound troubleshooting fundamentals, “Business Affecting” issues will take far less time to resolve, so the access control system functions optimally.