Where Legacy Systems Are Still Used and IT/OT Convergence
-
I work as a process control specialist in mining. Part of what I do is OT, or operational technology, and I am taking the A+ certification because of the growing trend of IT/OT convergence. One thing I will say, for anyone looking to support industrial systems, is that some of them can be quite antiquated. For example, some industrial systems still run on DOS or Windows XP, and some programs are still written in VB6. This generally isn't considered an issue because the old philosophy was complete segregation of networks, so these networks never see the internet and, in principle, nobody can come in remotely. The reason these industries don't upgrade is that it is very expensive. Why? Well, if you have an operation that makes $1M a day and you want to upgrade the control network and components, you may be down for a week. Then everything has to be tested and essentially recommissioned, which can take a very long time. Companies won't see the value in that if the old system they have still works. Upgrades rarely go off flawlessly, and a lot of thought has to go into one on a live process system.
However, modern systems run on current operating systems, and there is a great deal to consider when creating paths between a plant LAN and a business LAN. In IT, one of the main priorities is security, and a loss of availability can be tolerated in order to keep a system secure. In OT, on the other hand, the most important point is availability. Yes, security is a top priority, but imagine you have a massive industrial boiler and you lock the operator out of it because of a security concern, so they lose the ability to monitor it or shut it down safely. Security measures implemented on the OT system must not interfere with the essential and emergency controls of an industrial automation and control system (IACS). So a lot of care has to be taken when approaching any connection between the two networks.
Another consideration is that Windows updates on the IACS servers cannot be applied whenever you like. The control system vendors thoroughly test the Windows patches and create modified versions compatible with their control system. These updates must be done in a very planned and deliberate manner. Consider what happens if an update goes south and corrupts the control system database, and I've seen it happen: what was supposed to be a short update turns into a week-long scramble to get the system back to working condition. Seven days at $1 million a day is a lot for a company to swallow.
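To put rough numbers on that, here is a minimal sketch of the downtime arithmetic. It assumes revenue is lost linearly for every day the plant is down, which is a simplification, and the function name `downtime_cost` is just something made up for illustration. The $1M a day and seven days are the figures from the post.

```python
def downtime_cost(revenue_per_day: float, days_down: float) -> float:
    """Lost revenue for an outage.

    Assumption: revenue is lost at a constant rate while the plant is down.
    Real losses (restart, rework, downstream upsets) are not modelled here.
    """
    if revenue_per_day < 0 or days_down < 0:
        raise ValueError("revenue_per_day and days_down must be non-negative")
    return revenue_per_day * days_down


REVENUE_PER_DAY = 1_000_000  # $1M a day, the figure used in the post
DAYS_DOWN = 7                # the week-long scramble after a bad update

loss = downtime_cost(REVENUE_PER_DAY, DAYS_DOWN)
print(f"Estimated lost revenue: ${loss:,.0f}")  # Estimated lost revenue: $7,000,000
```

Even this crude estimate makes it clear why a patch window gets planned like a shutdown rather than run whenever it is convenient.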
Anyone going into IT for an industry where control systems are present, and where they will be assisting the control network team, should read more about IT/OT convergence. Understanding that the thought processes of an IT specialist and an OT specialist are different will help you in communicating with and supporting the OT team. In my case I am on my own in my department and rely heavily on IT for support. In other cases, IT has nothing to do with the OT system. Generally speaking, best practice is that the two are kept separate, but with modern technology and IT/OT convergence this may no longer be the case, and we must rely on each other and work together toward the common goal.
-
@Michael-Sovereign thank you very much for a perspective on the industry and the great information!
-
Lol, to add to this. I am on the troubleshooting methodology episode right now. Now as a maintenance electrician and an automation tech I have to do a tonne of troubleshooting and thought I would share a recent experience.
We had a controller in the field that was losing comms with associated I/O. I established a theory based on logs and it came down to three components. The first was a patch panel, which is technically not supported by our vendor for use. The second is the controller rack, where the controller is mounted and connects to the system. The third was a terminating resistor. All three of these could have been the issue.
However, in the next step, testing the theory, we had to take a different approach than the recommended one. This field controller is a critical component for the process plant: when it loses comms, we lose our mill. The loss of the mill costs big money for each minute it's down, and further consideration needs to be given to the downstream process, which gets thrown out of equilibrium. So even if the mill goes down for 10 minutes, the process as a whole may take several hours to return to full health.
Given these considerations, as nice as it would be to test each theory individually, that can be a luxury in a live plant environment. We ended up changing out two components and moving the Ethernet cable from the patch panel directly to the copper port integrated into the new controller rack. Before, it was patched through the panel and then went to the old copper port on the old rack.
This solved the issue, but we have no idea which of the three it was. This is just an example of how we have to adapt our methods to the reality of our workplace. I hate leaving it like that, and if I had time I would test each theory individually, but the cost of a few components is minuscule compared to the cost of downtime. (The components aren't cheap, but 5 minutes of downtime is more than 10x the cost of the components.)
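For anyone who wants to see that trade-off as numbers, here is a rough sketch. Costs are in normalised units where the replaced components cost 1.0. The only figure taken from the post is that 5 minutes of downtime is worth at least 10x the components. The outage lengths, the number of separate tests, and the parts cost for the one-at-a-time option are hypothetical values picked purely for illustration, and the several hours of downstream recovery are ignored.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    parts_cost: float        # in units of "cost of the replaced components"
    downtime_minutes: float  # total mill downtime the option needs


def total_cost(option: Option, downtime_cost_per_minute: float) -> float:
    """Parts plus downtime, ignoring downstream recovery time."""
    return option.parts_cost + option.downtime_minutes * downtime_cost_per_minute


PARTS_COST = 1.0
# From the post: 5 minutes of downtime costs at least 10x the components,
# so this per-minute rate is a lower bound.
DOWNTIME_COST_PER_MINUTE = 10 * PARTS_COST / 5

# Hypothetical: one 5-minute outage to swap everything at once.
swap_all = Option("swap all suspects in one outage", PARTS_COST, 5)
# Hypothetical worst case: three separate 5-minute outages to test each
# theory, ending up replacing only part of the hardware.
isolate = Option("test each theory separately", 0.5 * PARTS_COST, 3 * 5)

for option in (swap_all, isolate):
    cost = total_cost(option, DOWNTIME_COST_PER_MINUTE)
    print(f"{option.name}: {cost:.1f} units")
```

With these made-up inputs the single outage comes out far cheaper, which matches the call we made. Different outage lengths would change the numbers, but downtime dominates as long as each test needs the mill stopped.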