Category: Process Control

Transferring data from an old Aveva Historian to a new Historian
The method Aveva suggests for migrating from an old Historian server to a new one is to transfer the old memory block files, export the configuration from the old historian, and import it into the new historian. From that point onwards, the new historian will be the same as the old historian.

That’s a good way of doing things if you’re on a fairly clean system, but there are potential issues.

Aveva has changed the data format over the years, so you could have history in a number of formats — people who have dealt with one of the conversions know that you can end up having to use two extraction subsystems, in addition to potentially multiple versions over time for other older data.

Another issue with a highly dynamic system with lots of programming changes is that every change can cause a new “tag” to be created. Over time, the data is consistent, but you have hundreds of thousands or even millions of tags, and the historian needs to track the name changes over time. This can slow the historian when you’re trying to get longer trends and also make maintenance more difficult.

So some people might want to export the data out of an old historian en masse and import it into a new, clean historian. This had been my plan for several years, but without an easy path to do so, I held off until we were also upgrading the hardware. From there, I could pull the data out of the old historian using the sqlcmd command, then write each tag into the new Historian’s database.

Looking at the original historian, it contained 24 billion values collected over 19 years, so transferring everything could take a huge amount of time. Instead of pulling all the values for one tag at once, I limited each pull to a set number of days and went through every tag for that time span, starting with the most recent data and working backwards. That way, over the weeks or months required to fill the new historian, the recent history is available first and more and more distant history becomes available over time, which helps if you need to get things spun up quickly.
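
The chunking idea can be sketched in a few lines of Python. This is only an illustration of walking backwards through time in fixed windows; the function name `backfill_windows` and the dates are hypothetical and chosen just for the example.

```python
from datetime import datetime, timedelta


def backfill_windows(newest, earliest, chunk_days):
    """Yield (start, end) windows, newest first, until `earliest` is reached."""
    step = timedelta(days=chunk_days)
    end = newest
    while end > earliest:
        # Clamp the last window so it never reaches past the earliest date.
        start = max(end - step, earliest)
        yield start, end
        end = start


# Hypothetical dates: three windows, the last one shortened by the clamp.
windows = list(backfill_windows(datetime(2025, 4, 1), datetime(2025, 2, 20), 15))
for start, end in windows:
    print(start.date(), "->", end.date())
```

Every tag would be exported once per window, so the newest history lands in the new historian first.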

One question I had, which was answered in practice, is whether the data gets dumped into today’s history file or gets filled into the appropriate historical files. It turns out the historian will go back in time and create the appropriate files. One of my first successful imports was a single value from 2013 on a certain tag, and it created a new directory and tag files for that day in 2013. This is exactly what I wanted to see: brand new tags in a brand new data file.

The file format shown in the file is:
```
ASCII
|
DATA_IMPORT|LOCAL|SERVER LOCAL|FASTLOAD|DEFAULT
[Tagname]|ORIGINAL|[data point date]|[data point time]|EU|[data point scaled value]|[data point quality]
```
The standard good quality data point is 192.
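
To make the data line layout concrete, here is a minimal Python sketch that builds one line. It assumes the numeric codes `0` for ORIGINAL and `0` for EU, which is how the PowerShell script later in this post writes them; the helper name `fastload_line` and the tag `Tank1.Level` are hypothetical.

```python
from datetime import datetime


def fastload_line(tag, when, value, quality=192):
    """Build one pipe-delimited data line: 7 fields, 6 pipes."""
    millis = when.microsecond // 1000
    fields = [
        tag,
        "0",                                   # ORIGINAL (code assumed from the script in this post)
        when.strftime("%Y/%m/%d"),             # data point date
        when.strftime("%H:%M:%S") + f".{millis:03d}",  # data point time with milliseconds
        "0",                                   # EU (code assumed from the script in this post)
        str(value),                            # scaled value
        str(quality),                          # 192 = good quality
    ]
    return "|".join(fields)


# Hypothetical tag name and value, for illustration only.
print(fastload_line("Tank1.Level", datetime(2013, 5, 4, 13, 30, 15, 250000), 42.5))
```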

So it should be easy, right? No problem!

Problem!

There’s a part of this that you can’t see. You can’t see it in most text editors.

See, there are two ways to do a new line.

In Windows, traditionally it’s been two characters: Carriage Return followed by Line Feed. If you think about a command line like a typewriter, then it makes a lot of sense: you need to return the carriage to the beginning of the line, and you have to feed the paper to get to the next line, before you can continue typing.

In Unix, they don’t do a carriage return, only a line feed. This makes some sense as if you’re doing a line feed you’re pretty much always starting from the next line regardless.

More recently, Windows has aligned more with Unix, which is really nice in many ways, but it does mean that its tools mostly operate as if we’re on Unix.

Wonderware is a product that has existed for a very long time. It was first developed in 1987. The product line has always been highly entangled with Windows, all the way back to Windows 3.1.

So, have you figured out the problem yet?

Wonderware’s tools are looking for a carriage return and a line feed, and if they only see a line feed, they consider the line to have not ended.

Error in processing the file: [Your filename]
Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index

Imagine spending hour after hour trying to get a simple CSV file to run, only to find out that the enter key was doing the wrong thing.

The way to see which form of line ending your file is using is to download Notepad++ and activate the “Show All Characters” feature. The button looks like a backwards P with a filled-in loop and two vertical lines (a pilcrow, ¶). It will show you the CR characters and the LF characters.
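
If you would rather check from a script than from an editor, the same inspection can be sketched in Python. This is a minimal example working on raw bytes; the function names `count_line_endings` and `to_crlf` are made up for illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare LFs and bare CRs in raw file bytes."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "LF": data.count(b"\n") - crlf,   # line feeds not preceded by CR
        "CR": data.count(b"\r") - crlf,   # carriage returns not followed by LF
    }


def to_crlf(data: bytes) -> bytes:
    """Normalise every line ending to CRLF without doubling existing pairs."""
    unix = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unix.replace(b"\n", b"\r\n")


# A small sample with mixed endings: two bare LFs and one CRLF.
sample = b"ASCII\n|\r\nDATA\n"
print(count_line_endings(sample))
fixed = to_crlf(sample)
print(count_line_endings(fixed))
```

For a real file, read it with `open(path, "rb").read()` so Python does not translate the line endings before you get to count them.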

I ended up choosing to limit the length of time to 15 days, because the number of lines is quite large: at one line per second, you end up with 3,600 lines per hour, or 86,400 lines per tag per day.
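
A quick way to sanity-check a chunk size is to work out the worst case, where a tag stores a value every second. The sketch below is plain arithmetic in Python; the function name `worst_case_lines` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 3,600 per hour * 24 hours


def worst_case_lines(chunk_days, values_per_second=1):
    """Upper bound on data lines for one tag in one chunk."""
    return chunk_days * SECONDS_PER_DAY * values_per_second


lines = worst_case_lines(15)
print(lines)
```

Most tags change far less often than once per second, so real files are usually much smaller than this bound.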

I used the Wonderware Query tool to pull the tag list out of the new historian (to prevent pulling data for legacy tags that aren’t used any longer).

One thing I did to reduce the overall time required was to split the project into two separate jobs: I divided the tags into discrete and analog tag lists, then ran a copy of the script against each list. The character of the data is quite different between the two. Other than a few heartbeat tags, most discrete tags don’t change very often, and they can only go from fully off to fully on, whereas analog tags can continuously change by a very small amount. This means that despite historizing approximately 3 times more discrete tags, the discrete tags pull and import much more quickly.

Here is the final program (courtesy of ChatGPT). You run it on the Historian you’re loading up. It uses the sqlcmd program installed by SQL Server (or SQL Server Express) to pull the data and place it into the FastLoad folder. If you have many, many files to transfer, I’d recommend using a hard disk drive or, better yet, a ramdisk, because it creates multi-gigabyte files over and over again. I used the program called OSFMount from PassMark Software to create a 10GB ramdrive, which ended up being oversized for my purposes. You could do it on an SSD, but for a long-lived historian with many values you’re going to be writing and rewriting values so much you’ll probably massively reduce the lifespan of your SSD.

https://www.osforensics.com/tools/mount-disk-images.html
```
# ===========================================================
# PowerShell: Backfill Historian via FastLoad in 15‑day chunks
# ===========================================================

# --- Configurable section ---
$server           = "old historian hostname"
$database         = "Runtime"
$tagListPath      = "C:\HistorianExport\Extract\tags.txt"
$exportRoot       = "C:\HistorianExport\Output"
$chunkDays        = 15
$fastLoadFolder   = "C:\Historian\Data\DataImport\FastLoad"

# FastLoad header parameters
$fastLoadUser         = "Data_Import"
$timeMode             = 1                   # 1 = local time, 0 = UTC
$timeZoneName         = "Server Local"
$missingBlockBehavior = 10                  # 10 = original by tag name
$timeSpanReplacement  = 0                   # 0 = backfill to now

# Earliest date to backfill
$earliestDate = Get-Date "2006-01-01"

# Start from today and work backwards
$endDate = Get-Date

# --- Setup: ensure export & FastLoad folders exist ---
foreach ($dir in @($exportRoot, $fastLoadFolder)) {
    if (-not (Test-Path $dir)) {
        New-Item -Path $dir -ItemType Directory | Out-Null
    }
}

# Load tags
$tags = Get-Content $tagListPath

while ($endDate -gt $earliestDate) {
    # Determine this chunk's range
    $startDate = $endDate.AddDays(-$chunkDays)
    if ($startDate -lt $earliestDate) { $startDate = $earliestDate }

    $startString = $startDate.ToString("yyyy-MM-dd")
    $endString   = $endDate.ToString("yyyy-MM-dd")
    $fileStart   = $startDate.ToString("yyyyMMdd")
    $fileEnd     = $endDate.ToString("yyyyMMdd")

    Write-Host "`n=== Processing chunk: $startString to $endString ==="

    foreach ($tag in $tags) {
        # Prepare per‑tag subfolder
        $safeTag  = $tag -replace '[\\/:*?"<>|]', '_'
        $tagFolder = Join-Path $exportRoot $safeTag
        if (-not (Test-Path $tagFolder)) {
            New-Item -Path $tagFolder -ItemType Directory | Out-Null
        }

        # Paths for this tag+chunk
        $outputFile   = Join-Path $tagFolder "$safeTag`_${fileStart}_${fileEnd}.csv"
        $tempDataFile = "$outputFile.tmp"

        # SQL query: fetch data for this tag in the chunk
        $query = @"
SET NOCOUNT ON;
SELECT
    '$tag' AS TagName,
    CONVERT(varchar(23), DateTime, 121) AS DateTime,
    Value,
    QualityDetail
FROM v_History
WHERE TagName = '$tag'
  AND DateTime >= '$startString'
  AND DateTime <  '$endString'
  AND Value IS NOT NULL
ORDER BY DateTime;
"@

        Write-Host "Exporting '$tag'..."

        sqlcmd -S $server -d $database -E -Q $query -s"," -W -h -1 -o $tempDataFile

        if (Test-Path $tempDataFile) {
            $lines = Get-Content $tempDataFile | Where-Object { $_.Trim() -ne "" }
            if ($lines.Count -gt 0) {
                # Open StreamWriter: ASCII (no BOM) + CRLF
                $stream = [System.IO.StreamWriter]::new(
                    $outputFile,
                    $false,
                    [System.Text.Encoding]::ASCII
                )
                $stream.NewLine = "`r`n"

                # FastLoad header (3 lines)
                $stream.WriteLine("ASCII")
                $stream.WriteLine("|")
                $stream.WriteLine(
                    "$fastLoadUser|$timeMode|$timeZoneName|$missingBlockBehavior|$timeSpanReplacement"
                )

                # Write each data line (7 fields, 6 pipes)
                foreach ($line in $lines) {
                    $parts = $line -split ","
                    if ($parts.Count -eq 4) {
                        try {
                            $dt             = [datetime]::Parse($parts[1].Trim())
                            $datePart       = $dt.ToString("yyyy/MM/dd")
                            $timePart       = $dt.ToString("HH:mm:ss.fff")
                            $valueFormatted = "{0:0.##############################}" -f [double]$parts[2].Trim()
                            $quality        = $parts[3].Trim()

                            $valueLine = (
                                "$($parts[0].Trim())" + "|0|" +
                                "$datePart"           + "|"  +
                                "$timePart"           + "|0|" +
                                "$valueFormatted"     + "|"  +
                                "$quality"
                            )
                            $stream.WriteLine($valueLine)
                        }
                        catch {
                            Write-Host "⚠️ Bad datetime in line: $line"
                        }
                    }
                    else {
                        Write-Host "⚠️ Malformed line: $line"
                    }
                }

                $stream.Close()

                # Atomically move to FastLoad folder
                $dest = Join-Path $fastLoadFolder ([IO.Path]::GetFileName($outputFile))
                Move-Item -Path $outputFile -Destination $dest -Force
                Write-Host "✅ Moved $([IO.Path]::GetFileName($outputFile))"
            }
            else {
                Write-Host "🚫 No data for '$tag' in this chunk."
            }

            Remove-Item $tempDataFile -ErrorAction SilentlyContinue
        }
    }

    # Step back one chunk
    $endDate = $startDate
}

Write-Host "`nAll chunks complete!"
```
The end result of all this is a clean Historian with all your data (except alarms: I’m not transferring them, I don’t really know how, and I didn’t really want to), and it shows in certain easy metrics. For example, the historian configuration export for me was previously 77MB and took several minutes to export, but the new historian with the same tags took 7MB and about 20 seconds to export. That level of cleaning is meaningful for system performance, and should also be meaningful for reducing hiccups due to back-end complexity.

18 April 2025
Manage Risks, avoid the cape

I’m Jason Firth.

On Sunday evening, this very website you’re reading this post from went down. In fact, it was a critical emergency because not only did the server’s main drive fail, but there were no backups, we had no parts on hand, and no plan to recover even if there was a backup.

Have no fear, Nerdyman is here! To put on the cape and save the day!

First thing in the morning, I went out and grabbed a replacement drive from the store, and I was able to limp the drive along long enough to make a full backup, and finally ended up figuring out how to put the full backup onto a new drive and make it boot up! Hoorah! The day is saved!

There’s a big problem.

Superheroes aren’t real. They exist in fantasy. The fact that I could run out and get parts that just happened to exist in a store somewhere, the fact that I just happened to figure out how to get my data back, the fact that I just happened to figure out how to get the system back up with the backup data and do it in time that I could get it done was all pure luck.

Let’s say this wasn’t a personal webserver with pictures of nerdy superheroes on it, but a business critical server. It was 24 hours from failure to recovery. For a personal webserver with pictures of nerdy superheroes on it, that’s a pretty decent turnaround time, but for a business critical application that’s totally unacceptable, especially given that successfully repairing the server was not a given.

In a business environment, you need to manage your risks. This is going to mean a few things: First, taking an inventory of every device you’re responsible for. Second, asking “What happens if this fails?”. Third, asking “How do I repair this when it does fail?”. Fourth, asking “Do I have the hardware, backups, and recovery procedures in place so we can get this back up and running reliably and quickly?”. Finally, coming up with a plan to make sure you have the hardware, backups, and recovery procedures in place for everything you discovered in your audit.

When you manage your risks like that, there is no such thing as having to put on a cape, because you don’t need to make a miracle happen — you just need to follow your plan.

So what should I do for the future? To start with, I should have a spare drive on hand (and perhaps even a full spare computer if this is really critical). Next, I should have 3 copies of my data: The data itself, the online backup, and the offline (off-site if you’re really paranoid) backup. Third, I should have procedures in place to restore the backups in the event I need to (it’s occurred in the past that backups were made for a machine but the backups could only be used on that exact hardware, so massive amounts of time and effort were used to put the cape on and make the backups operate on hardware it should never have operated on). Finally, I should be testing my procedures occasionally to ensure they are relevant and reviewing my site-wide audit regularly.

One of the interesting things about proper risk management is that its effects go beyond what directly comes out of the document: the process of asking these questions will itself change the way you do things. Once you’ve gone through everything you have, inconsistencies become clear, and the consequences of cutting corners become more obvious.

Risk management is an extremely boring task. It means sitting down perhaps for weeks or months and spending a lot of time asking the same questions over and over again, but it pays off the first time you’re ready.

So I’ve been talking about IT infrastructure, but consider this: What I’ve said can correspond to all the equipment you’re responsible for. If you’re responsible for an instrument, shouldn’t you have a spare part, backup configuration, the tools you’ll need, and a procedure for repairing it if need be?

Doing this risk management totally changes the character of daily maintenance. Doing the work up-front means that when something happens, you put the cape away and just get about doing the work you need to do (and if you’re a supervisor or a manager, it means your workers put the cape away and get about doing the work they need to do). It costs less in the long run because you aren’t rushing parts in. It lowers mean time to repair because all the legwork is already done. It means less downtime because you don’t need to rush in parts, figure out how to do things on the fly that have never been done before, or figure out how to deal with your lost data in the middle of a crisis.

Thanks for reading!

17 January 2024
Work From Home does NOT mean Work 3 jobs.

During the pandemic, a certain practice ended up being written about in national media and became a topic of discussion, though I don’t know how prevalent it became in fact. This was the practice of working several jobs from home. In particular, I’ve heard about it quite a bit from IT people.

The individuals who took these jobs were accepting pay by the hour, but then would take several of these positions at once.

There are a number of rationalizations, including that as long as job duties are getting done, it isn’t really your employer’s business whether you have another job during the same hours they are paying you for, the need for financial security, the lack of opportunities, the exploitation by employers, and the autonomy of workers.

I’ll say, in an age of high inflation, high cost of living, and relatively low wages compared to productivity, I can relate. I understand why people would think that this is a just thing to do.

But here’s the thing: It’s not just, it’s not ethical, it’s not moral, it’s not legal. Don’t do it. (Don’t worry, I’ll tell you how you can do it ethically and morally and legally at the end.)

The argument of “As long as job duties are getting done, it isn’t really your employer’s business whether you have another job” makes a lot of assumptions that aren’t necessarily valid. Just because you are getting paid at a job and you haven’t been fired yet does not mean you’re doing perfectly fine and everything is okay. Labor law makes certain assumptions about the employer-employee relationship and as such grants additional protections that someone doing piecework wouldn’t have. It’s often very difficult to fire an underperforming worker, and so management will keep them on even though they’re not particularly happy with the job they’re doing. They may even tell the employee that everything is just fine. In that case, they’re not doing it because the employee is hitting all of his targets and everything is fine, they’re doing it because the cost and effort involved in trying to get rid of a bad employee is so high it’s easier to just deal with an underperforming team member. At that point, you have that underperforming team member go off and work three other jobs because they are under the impression that they are a superstar who hits all their targets. In reality, that worker is just taking advantage of both sides of the equation, working like a subcontractor while accepting the benefits of full-time employment and the protections therein.

When an employer is paying you for that hour, it is your employer’s business what you’re doing. The moment that you accept payment for that hour of your time, it is no longer your time, it’s their time to do with as they will within reason.

One can make the argument for financial security, but I don’t really feel like that’s an ethical argument as much as a practical one. Of course we all want to make more money, and if there’s a way to make more money then you’re making more money, but that isn’t a moral or ethical argument. Robbing old people is another way you could make more money and be more financially secure, but most people would agree that robbing old people isn’t moral or ethical. For something such as a highly paid information technology position, there are lots of people who end up trying to make it without highly paid information technology positions, and while it may be a struggle they do successfully accomplish it. Moreover if you are taking two or three full-time jobs, then that means one or two other people who also need to make ends meet suddenly can’t get those positions because someone else has them.

Complaining about a lack of opportunities while taking up several opportunities just seems hypocritical to me. Several people could have taken up those jobs, and instead one person is going to do a half-ass job at a bunch of them. Again, this isn’t an ethical argument, it’s a practical one.

There’s an argument that workers are being exploited and therefore the workers should exploit right back. This isn’t the sort of argument that we want to be making. The companies paying you have all the power in the world to make everyone’s life a lot worse because a minority of people are trying to scam them. A race to the bottom is not good. If you feel like you’re being exploited, then it’s time to take measures to stop being exploited; that doesn’t justify exploiting others. Moreover, if it becomes common practice for workers to be working several jobs at once, this doesn’t actually make the workers less exploited. Effectively, it will artificially drive up the supply of labor, meaning that companies won’t have to compete as hard for workers, meaning that the labor market will become more exploitative. As well, the people who are working two or three jobs won’t need to be making enough money to survive on any one of them because they’ve got two or three jobs, meaning that the people who are doing this are effectively making it harder for people who aren’t doing this to survive on one job, again ensuring that the labor market becomes more exploitative, not less.

Finally, as I’ve already said, the autonomy of workers is a moot argument because once they are paying you for your time it is not your time to be autonomous within. The more people try to scam the systems we live in, the more those systems are going to push back and workers who are otherwise being honest are going to face less autonomy because the employers are going to feel that they can’t trust their workers if they aren’t being micromanaged.

There’s another major issue, and that’s that working from home is already considered an extremely privileged position. People who work from home don’t need to commute, and their working conditions can be whatever they wish them to be, including listening to whatever they want on the radio, watching YouTube videos while working, even having a beer in the backyard while attending meetings as long as they don’t get caught. So you have these people who are some of the most privileged in the entire job market, and they take this privilege and they end up using it to take yet another privilege, this ability to work two or three jobs at the same time that people who are still in the office don’t get. In terms of fairness, fairness is completely out the window once we’re talking about disparities like that.

I would suspect that over time, if this became a widespread practice then it would just mean the end to work from home. Employers aren’t employing people so that they can be number three or number four on the list.

Consider a loan you take out yourself. You can take out a variable rate loan, or you can take out a fixed rate loan. If you take out a variable rate loan and rates drop, then the bank passes the savings on to you, but if rates rise, the bank passes the costs on to you. If you take out a fixed rate loan, then if rates drop the bank benefits, but if rates rise the bank takes that risk. For that reason, the bank charges a bit more for a fixed rate loan, because they’re taking on the risk of interest rates rising. The longer the rate is locked in, the larger the premium you pay because the bank is taking on more risk that interest rates rise. So imagine that you were in a situation like that, and you had a variable rate mortgage and interest rates dropped and the bank refused to register the drop in interest rates. In that case, you were paying less so that the bank would be taking less risk, but it broke the deal. How about if you had a fixed rate mortgage, and rates continued to rise, and the bank refused to honor your interest rate and just increased it because it wanted to? During the good times you had paid a premium in order to have that fixed interest rate, and they stepped in and broke the deal.

Let’s say that you were hiring someone to build a deck, and you paid them time and materials to build that deck. Let’s say that you were home for a full day, and you knew full well that they weren’t at your house building that deck, but then you saw an invoice come in for that full day of labor. Would you be okay with that? How about if you saw the crew that you were paying for at another place building someone else’s deck? How about if you found out that on that day they were charging three other people to build their decks, but only one of them actually had the crew at their house building their deck? Would you accept “I don’t know what the problem is, your deck got built didn’t it?” I think most people would be in court, and rightfully so. That company that said that they would be building your deck, that charged you to build your deck that day, and that wasn’t there building your deck committed fraud. Let’s say that the entire crew is just sitting there waiting for the cement truck to arrive, and you’re paying the entire crew to just sit there. That’s what they’re getting paid for. Just because they’re not doing anything doesn’t mean that they can go off and work on someone else’s deck on your dime. If they go off to work on someone else’s deck, that person can pay for the work crew.

So let’s say that you agree with a company to get paid a certain amount of money each day to be available to work for a certain number of hours that day. Whether you are productive or not you make the same amount of money. The company is taking the risk here. The hours that you are productive, they pay you. The hours that you’re completely unproductive, they pay you. If there’s nothing to do but stand around, that’s what they’re paying you to do. If you want to go work for someone else during those hours, you should turn off the payments from the one employer and turn on the payments from the other employer. Of course, that isn’t how most employment contracts work because for jobs the employer wants your time and is paying for it.

So how do you do this morally, legally, ethically, etc? It’s really straightforward. Be honest.

If you want to establish a contract with an employer where you’re going to be working for several employers, that’s something you’ll have to negotiate. You can set it up where there are certain performance guarantees they’re paying for in maintenance work such as keeping KPIs above a certain level, and if it takes less time to achieve those KPIs you come out ahead and if it takes more time to achieve those KPIs you come out behind and are likely charged a penalty. You can set it up where you’re charging by the hour, and if you’re charging another employer for an hour you don’t charge the employers you aren’t working at for the same hour. This will mean that your employer is no longer paying you a premium for the risk taken of you not being productive in a certain hour, and either you’re taking the risk that you need much more time to achieve your KPIs than you expected, or you’re taking the risk that you don’t have anything to do so you can’t charge any of your employers for your time.

If you’d prefer keeping the employers separate, then you’ll have to do something like staggering the hours so you work 8 hours at Employer A and 8 hours at Employer B. Notwithstanding any clauses in your employment contract preventing you from moonlighting, this is usually totally acceptable (but it’s also a recipe for burn-out), and you get to keep the benefits of the company taking on the risk of time you’re getting paid but not being productive for them.

This isn’t a novel problem, it only appears as such to people working in jobs who now have an opportunity to deal with multiple clients at once. It’s a problem professionals have had to deal with forever. If you’re a contract lawyer or engineer working in a central office and operating independently of any particular client, you might have many clients. It’s written into the codes of conduct for these professions that you must bill fairly, and if you double bill for the same hour you’ve committed a crime and will be punished both by the law and by your professional regulator.

Let’s go back to our deck example. If you agreed at the outset to pay a fixed fee for the deck installation within a certain period of time then it doesn’t matter where the work crew is or who it’s doing work for because you’re not paying for the time the deck builders spend on your job, you’re paying for a deck and it’s their risk to make sure they do the job on time and on budget or they’ll take the hit. Otherwise, they could charge by the hour (or even by the 15 minute chunk) and meticulously ensure that only one client was being charged for time at one time. In that way, the deck builder could get paid ethically and legally. It would benefit you during times like waiting for the cement truck because someone else would be paying for the work crew at that time.

Of course, all this could get turned on its head if you manage to find some employers who are willing to pay you an hourly wage and are ok with you double charging by having multiple jobs. You’ll want to get such an unorthodox agreement in writing from all the parties involved, because it’s highly unusual and a simple verbal agreement isn’t likely to hold up in court if the employer/employee relationship goes south and they decide to sue you for breach of contract.

There’s another issue you need to be aware of: Depending on your position (and typically this only applies to high level executives or managers), you may have a fiduciary duty to the companies you work for. That means that you need to put the company’s needs above your own. In such a case, you may not be able to be employed by both a company and its competitors and meet that obligation, and that might not be something you can sign away.

These concepts apply to engineering technologists and tradesmen equally. Though most tradesmen are not held to any sort of professional code of ethics, they are held to laws against fraud. It’s something to keep in mind because if unethical practices become common, there will be consequences as companies strive to protect themselves from fraud.

23 October 2023
Happy new year, and some thoughts on AI

I’m Jason Firth.

A lot has been said about AI of late, the idea that AI could “evolve” and become a true intelligence.

With respect to any system I’ve seen so far, that sounds plausible, except that it isn’t. Much like a stuffed doll will never evolve into a human because the fundamental stuff that makes up both are completely different, “Artificial Intelligence” and actual intelligence are made up of things that are absolutely different.

Take the blinker on your car. Is there a chance it will ever become truly self-aware? The answer is clearly “no”. It is a switch and a timer circuit and a couple light bulbs, designed by a human being to do one and exactly one thing. How about if you replace the switch with a voice activation? It’s still a switch and a timer and a couple light bulbs. How about if you have a wave file play saying “Activating turn signal” when you turn it on? You’ve made the interface more human compatible, but the fundamental stuff that makes it up is the same. It’s still a timer circuit at its root. The logic behind these three “Artificial intelligences” is very similar.

Every single AI I’ve ever seen is a more complicated version of the exact same concept as the turn signal. It’s a purpose built machine made for excelling at one task. Deep Blue can beat Garry Kasparov a thousand times at Chess, but it will never beat him at Mario Kart unless a human intervenes and effectively creates a new device for beating Mario Kart. It will never beat him at writing a song unless a human intervenes and creates a new device for writing songs. It will never beat him at writing poems unless a human intervenes and creates a new device for writing poems.

By contrast, the human mind invented chess, and Mario Kart, and songs and poetry, then determined what was a good state and a bad state, and then determined methods to get to the good state. The human mind wired itself to do this, along with a thousand other things that were required to get there. No human ever opened up someone’s brain to wire up a chess player or a Mario Kart player or a musician or a poet.

As long as that distinction exists, artificial intelligence will always actually be a mere tool created by a human intelligence. Such intelligence “evolves” by the humans getting smarter and applying different algorithms to different problems successfully, not by solving problems itself. The day that an AI identifies and quantifies a problem then comes up with a solution on its own, that’s the day I’d be concerned about a true intelligence developing. Until then, it’s purely science fiction.

Thanks for reading!

28 January 2019
Too many software engineers

I’m Jason Firth.

At one point, most instrument software was written by instrument specialists. As a result, it was small, simple, and specialized.

This was good. Specialized software may be ugly, but it works, and is often designed with specific technician use cases in mind. Moore’s software packages for communicating with their smart instruments are a good example: with one utility and an RS-232 cable, you could configure, query, troubleshoot, and test an instrument.

Software engineer types would rather create flexible platforms to develop software on top of.

Now, that is a very legitimate desire. If you can build that one tool, then you can make it easier to access a large number of devices with one tool, and you simplify the development process for vendors, who can focus on the job, rather than the surrounding elements.

There are problems with creating a flexible platform if it isn’t done carefully.

One problem is that flexible platforms add complexity for end-users. PactWARE, for example, is a marvelously flexible piece of software. It not only lets you use a number of point-to-point hardware devices, such as HART or Endress+Hauser’s proprietary communication cable, with a simple swap of the DTM; it also allows you to access every single device in your plant using multiplexers, or to access IP HART devices. The problem is that all this flexibility is extremely cumbersome to navigate.

ProComSol Devcom2000 is a piece of software that only speaks HART through a serial port or USB modem. Its simplicity allows a technician to connect their modem to the PC, connect the modem to the HART instrument, and run the program. The software will immediately connect to the instrument, after which you are ready to go.

By contrast, PactWARE requires you to connect and start the software, followed by installing the HART communication DTM, followed by configuring the module, followed by opening the autodetect window, then running a scan, then selecting the correct DTM, then closing the autodetect window.

This is one example of one software package compared to another, but the fact is that there are countless examples of software trying to do everything, only to be less useful as a result. I try to design anything for that 2am call, and a ridiculously complicated tool isn’t conducive to this.

Another issue with creating these extremely complicated software programs is a simple truism: the more moving pieces you have, the more things there are to fail.

I really like Wonderware Historian for its standard SQL front-end and tight integration with Wonderware and Application Server, but it is a complicated beast. IO Servers speak to the PLC, the Historian mdas service communicates with the IO Server (assuming the import went correctly), which communicates with the historian storage service. To retrieve, the SQL service links up with the retrieval service. There are other services involved as well, but the bottom line is that every one of these parts is huge and complicated, and all it takes is one cog in one of these giant machines to break.

Historian isn’t even the worst. I’ve seen instrumentation configuration utilities that require always-on services and a full SQL Server instance to be running, just to shoot a couple of bits over RS-232.

The problem is that conceptually on a software development level, using these tools makes a lot of sense — you’ve got this big fancy development package and this big fancy software engineering degree, of course you’ve got to make use of them! A small, simple program that does one thing very well, that’s not going to work. We need a platform. Something that can handle all use cases. Something that can keep me employed fixing it!

Thanks for reading!

27 May 2018
An Uber fatality and the limitations of automation — and the amazing powers of your human operators

I’m Jason Firth.

Recently, there was a fatality in the news, as an Uber automated vehicle hit a pedestrian who was crossing the road.

The circumstances seem to be that the pedestrian carrying a bike crossed the road without seeing the car, and the car basically drove right into the young woman.

A lot of people seemed shocked that the car didn’t recognise the young woman was there, and didn’t immediately brake or swerve. One person invoked “fail safety”, the idea that equipment should always default to the safest state.

This case is, in my estimation, more complicated than you’d think. It’s true you want things to fail safe, but it isn’t always clear what a fail safe state is.

I’ll give you an example.

In a commonly told but apocryphal story, boiler maintenance was being done at the paper mill in Fort Frances, Ontario, Canada. (You’ve heard this story from me before.) The design of a boiler (at least a recovery boiler like this) is that you have tubes filled with water and steam surrounding a combustion chamber. Usually, you’ll have a drum called the mud drum that contains a certain level of water. If that level is too low, that’s normally considered an emergency situation. In this case, the maintenance they were doing required the mud drum to be empty, and they were still firing the boiler.

The story goes that a new operator came on shift, saw the mud drum was empty, and panicked. The operator opened the water valves wide (what would normally be considered ‘fail safe’), and the boiler immediately exploded.

Why did that happen? The boiler tubes were red hot and virtually unpressurised. When cold water hit the tubes, it flashed to steam, and the sudden release caused an explosion. While the involvement of a person is unusual, boilers routinely experience explosions due to water valves having problems like this. If the boiler had been running under normal conditions, perhaps dumping water into the tubes would have been a safe option, cooling everything and getting everything to a zero energy state faster.

So despite the valve opening being what you’d normally consider a ‘fail safe’ state, in this case it was a dangerous action to take.

Let’s assume for a moment that both the car and the driver had perfect vision in that moment, and saw the pedestrian long before the moment of impact.

What is the safest action to take if you see someone crossing the street? Everyone here is immediately saying “obviously slam the brakes and swerve!”, but let’s think about that for a second. Most people are not going to walk directly into the path of an oncoming vehicle. Even if crossing, you’d expect a person to stop, so you can’t necessarily use the fact that there’s a person there to predict what’s going to happen. By contrast, what happens if you slam the brakes and swerve every time you see someone crossing the street a little too close? If there’s a car near you, it could cause an accident. If the person was going to stop, then that person could end up getting hit by your actions where they might not otherwise. The driver or passengers in the car might be injured — probably for nothing, because 99 times out of 100, the person will stop before the car hits them. Often, the safest act is to do nothing.

Here’s where there is a divergence between the powers of an AI, and the powers of a human. An AI sees object 15 — perhaps even a human on bike type object — travelling at a certain speed at a certain vector. It has to figure out what it can from relatively limited information. By contrast, a human sees a sketchy looking lady walking in a strange way not paying attention. The AI might not recognise there’s a threat, whereas the human might recognise something isn’t right and take the opportunity to take some of those more aggressive defensive manoeuvres for this isolated case. It isn’t just object types and vectors, it’s a vivid world of information and context.

Our powers of intuition, empathy, and deduction are much greater than we give ourselves credit for. We know more than any purpose-built AI, and can make connections that no purpose-built AI presently can. Humans aren’t perfect, but there are reasons why we still have humans involved with even the most high-tech processes.

It’s ironic to say this as an automation guy, but the world is about to realize the limitations of automation, as it comes closer and closer to our personal lives.

As interesting as this story is on its own, I feel it’s also useful for showing the limitations of raw automation in the industrial context. Sometimes, operations asks for a system that reacts to something the human knows but the machine does not. If you’re not careful, you cause false positives and react dramatically to situations that don’t exist based on assumptions, causing more problems than you’d prevent.

One I saw for a while was an operator pointing to a spike on a graph and going “That’s because of [event], we need to prevent that.” Then you’d go down the graph and find another spike and go “Is this [event]?”, they’d say “no”. You’d go down a little further and say “how about this? Is this [event]?”, and they’d say “no”. It turns out that the reason the operator knows what’s going on is that the operator is a human with eyes and ears and an incredibly versatile mind that can understand things far beyond a series of numbers plotted along a graph. Short of dramatic changes to the process, the PLC can’t know that [event] has occurred with any sort of certainty.
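
To make that concrete, here is a minimal sketch of what a simple threshold-based spike detector sees. The function name, the threshold, and the trend data are all made up for illustration; the point is only that the detector finds every spike but has no way of telling which one was caused by [event].

```python
def find_spikes(values, threshold):
    """Return the index where each spike starts, i.e. where a sample
    rises above the threshold after having been at or below it."""
    spikes = []
    above = False
    for i, v in enumerate(values):
        if v > threshold and not above:
            spikes.append(i)
        above = v > threshold
    return spikes


# Made-up trend data: three spikes that look the same on the graph.
trend = [10, 11, 10, 45, 12, 10, 11, 47, 10, 9, 10, 44, 11]

# What the operator knows but the numbers do not (hypothetical answers):
# only the first spike was caused by the event in question.
operator_says_event = {3: True, 7: False, 11: False}

for i in find_spikes(trend, threshold=30):
    cause = "event" if operator_says_event[i] else "something else"
    print(i, trend[i], cause)
```

All three spikes pass the same test. Any logic that reacts to “a spike” will react to all of them, including the two the operator would have ignored.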

Thanks for reading!

22 March 2018
Blue skies, green Fields

I’m Jason Firth.

One commonality I notice when people ask me to help solve a problem is that quite often they explicitly limit solutions to “what sort of control systems can we install?” type queries.

I immediately force myself to ignore the question as presented, because of the limits it puts on the creativity we can use to solve problems.

Occasionally, we can introduce a new and innovative control system to solve a problem, but just as often, we need to take a step back and re-examine the problem. Sometimes we can solve a problem by providing more data to operators, or by making it easier to follow procedure using their current user interface. Sometimes we need to inform rather than control. Sometimes we need to analyze in a new way. Sometimes it’s a maintenance problem and fixing a chronic problem will help. Sometimes there’s no problem at all and things must be operated in a certain way for safety or operational reasons.

By looking at problems outside of their ostensible technical scope, we can see the systems involved. We can ask questions we might not have asked otherwise: systems involve processes, equipment, operators, procedures, user interfaces, and control systems. Sometimes the answer comes from looking at the whole picture rather than a small piece.

Looking at problems this way also provides new opportunities. A few years back, I was asked to investigate problems with a certain Historian in gathering process critical data. What I discovered was that we were asking the historian to do something incompatible with its design. Historians consist of dozens of working parts, all of which need to function for data to be saved and retrieved. Instead of fighting the historian to conform, we created a new system which consisted of a single simple program with one purpose. Instead of requiring dozens of systems to work, suddenly we only needed two: retrieval and storage. Once we created this new system, we were able to extend it to automatically produce files for regulatory reporting — an unexpected boon which saved the site time and increased accuracy.

This provides new opportunities for a shop. Many people want their shop to limit its influence to “what control systems can we install”, but by looking at a strategy which embraces increased responsibility and increased work in service to other groups, new opportunities arise, because it’s all connected.

Everyone wants to find a new and innovative and cool control system, but sometimes you need to step back from that well-trodden lot and look at the areas where nobody is looking, where there are blue skies and green fields, waiting for someone.

Thanks for reading!

24 August 2017
All you need to know about PID controller

I’m Jason Firth.

I recently commissioned this article explaining the function of a PID controller by freelance writer Sophia O’Connor. It’s one of a few pieces I’ve commissioned recently. It’s partially a test to see how well commissioning freelancers can work, and partially a public service to get some stuff written about some basic concepts. Enjoy!

A proportional-integral-derivative (PID) controller is an instrument used mainly in industrial control applications. It combines three controllers: the P-controller, the I-controller, and the D-controller. Together, these controllers produce a single control signal. The main purpose of a PID controller is to control speed, temperature, pressure, flow, and other process variables. It can be installed near the devices it regulates, and it is often monitored through a SCADA system.

Working of a PID controller:

As explained above, a PID controller combines three different controllers that work together. The main purpose of installing a PID controller is to control a process. A simple on/off controller can easily be used for basic applications. For more complex processes, however, a PID controller gives much finer control over the overall system.

A PID controller is responsible for driving the process output to the desired value. Each of the three controllers works in its own way, but they all work together to achieve a common goal. The working of each is explained below:

Functions of the Proportional controller:

The P-controller produces an output that is proportional to the current error value. It works by comparing the desired set point with the actual value measured through the feedback process; the difference between the two is the error. If the error is zero, the output of this controller is also zero. As a result, a P-controller on its own usually settles with some remaining error (an offset), which has to be corrected by manually resetting the controller.
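
A rough sketch of why proportional-only control needs that manual reset: the output is gain times error, so it can only be non-zero while some error remains. The process below is an assumed first-order model with made-up tuning values, not any particular plant, and `simulate_p_only` is a name invented for this example.

```python
def simulate_p_only(kp, setpoint, steps=2000, dt=0.01, tau=1.0):
    """Run a proportional-only controller on an assumed first-order process.

    The process value (pv) moves toward the controller output with time
    constant tau (seconds). Returns the pv the loop settles at.
    """
    pv = 0.0
    for _ in range(steps):
        error = setpoint - pv
        output = kp * error              # proportional action only
        pv += (output - pv) * dt / tau   # first-order process response
    return pv


for kp in (4.0, 9.0):
    settled = simulate_p_only(kp, setpoint=100.0)
    print(f"kp={kp}: settles near {settled:.1f}, offset {100.0 - settled:.1f}")
```

For this model the loop settles at kp / (1 + kp) of the set point, so raising the gain shrinks the offset but never removes it.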

Functions of the Derivative controller:

A control system also needs to anticipate future behaviour. The I-controller cannot do this; the D-controller solves this problem. Its output depends on the rate of change of the error with respect to time. It gives the output a kick when the error starts changing quickly, which speeds up the system response.
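
As a small illustration of “rate of change of error”, the sketch below approximates the derivative with a backward difference between consecutive samples. The function name, gain, and sample values are assumptions chosen for the example.

```python
def derivative_term(kd, errors, dt):
    """Return the derivative contribution for each sample after the first.

    The rate of change of the error is approximated as
    (errors[k] - errors[k - 1]) / dt.
    """
    return [kd * (errors[k] - errors[k - 1]) / dt for k in range(1, len(errors))]


slow = [0.0, 1.0, 2.0, 3.0]     # error growing by 1 per second
fast = [0.0, 5.0, 10.0, 15.0]   # error growing by 5 per second
steady = [7.0, 7.0, 7.0]        # large but unchanging error

print(derivative_term(kd=0.5, errors=slow, dt=1.0))    # [0.5, 0.5, 0.5]
print(derivative_term(kd=0.5, errors=fast, dt=1.0))    # [2.5, 2.5, 2.5]
print(derivative_term(kd=0.5, errors=steady, dt=1.0))  # [0.0, 0.0]
```

The derivative action responds to how fast the error is moving, not how big it is, which is why it contributes nothing to a steady error.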

Functions of the integral controller:

The P-controller has certain limitations that the I-controller overcomes. It is needed because it provides the action required to eliminate the steady-state error. It integrates the error over a period of time and keeps adjusting the output until the error reaches zero.

All of these controllers work together to form a single controller that can be used in process control applications.
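
Putting the three together, a minimal sketch of a textbook parallel-form PID might look like the following. This is one common formulation, not the only one; the class name, tuning values, and the first-order process model are all assumptions for illustration, and a real controller would also need output limits and anti-windup.

```python
class PID:
    """Textbook parallel form: output = Kp*e + Ki*integral(e) + Kd*de/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, pv):
        error = setpoint - pv
        self.integral += error * self.dt          # I: accumulated error
        if self.prev_error is None:
            derivative = 0.0                      # no history on the first scan
        else:
            derivative = (error - self.prev_error) / self.dt  # D: rate of change
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def run_loop(controller, setpoint, steps=5000, tau=1.0):
    """Drive an assumed first-order process and return where it settles."""
    pv = 0.0
    for _ in range(steps):
        output = controller.update(setpoint, pv)
        pv += (output - pv) * controller.dt / tau
    return pv


p_only = run_loop(PID(kp=4.0, ki=0.0, kd=0.0, dt=0.01), setpoint=100.0)
full = run_loop(PID(kp=4.0, ki=2.0, kd=0.1, dt=0.01), setpoint=100.0)
print(f"P only: {p_only:.2f}   PID: {full:.2f}")
```

With the integral gain set to zero the loop settles short of the set point; with it enabled, the accumulated error keeps pushing the output until the offset is gone.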

Thanks for reading!

27 February 2017
Therac-25, a study in the potential risks of software bugs

I’m Jason Firth.

It’s unfortunately common to find that people don’t appreciate the risks involved with software, as if the fact that the controls are managed by bits and bytes changes the lethal consequences of failure.

A counterpoint to this is the Therac-25, a radiation therapy machine produced by Atomic Energy of Canada Limited — AECL, for short.

The system had a number of modes, and while switching modes, the operator could continue entering information into the system. If the operator switched modes too quickly, then key steps would not take place, and the system would not be physically prepared to safely administer a dose of radiation to a patient.

Previous models had hardware interlocks which would prevent radiation from being administered if the system was not physically in place. This newer model relied solely on software interlocks to prevent unsafe conditions.
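
As a hedged illustration of the kind of check an interlock performs (this is not the Therac-25 code; the mode names, position names, and function are invented for the example), the important detail is that the decision is based on the measured state of the hardware rather than on what the software believes it commanded:

```python
from enum import Enum


class Mode(Enum):
    LOW_POWER = "low_power"
    HIGH_POWER = "high_power"


# Hypothetical: the hardware position each mode needs before the beam may fire.
REQUIRED_POSITION = {
    Mode.LOW_POWER: "position_a",
    Mode.HIGH_POWER: "position_b",
}


def beam_permitted(requested_mode, measured_position, setup_in_progress):
    """Permit the beam only if the measured hardware state matches the mode.

    measured_position should come from a sensor on the hardware itself,
    not from the value the software last commanded.
    """
    if setup_in_progress:
        return False  # hardware may still be moving after a mode change
    return measured_position == REQUIRED_POSITION[requested_mode]


# Operator switched to HIGH_POWER, but the hardware is still in the old position.
print(beam_permitted(Mode.HIGH_POWER, "position_a", setup_in_progress=False))  # False
```

Even a check like this is still software. A hardware interlock does the same job independently of the program, so a bug in the code cannot defeat it.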

There were at least 6 accidents involving the Therac-25. Some of these accidents permanently crippled the patients or resulted in the need for surgical intervention, and several resulted in deaths by radiation poisoning or radiation burns. One patient had their brain and brainstem burned by radiation, resulting in their death soon after.

There were a number of contributing factors in this tragedy: poor development practices, lack of code review, lack of testing, and of course the bugs themselves. However, rather than focus on the specifics of what caused the tragedy, what I want to show is that what we do is not just computers: it’s where rubber meets road, and where what happens in our computers meets reality. People who would never dream of opening a relay cabinet and starting to rewire things would think nothing of opening a PLC programming terminal and starting to ‘play’.

Secondly, part of the problem is people who didn’t realise that they were controlling a real physical device. There are things to remember when dealing with physical devices: for example, that no matter how quick your control system, valves can only open and close so fast, motors can only turn so fast, and your amazing control system is only as good as the devices it controls. Because the programmer forgot that these are real devices, they forgot to take that into account, and people died as a result. This holistic knowledge is why journeyman instrument technicians and certified engineering technologists in the field of instrumentation engineering technology are so valuable. They don’t just train on how to use the PLC, they train on how the measurements work, how the signalling works, how the controllers work (whether they are digital or analog in nature), how final control elements work, and how processes work.
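
A small sketch of the “valves can only open so fast” point, assuming a valve with a fixed full-stroke time (the class name and the 10 second stroke time are made up for illustration):

```python
class Valve:
    """A valve that travels at a finite speed: 0-100% in stroke_time seconds."""

    def __init__(self, stroke_time):
        self.stroke_time = stroke_time
        self.position = 0.0  # percent open, as position feedback would report it

    def step(self, command, dt):
        """Move toward the commanded position, limited by the travel speed."""
        max_move = 100.0 * dt / self.stroke_time
        move = max(-max_move, min(max_move, command - self.position))
        self.position += move
        return self.position


valve = Valve(stroke_time=10.0)  # assumed: 10 s from fully closed to fully open
for _ in range(5):               # the controller demands 100% from the first scan
    valve.step(command=100.0, dt=1.0)
print(valve.position)            # 50.0: five seconds in, the valve is half open
```

The controller’s command changed instantly; the valve did not. Logic that assumes the device is already where it was told to go is working from a fiction until the feedback says otherwise.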

When it comes to control systems, just because you’re playing with pretty graphics on the screen doesn’t mean you aren’t dealing with something very real, and something that can be very lethal if it’s not treated with respect.

Another point that’s near and dear to my heart comes in one of the details of the failures: when there was a problem, the HMI would display “MALFUNCTION” followed by a number. A major problem with this is that no operator documentation existed saying what each malfunction number meant. I’ve said for a long time in response to people who say “The operator should know their equipment”, that we as control professionals ought to make the information available for them to know their equipment. If we don’t, we can’t expect them to know what’s going on under the surface. If the programmer had properly documented his code, and properly documented the user interface, then there may have been a chance operators would have understood the problem earlier, preventing lethal consequences.
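
One way to picture what that documentation could look like inside the HMI itself is a lookup table that turns each number into a cause and an action. The codes and descriptions below are invented purely for illustration and are not the real Therac-25 malfunction codes.

```python
# Hypothetical fault table: numbers and wording are made up for this example.
MALFUNCTION_TEXT = {
    12: ("Measured dose rate below expected value", "Check setup before retrying."),
    31: ("Hardware position does not match selected mode", "Do not proceed. Call service."),
}


def describe_malfunction(code):
    """Return an operator-readable message for a malfunction code."""
    if code in MALFUNCTION_TEXT:
        cause, action = MALFUNCTION_TEXT[code]
        return f"MALFUNCTION {code}: {cause}. {action}"
    # An undocumented code is itself important information for the operator.
    return f"MALFUNCTION {code}: Undocumented code. Stop and contact support."


print(describe_malfunction(31))
print(describe_malfunction(54))
```

The table doubles as the operator documentation: if a code cannot be given a plain-language cause and action, that is a sign the designer does not fully understand the fault either.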

Thanks for reading!

full report

6 December 2016
New Software Release: Schneider Unity Pro 9 — I mean 10 — I mean 11!

I’m Jason Firth.

You will recall that last year I posted about the release of Unity Pro v8.1.

Well, I got an email from a vendor this week about the opportunity to learn about the new version of Unity Pro — Version 10!

(Wait, did I miss something? What happened to 9?)

I have absolutely no idea what happened to 9. They skipped it. Maybe to keep on track with Windows?

Unity Pro V10 supports

M580 features

*CCOTF(Configuration change on the fly) on M580 Local IOs

*Cybersecurity: Events log, Data Integrity, Enable/Disable Services

*System Time Stamping of Application Variables

*Device Integration: Network Manager

Quantum Platform Features

*HART on X80 remote drops

*New Quantum firmware v3.3

Full Excel Import/export tool

Audit Trail

*Log in Syslog

ANY_BOOL Data Type

Supports:

Win 7 32 and 64 bit;

Win 8.1 32 and 64 bit;

Win 10 32 and 64 bit; and

Windows Server 2012.

How exciting, right? Well, I went onto the schneider-electric website to download the latest version, and was shocked to discover that version 10 isn’t the latest version!

Yes, there’s a version 11, out right before Christmas.

Unity Pro V11 supports new Modicon M580 controllers:

Support new Modicon M580 High End

Support new Modicon M580 HSBY CPUs

Support LL984 language on Modicon M580

Quantum Ethernet IO drops are now supported on Modicon M580

Supports:

Win 7 32 and 64 bit;

Win 8.1 32 and 64 bit;

Win 10 32 and 64 bit; and

Windows Server 2012.

To be honest, these seem like awfully incremental improvements to be justifying major software version number increments.

Alongside Unity Pro v10, there are new firmware images for the Modicon Quantum CPUs, and a major revision of the M580 firmware images for all the new features.

Alongside Unity Pro v11, there are new firmware images for the M580 platform.

Thanks for reading!

14 January 2016