Auditing Expressions

I needed to be able to show what variables were evaluating to

Background

Working for a bank, there is one rule about how things should be done: Auditing, Auditing, Auditing. They like to know what happened. They like to know when it happened. They like to be able to look this data up after the fact, sometimes much later. We run a certain process on Azure Functions that takes input data from our servicing providers and adds calculations depending on various things: rates, history, etc. We previously stored only the output value for each of these calculations and I was asked if we could begin logging what formula was used to create each result in addition to what the values were going in to each create each property.

Scoping the Change

Suppose I have a basic property bag object for a loan, something like:

$loan = [PSCustomObject]@{
    "Loan Amount"=100
    Rate=6.25
    Term=60
    Fico=725
}

Suppose I have a couple other objects that we use. Maybe a variable with the holding period of days for the loan before it is fully originated, or a hashtable that defines something related to the length in months of the loan. Stuff like:

$holding = 3
$termRates = @{
    12=6
    24=5
    36=4
    48=3
    60=2
}

In our code, we are likely to use stuff for each of these for multiple of the things that we calculate and add to that property bag:

$loan | Add-Member -NotePropertyName Interest -NotePropertyValue ($loan."Loan Amount" * $loan.Rate / 100 * $holding + $termRates[$loan.Term] * $holding)
$loan | Add-Member -NotePropertyName FinalAmount -NotePropertyValue ($loan."Loan Amount" + $loan.Interest)

Note that these formulas are completely non-sensical, they are just demonstrating the types of data that go in to each expression.

The goal would be for other departments to be able to come through after our month or year end audit and be able to see that this loan had its Interest field using this formula: $loan."Loan Amount" * $loan.Rate / 100 * $holding + $termRates[$loan.Term] * $holding and that the values were as follows: $loan."Loan Amount"=100; $loan.Rate=6.25; $holding=3; $termRates[$loan.Term]=2; $holding=3

Possible Solutions

Time to explore a couple approaches to making the code visible in the database and the values used to calculate the result!

Manually

I could attempt to manually list each of these fields into a single property and add that to the loan object along with a copy of the expression. Doing it JUST for the Interest value, I would need to add code like this:

$values = '$loan."Loan Amount"={0}; $loan.Rate={1}; $holding={2}; $termRates[$loan.Term]={3}; $holding={2}' -f @(
    $loan."Loan Amount"
    $loan.Rate
    $holding
    $termRates[$loan.Term]
)
$loan | Add-Member -NotePropertyName Interest -NotePropertyValue ($loan."Loan Amount" * $loan.Rate / 100 * $holding + $termRates[$loan.Term] * $holding)
$loan | Add-Member -NotePropertyName InterestValues -NotePropertyValue $values
$loan | Add-Member -NotePropertyName InterestExpression -NotePropertyValue '$loan."Loan Amount" * $loan.Rate / 100 * $holding + $termRates[$loan.Term] * $holding'

And everytime that I update any of the expression, I need to remember to update the InterestExpression field and the values string. That’s duplicating a lot of work and surprisingly error prone. The actual calculation isn’t necessarily the same code as what’s listed in the InterestExpression value! That’s not good.

Using PowerShell Abstract Syntax Tree (AST) to Tokenize

PowerShell has the capability to inspect code blocks and tell me about what is in them. This is through the PSParser’s Tokenize method. As demo’d from powershell.one, tokenizing a scriptblock like $loan."Loan Amount" * $loan.Rate / 100 * $holding + $termRates[$loan.Term] * $holding is as simple as

$sb = { $loan."Loan Amount" * $loan.Rate / 100 * $holding + $termRates[$loan.Term] * $holding }
$errors = $null
$tokens = [System.Management.Automation.PSParser]::Tokenize($sb, [ref]$errors)
$tokens | Format-Table

The result being:

Content         Type Start Length StartLine StartColumn EndLine EndColumn
-------         ---- ----- ------ --------- ----------- ------- ---------
loan        Variable     1      5         1           2       1         7
.           Operator     6      1         1           7       1         8
Loan Amount   String     7     13         1           8       1        21
*           Operator    21      1         1          22       1        23
loan        Variable    23      5         1          24       1        29
.           Operator    28      1         1          29       1        30
Rate          Member    29      4         1          30       1        34
/           Operator    34      1         1          35       1        36
100           Number    36      3         1          37       1        40
*           Operator    40      1         1          41       1        42
holding     Variable    42      8         1          43       1        51
+           Operator    51      1         1          52       1        53
termRates   Variable    53     10         1          54       1        64
[           Operator    63      1         1          64       1        65
loan        Variable    64      5         1          65       1        70
.           Operator    69      1         1          70       1        71
Term          Member    70      4         1          71       1        75
]           Operator    74      1         1          75       1        76
*           Operator    76      1         1          77       1        78
holding     Variable    78      8         1          79       1        87
…            NewLine    87      2         1          88       2         1

You could treat this array like a queue and determine as you dequeue what each token is doing to construct an expression for it. If you skip any Operator where the Content is a math one (+,-,*,/, etc.), you then have each accessible value that you want for that final output.

A function could then be used to create those values and add them to the loan object:

function Add-MemberProperties {
    param(
        $Variable,
        [string]$Name,
        [scriptblock]$Scriptblock
    )

    $errors = $null
    [System.Collections.Generic.Queue[PSObject]]$q = [System.Management.Automation.PSParser]::Tokenize(
        $scriptblock, [ref]$errors
        ) | ForEach-Object { $_ }

    $values = while ($q.Count) {
        $current = $q.Dequeue()
        switch ($current.type) {
            "variable" {
                if ($q.count) {
                    $next = $q.Peek()
                    if ($current.EndColumn -eq $next.StartColumn -and $next.content -eq ".") {
                        $next = $q.Dequeue()
                        $final = $q.Dequeue()
                        switch ($final.type) {
                            "string" { $ex = '${0}."{1}"' -f $current.content, $final.content }
                            default { $ex = '${0}.{1}' -f $current.content, $final.content }
                        }
                        Invoke-Expression $ex
                    }
                    elseif ($next.Content -eq "[") {
                        $next = $q.Dequeue()
                        $final = $q.Dequeue()
                        switch ($final.type) {
                            "string" { $ex = '${0}["{1}"]' -f $current.content, $final.content }
                            "variable" { $ex = '${0}[${1}]' -f $current.content, $final.content }
                            default { $ex = '${0}[{1}]' -f $current.content, $final.content }
                        }
                        $ex | write-host
                        Invoke-Expression $ex
                    }
                    else {
                        Get-Variable -ValueOnly $current.Content
                    }
                }
            }
        }
    }

    $Variable | Add-Member -NotePropertyName $Name -NotePropertyValue $ScriptBlock.Invoke()[0]
    $Variable | Add-Member -NotePropertyName "${Name}Formula" -NotePropertyValue $ScriptBlock
    $Variable | Add-Member -NotePropertyName "${Name}Values" -NotePropertyValue ($values -join "; ")
}

Add-MemberProperties -Variable $loan -Name Interest -ScriptBlock { $loan."Loan Amount" * $loan.Rate / 100 * $holding + $termRates[$loan.Term] * $holding }

This code doesn’t even work though. The nested property in the hashtable $termRates[$loan.Term] is too deep for the function to find. We’d need to change the scriptblock like so:

$term = $loan.term
Add-MemberProperties -Variable $loan -Name Interest -ScriptBlock { $loan."Loan Amount" * $loan.Rate / 100 * $holding + $termRates[$Term] * $holding }

That at least runs and the output of $loan now shows something like what we wanted:

Loan Amount     : 100
Rate            : 6.25
Term            : 60
Fico            : 725
Interest        : {24.75}
InterestFormula :  $loan."Loan Amount" * $loan.Rate / 100 * $holding + $termRates[$term] * $holding
InterestValues  : 100; 6.25; 3; 2; 3

The real problem is that this function is not recursive. What if the object $loan has some nested properties that were needed? Something like $loan.Originator.Fico. That would be a pain to manually overcome with $fico = $loan.Originator.Fico. The function needs reworked to be recursive if we’re going to use it. It’s not very clear what the easiest way to do that is for me. Lets go another way.

Splitting Scriptblocks Where It Counts

The original design goal was to get the value of each part of the expression between the math operators. What if I turned the 1 scriptblock into several smaller ones and then invoked each one individually? Then I don’t need to deal with all the recursion in the tokens!

The important part would be to still using tokenization to find operators. If I just did a split on any arithmetic operator like this:

$sb = { $loan."Loan Amount" * $loan.Rate / 100 * $holding + $termRates[$loan.Term] * $holding }
$sb -split "[-+*/%]"

It would look fine:

$loan."Loan Amount"
$loan.Rate
100
$holding
$termRates[$loan.Term]
$holding

Until we hit something like changing the “Loan Amount” property to “Loan-Amount”:

$sb = { $loan."Loan-Amount" * $loan.Rate / 100 * $holding + $termRates[$loan.Term] * $holding }
$sb -split "[-+*/%]"

Result:

$loan."Loan
Amount"
$loan.Rate
100
$holding
$termRates[$loan.Term]
$holding

We can make use of the tokenizer like so:

$sb = { $loan."Loan Amount" * $loan.Rate / 100 * $holding + $termRates[$loan.Term] * $holding }
$errors = $null
$tokens = [System.Management.Automation.PSParser]::Tokenize($sb, [ref]$errors)
$splits = $tokens | Where-Object Type -EQ "Operator" | Where-Object Content -Match "[-+*/%]"
$splits | Format-Table

# Content     Type Start Length StartLine StartColumn EndLine EndColumn
# -------     ---- ----- ------ --------- ----------- ------- ---------
# *       Operator    21      1         1          22       1        23
# /       Operator    34      1         1          35       1        36
# *       Operator    40      1         1          41       1        42
# +       Operator    51      1         1          52       1        53
# *       Operator    76      1         1          77       1        78

And then use the start column as the portion of the string to split at.

function Add-MemberProperties {
    param(
        $Variable,
        [string]$Name,
        [scriptblock]$Scriptblock
    )
    [string]$s = $Scriptblock

    $errors = $null
    $tokens = [System.Management.Automation.PSParser]::Tokenize(
        $scriptblock, [ref]$errors
        ) | ForEach-Object { $_ }

    $array = @()
    $position = 0
    $tokens |
    Where-Object Type -EQ "Operator" |
    Where-Object Content -Match "[-+*/%]" |
    ForEach-Object {
        $array += $s.Substring($position, $_.StartColumn - $position - 1)
        $position = $_.EndColumn
    }
    $array += $s.Substring($position, $s.length - $position)
    $values = $array | ForEach-Object {
        "{0}={1}" -f $_.Trim(), (Invoke-Expression $_)
    }

    $Variable | Add-Member -NotePropertyName $Name -NotePropertyValue $ScriptBlock.Invoke()[0]
    $Variable | Add-Member -NotePropertyName "${Name}Formula" -NotePropertyValue $ScriptBlock
    $Variable | Add-Member -NotePropertyName "${Name}Values" -NotePropertyValue ($Values -join "; ")
}

Add-MemberProperties -Variable $loan -Name Interest -ScriptBlock { $loan."Loan Amount" * $loan.Rate / 100 * $holding + $termRates[$loan.Term] * $holding }

# Output of $loan:
# Loan Amount     : 100
# Rate            : 6.25
# Term            : 60
# Fico            : 725
# Interest        : 24.75
# InterestFormula :  $loan."Loan Amount" * $loan.Rate / 100 * $holding + $termRates[$loan.Term] * $holding
# InterestValues  : $loan."Loan Amount"=100; $loan.Rate=6.25; 100=100; $holding=3; $termRates[$loan.Term]=2; $holding=3

Using this approach, it now becomes trivial to add as complex of expressions as I want, without worry!

Add-MemberProperties -Variable $loan -Name FinalAmount -ScriptBlock {$loan."Loan Amount" + $loan.Interest}

Final output of $loan:

Loan Amount        : 100
Rate               : 6.25
Term               : 60
Fico               : 725
Interest           : 24.75
InterestFormula    :  $loan."Loan Amount" * $loan.Rate / 100 * $holding + $termRates[$loan.Term] * $holding
InterestValues     : $loan."Loan Amount"=100; $loan.Rate=6.25; 100=100; $holding=3; $termRates[$loan.Term]=2; $holding=3
FinalAmount        : 124.75
FinalAmountFormula : $loan."Loan Amount" + $loan.Interest
FinalAmountValues  : $loan."Loan Amount"=100; $loan.Interest=24.75

Wrapping Up

As you can see, the need to log what you did, what the input values were, and the output is totally doable. And you can avoid having to manually duplicate all the code if you make use of PowerShells AST to find the important bits for you.

Was this helpful? Or just random trivia that you’ll literally never use? Discuss on Reddit