Grasp, A .NET Analysis Engine – Part 6: Validating Calculations

In part 5, we saw how to create runtime instances by providing an initial set of values to an executable. In this post, we will look at the first step in creating an executable from a schema: validating that its calculations are semantically correct.

Foundation

The GraspCompiler class is the context in which validation and compilation take place. Based on what we saw in part 5, we would expect it to look like this:

internal sealed class GraspCompiler
{
  private readonly GraspSchema _schema;

  internal GraspCompiler(GraspSchema schema)
  {
    _schema = schema;
  }

  internal GraspExecutable Compile()
  {
    // Not quite yet…
  }
}

It is internal because we don’t want to expose it as part of the public API (we do that through the GraspSchema.Compile method). It is sealed because Grasp has a single definition of compilation and is not intended for extension. It is an implementation detail. If in the future we decide it should be a base class, that decision will be easier because we did not expose it publicly. This is true of all classes involved in the compilation process.

The Compile method is blank for now. This is the basic skeleton, but before we flesh it out we need to lay some groundwork. Specifically, to compile a set of calculations, we must first be able to compile a single calculation.

Variable References

The defining characteristic of a calculation expression is that it contains nodes which are instances of the VariableExpression class (defined in part 2). Expression trees have no idea what these nodes mean; we grafted them on to represent a concept that only Grasp understands. This means we are going to need to do something with them before we can turn expressions into executable code.

In order to do meaningful work with the variable nodes, we first need to find them. Expression trees are complex beasts; they can describe any .NET expression you can dream up, which might be a massive number of nodes. How do we locate variables in all of that?

Luckily, .NET gives us the ExpressionVisitor class. Its job is to sift through all nodes in an expression and give us a chance to inspect them. If a node has child nodes, it will sift through those as well. For example, an operator references expressions for its left and right operands, and a method call may reference expressions for its arguments. The knowledge of how to visit each kind of node and its children is baked into the ExpressionVisitor base class; all we need to do is derive from it and override its methods.
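
To make the mechanics concrete, here is a minimal sketch of a visitor that is unrelated to Grasp. The ConstantCounter class and the expression it walks are made up for illustration; only ExpressionVisitor and the Expression factory methods come from .NET. It overrides a single method and lets the base class handle the traversal:

```csharp
using System;
using System.Linq.Expressions;

// Illustrative visitor (not part of Grasp): counts the constants in an expression
sealed class ConstantCounter : ExpressionVisitor
{
  public int Count { get; private set; }

  protected override Expression VisitConstant(ConstantExpression node)
  {
    Count++;

    return base.VisitConstant(node);
  }
}

static class Program
{
  static void Main()
  {
    // (1 + 2) * Math.Max(3, 4), built by hand so every constant stays a separate node
    Expression body = Expression.Multiply(
      Expression.Add(Expression.Constant(1), Expression.Constant(2)),
      Expression.Call(
        typeof(Math).GetMethod("Max", new[] { typeof(int), typeof(int) }),
        Expression.Constant(3),
        Expression.Constant(4)));

    var counter = new ConstantCounter();

    // The base class descends into the operands and the method arguments for us
    counter.Visit(body);

    Console.WriteLine(counter.Count); // 4
  }
}
```

Nothing in ConstantCounter knows about binary operators or method calls; the base class reaches the constants nested inside them on its own.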

To add support for variables, we can extend ExpressionVisitor with a base class that adds a single method for visiting VariableExpression nodes:

internal abstract class CalculationExpressionVisitor : ExpressionVisitor
{
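  public override Expression Visit(Expression node)
  {
    // The base visitor passes null for absent children (such as the
    // target of a static method call), so guard before reading NodeType
    return node != null && node.NodeType == VariableExpression.ExpressionType
      ? VisitVariable((VariableExpression) node)
      : base.Visit(node);
  }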

  protected virtual Expression VisitVariable(VariableExpression node)
  {
    return node;
  }
}

We override the method which visits any given expression and check its node type; if it is a variable, we allow derived classes to inspect it via the VisitVariable method. Otherwise, we let the base class determine how to visit the node. This gives us a context in which we can process calculation expressions.

Finding Variables

The most basic use of CalculationExpressionVisitor is to find all of the variables referenced by a calculation:

internal sealed class VariableSearch : CalculationExpressionVisitor
{
  private ISet<Variable> _variables;

  internal ISet<Variable> GetVariables(Calculation calculation)
  {
    _variables = new HashSet<Variable>();

    Visit(calculation.Expression);

    return _variables;
  }

  protected override Expression VisitVariable(VariableExpression node)
  {
    _variables.Add(node.Variable);

    return node;
  }
}

We create an implementation which exposes a single method named GetVariables; it takes a calculation and returns all of the variables in its expression. It does this by passing the expression to the same Visit method we overrode in CalculationExpressionVisitor, then keeping track of every variable it encounters. This is an incredibly small amount of code to walk any arbitrary expression for a calculation; once again, thanks .NET!
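
To see the search in action without the rest of Grasp, the sketch below assembles the pieces into one runnable program. Variable and VariableExpression here are hypothetical stand-ins for the types from earlier parts, and the real definitions may differ: the node type value 1000 is an assumption, GetVariables takes an expression directly rather than a Calculation, and the visitor includes a null guard for children the base class reports as absent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical stand-in for Grasp's Variable
public sealed class Variable
{
  public Variable(string name, Type type) { Name = name; Type = type; }
  public string Name { get; }
  public Type Type { get; }
}

// Hypothetical stand-in for Grasp's VariableExpression
public sealed class VariableExpression : Expression
{
  // Assumed marker value, chosen to sit outside the built-in node types
  public const ExpressionType VariableNodeType = (ExpressionType) 1000;

  public VariableExpression(Variable variable) { Variable = variable; }
  public Variable Variable { get; }
  public override ExpressionType NodeType { get { return VariableNodeType; } }
  public override Type Type { get { return Variable.Type; } }
}

public abstract class CalculationExpressionVisitor : ExpressionVisitor
{
  public override Expression Visit(Expression node)
  {
    // Intercept variable nodes; everything else (including null) goes to the base class
    return node != null && node.NodeType == VariableExpression.VariableNodeType
      ? VisitVariable((VariableExpression) node)
      : base.Visit(node);
  }

  protected virtual Expression VisitVariable(VariableExpression node) { return node; }
}

public sealed class VariableSearch : CalculationExpressionVisitor
{
  private ISet<Variable> _variables;

  public ISet<Variable> GetVariables(Expression expression)
  {
    _variables = new HashSet<Variable>();

    Visit(expression);

    return _variables;
  }

  protected override Expression VisitVariable(VariableExpression node)
  {
    _variables.Add(node.Variable);

    return node;
  }
}

public static class Program
{
  public static void Main()
  {
    var a = new Variable("A", typeof(int));
    var b = new Variable("B", typeof(int));

    // A + (B * A): A is referenced twice but should be reported once
    Expression expression = Expression.Add(
      new VariableExpression(a),
      Expression.Multiply(new VariableExpression(b), new VariableExpression(a)));

    foreach(var variable in new VariableSearch().GetVariables(expression))
    {
      Console.WriteLine(variable.Name);
    }
  }
}
```

The program prints each of the two variable names once. The set removes the duplicate reference to A because both nodes point at the same Variable instance.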

Validating a Calculation

Now that we know how to determine all of the unique variables referenced by a calculation, we can put that information to use in validating that its structure is correct:

  • All referenced variables must exist in the schema
  • The result must be assignable to the output variable

In order to make these assessments, we first need to associate a calculation with all of its referenced variables. We can call this pairing the schema of a calculation:

internal sealed class CalculationSchema
{
  private readonly Calculation _calculation;

  internal CalculationSchema(Calculation calculation)
  {
    _calculation = calculation;

    Variables = new VariableSearch().GetVariables(calculation);
  }

  internal Expression Expression
  {
    get { return _calculation.Expression; }
  }

  internal Variable OutputVariable
  {
    get { return _calculation.OutputVariable; }
  }

  internal ISet<Variable> Variables { get; private set; }
}

We expose the existing elements of a calculation; we also use our VariableSearch visitor to find all of the referenced variables and expose them. This is the general usage pattern for a visitor: instantiate and use one whenever needed. A visitor encapsulates some algorithm, exposes a single entry point, and is most often used a single time.

With the ability to find all referenced variables, we can begin to flesh out the compiler. The first thing we do is create a schema for each of the calculations:

internal sealed class GraspCompiler
{
  private readonly GraspSchema _schema;
  private readonly ISet<Variable> _variables;
  private readonly IList<CalculationSchema> _calculations;

  internal GraspCompiler(GraspSchema schema)
  {
    _schema = schema;

    _calculations = schema.Calculations.Select(
      calculation => new CalculationSchema(calculation)).ToList();

    var effectiveVariables = schema.Variables.Concat(
      _calculations.Select(calculation => calculation.OutputVariable));

    _variables = new HashSet<Variable>(effectiveVariables);
  }

  internal GraspExecutable Compile()
  {
    // Not quite yet…
  }
}

We also determine the effective set of variables, which includes the variables in the schema as well as all of the calculations’ output variables. By automatically including the output variables, Grasp users don’t have to explicitly include them in the variables they pass to the schema.
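
The effect of combining Concat with a HashSet<> can be sketched in isolation. In this small example strings stand in for Variable instances, and the names are invented. Whether two Variable instances count as duplicates depends on how Variable defines equality, which this post does not show.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Program
{
  static void Main()
  {
    // Strings stand in for Variable instances in this sketch
    var schemaVariables = new[] { "Price", "Quantity", "Total" };
    var outputVariables = new[] { "Total", "Tax" };

    // "Total" appears in both sequences; the set keeps a single copy
    var effectiveVariables = new HashSet<string>(schemaVariables.Concat(outputVariables));

    Console.WriteLine(effectiveVariables.Count);           // 4
    Console.WriteLine(effectiveVariables.Contains("Tax")); // True
  }
}
```

An output variable that the user also declared explicitly does no harm, and one that was left out ("Tax" here) is still part of the effective set.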

Next, we validate all of the calculations before we continue the compilation process:

private void ValidateCalculations()
{
  foreach(var calculation in _calculations)
  {
    EnsureVariablesExistInSchema(calculation);

    EnsureAssignableToOutputVariable(calculation);
  }
}

Ensuring all of a calculation’s variables are part of the GraspSchema we are compiling is straightforward. The key here is that we created a HashSet<> in the constructor to hold its variables, so each lookup is a hash check that takes constant time on average rather than a scan of a list:

private void EnsureVariablesExistInSchema(CalculationSchema calculation)
{
  foreach(var variable in calculation.Variables)
  {
    if(!_variables.Contains(variable))
    {
      throw new InvalidCalculationVariableException(calculation, variable);
    }
  }
}

We also ensure that the result of a calculation’s expression can be assigned to its output variable by using the Type.IsAssignableFrom method:

private void EnsureAssignableToOutputVariable(CalculationSchema calculation)
{
  var variableType = calculation.OutputVariable.Type;
  var resultType = calculation.Expression.Type;

  if(!variableType.IsAssignableFrom(resultType))
  {
    throw new InvalidCalculationResultTypeException(calculation);
  }
}
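
It is worth knowing what Type.IsAssignableFrom accepts before relying on it. The following sketch uses only standard .NET types and is not part of Grasp:

```csharp
using System;
using System.Collections.Generic;

static class Program
{
  static void Main()
  {
    // Reference conversion: a string is an object
    Console.WriteLine(typeof(object).IsAssignableFrom(typeof(string))); // True

    // Interface implementation: List<int> implements IEnumerable<int>
    Console.WriteLine(typeof(IEnumerable<int>).IsAssignableFrom(typeof(List<int>))); // True

    // Identical types are always assignable
    Console.WriteLine(typeof(int).IsAssignableFrom(typeof(int))); // True

    // C#'s implicit numeric conversions are a language feature, not a type relationship
    Console.WriteLine(typeof(double).IsAssignableFrom(typeof(int))); // False
  }
}
```

The last case suggests that, with the check as written, a calculation whose expression has type int would be rejected for an output variable of type double unless the expression converts its result explicitly (for example, with Expression.Convert).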

These conditions guard against the various calculation-related errors that might occur (expression trees take care of validating the structure of the code they represent). We throw custom exception types to facilitate better reporting to consumers of the API. This will also be very useful when we create a UI for building and compiling runtimes (more on that later).

Summary

We created the foundation of the compilation process, GraspCompiler. We also added some infrastructure for visiting variable nodes in expression trees and created a visitor which finds all variable references in a calculation. We then validated that each calculation’s structure is correct.

Next time, we will finally turn a calculation’s expression into executable code.

Continue to Part 7: Compiling Calculations
