C# and Visual Basic include a nice feature called lambda expressions. With this feature you can define dynamic functions without having to know how the function will look like in the end. A common use case for this feature is parsing mathematical terms. In this post I’d like to introduce you to lambda expressions by implementing such a parser.

# Lambda Expressions

So let’s begin by having a closer look at lambda expressions. Normally, when you define a function, this looks somewhat like that:

double Plus(double a, double b) { return a + b; }

You then can call this function by its name providing the two parameters. The other way to call functions is using lambda expressions. Here’s what that will look like:

Func<double, double, double> Plus = (a, b) => a + b;

You can call this Function exactly the way you would call the first one. The left side of the “=” is not really the interesting part of that line. We just declare a variable “Plus” of type “Func”. This type is a function that takes two double parameters (the first two type parameter) and returns a double value (the last type parameter). More interesting is the right side. This is the actual definition of the function. It starts with a list of identifiers enclosed in parentheses followed by the keyword “=>”. At last there is the function’s body. So the function is a mapping that maps a tuple (a, b) to the value a + b. In the declaration of the variable we specified that a, b and the return value are doubles. The function is an anonymous function for it has no public name. Only a delegate to this function is saved in the variable “Plus”.

Of course you can define functions with a various number of arguments and different types. The function’s body could also be a whole code block enclosed in {…} But this option does not play a big role for our Math Parser, so it will not be explained in detail.

# Mathematical fundamentals of terms

Terms are just meaningful sequences of mathematical symbols. These symbols could be numbers, variables, operators or parentheses. Let’s consider the following term:

4 * sin(x + 1) + 3

So how must this term be evaluated? Firstly we must evaluate the sub term x+1. After that evaluate the sinus function, then multiply it by 4 and at last add 3. If we look at that the other way round, we can say that the term is an addition of a sub term and 3. The sub term is a multiplication of 4 and another sub term and so on. This can be visualized by an evaluation tree like this:

As you see there are operators of different cardinality. There are binary operators like the +, unary operators like the sin and operators that take no argument. These symbols like x are usually not called operators but they definitely are. I’ll call them constant operators.

The figure describes the parsing strategy pretty well: Take the operator with the least priority and construct the resulting function from this operator providing the parsed arguments. This can be done in a recursive way. So let’s see how this can be implemented.

# Implementation

A good idea would be to define the supported symbols. Therefore I created a common abstract class for every operator. Concrete operators do not derive from that class directly. There are abstract classes for constant, unary and binary operators. Each concrete operator defines the symbol it is represented by and the function that it means. Here’s an excerpt showing all relevant classes for the + operator:

abstract class Operator { protected string _Symbol; public string Symbol { get { return _Symbol; } } } abstract class BinaryOperator : Operator { public abstract Func<double, double> GetFunction(Func<double, double> LeftArgument, Func<double, double> RightArgument); } sealed class PlusOperator : BinaryOperator { public PlusOperator() { _Symbol = "+"; } public override Func<double, double> GetFunction(Func<double, double> LeftArgument, Func<double, double> RightArgument) { return x => LeftArgument(x) + RightArgument(x); } }

The most important method is the GetFunction(). This method takes the left and right argument as already parsed functions and combines them by adding them. So if the program will evaluate a term with a + in it, firstly the left argument will be evaluated with the argument x, then the right argument and at last the results will be added.

Furthermore the program will need the priorities of the operators. Therefore we use an array of all recognizable operators. The elements of the array are ordered by their priority:

static Operator[] RecognizableSymbols = { new XOperator(), new PlusOperator(), new MinusOperator(), new MultiplyOperator(), new DivideOperator(), new PowerOperator(), new SinOperator(), new CosOperator(), new SqrtOperator() };

## Extracting the operators

As mentioned earlier terms are sequences of operators. In order to evaulate a term, we first have to extract the operators from the String representation of the term. After extracting the operators, parsing the term is much easier. So the main parse function is very short:

public static Func<double, double> ParseTerm(string term) { var Components = SplitTerm(term); return ParseComponents(Components); }

The term is split into its components (i.e. operators) and those components are delegated to a function that parses just the operators. Let’s have a look at how splitting the term into operators works.

static List<Operator> SplitTerm(string term) { List<Operator> result = new List<Operator>(); int index = 0; while (index < term.Length) { bool found = false; foreach (var item in RecognizableSymbols) { if (item.Symbol.Length > 0 && index + item.Symbol.Length <= term.Length && term.Substring(index, item.Symbol.Length).Equals(item.Symbol)) { result.Add(item); index += item.Symbol.Length; found = true; break; } } if (found) continue; if (term.Substring(index, 1).Equals(" ")) { index++; continue; } } }

This is just the basic function. It will be extended later. But let’s examine it first. The variable “index” defines the position of the next char to be examined. As long as this position is within the term, the char will be examined. Otherwise the gathered list of operators is returned. The foreach statement iterates over all recognizable operators and checks if there is an operator at the current position. If so, the operator is added to the temporary list and the index is advanced. The while statement is continued at the next position. Whitespaces are ignored if there are any in the provided term. If no operator is found at the current position then an exception is thrown. Let’s tweak this routine a bit. An elementary part of terms are parentheses. I decided to handle terms enclosed in parentheses as separate operators. Therefore I created the ParenthesisOperator class. This is a constant operator and takes the function that it stands for in its constructor. If there is an opening parenthesis at the current position, the according closing one is searched and the parenthesis operator is added to the list:

if (term.Substring(index, 1).Equals("(")) { int Parenthesis = 1; int index2 = index; while (Parenthesis > 0) { index2++; if (index2 >= term.Length) throw new ArgumentException("Provided term has an invalid format"); if (term.Substring(index2, 1).Equals("(")) Parenthesis++; else if (term.Substring(index2, 1).Equals(")")) Parenthesis--; } result.Add(new ParenthesisTerm(ParseTerm(term.Substring(index + 1, index2 - index - 1)))); index = index2 + 1; continue; if (term.Substring(index, 1).Equals(")")) throw new ArgumentException("Provided term has an invalid format"); }

The variable Parenthesis holds the number of open parentheses. We cannot just take the next closing parenthesis because there could be a parenthesis term in the parenthesis term. Therefore we search the appropriate closing one with the Parenthesis variable. The function of the parenthesis operator is evaluated by an additional call to ParseTerm providing the term enclosed.

The last thing to do is to recognize numbers in the term. If the char at the current position is a number, we try to increase the size of the number as much as possible. When the resulting string cannot be interpreted as a number any more, the size of the number is decremented once and the according number is added to the list:

double number; if (Double.TryParse(term.Substring(index, 1), out number)) //Next char is a number { int length = 2; while ((index + length <= term.Length && Double.TryParse(term.Substring(index, length), out number))) length++; length--; Double.TryParse(term.Substring(index, length), out number); result.Add(new NumericOperator(number)); index += length; LastSymbolWasNumeric = true; continue; }

That’s basically the splitting method. Only on last tweak. Sometimes it is possible to leave the * operator away. For example in the term 2x which should be interpreted as 2*x. This can be achieved by adding the MultiplyOperator when necessary. Therefore we save if there was a number at the last position in the variable LastSymbolWasNumeric. The * can be left out before parenthesis terms or before constant or unary operators. So we’ll add some code to insert the missing operator if necessary:

if (LastSymbolWasNumeric && !(result.Last() is BinaryOperator)) { result.Insert(result.Count - 1, RecognizableSymbols.Where(o => o.Symbol.Equals("*")).First()); } LastSymbolWasNumeric = false;

## Parsing the components

We’re almost done. Everything that is left is the ParseComponents() method. For we implemented the functionality of the operators in their classes, this method will not be very big. The first thing is to check, if there is only one component provided. If so, we can just return the component’s function:

static Func<double, double> ParseComponents(List Components) { //If term is only a single symbol it must be a constant operator if (Components.Count == 1) { if (Components[0] is ConstantOperator) return (Components[0] as ConstantOperator).GetFunction(); else throw new ArgumentException("The provided term has an invalid format"); } }

If there is more than one component we need to pick the one with the least priority, get the necessary arguments for that operator and return its function:

//Term is a composite //search the symbol with the least priority foreach (var symbol in RecognizableSymbols) { if(!(symbol is ConstantOperator) && Components.Contains(symbol)) { int n = Components.LastIndexOf(symbol); var LeftPart = new List<Operator>(); var RightPart = new List<Operator>(); for (int i = 0; i < n; ++i) LeftPart.Add(Components[i]); for (int i = n + 1; i < Components.Count; ++i) RightPart.Add(Components[i]); if (symbol is MinusOperator && LeftPart.Count == 0) LeftPart.Add(new NumericOperator(0)); if (symbol is UnaryOperator) return (symbol as UnaryOperator).GetFunction(ParseComponents(RightPart)); if (symbol is BinaryOperator) return (symbol as BinaryOperator).GetFunction(ParseComponents(LeftPart), ParseComponents(RightPart)); } } throw new ArgumentException("Provided term has an invalid format");

This snippet contains a tweak that allows the – without a left argument. If so it is a sign and a 0 is inserted before it.

## Sample project

That’s all. With lambda expressions it is really easy to implement a math parser. In the sample project I included a small WPF application that plots a diagram of a term you can insert. Just type the term, hit return and the diagram is plotted. The x value goes from -5 through 5. The y value is adjusted according to min and max value.

Here you can download the complete project: Download Zip