Monday 19 September 2011

Calling Lisp with MathLink

My final example of using MathLink is a tutorial on calling Lisp code from Mathematica.  The implementation I will show is probably quite crude to a computer scientist.  However, to someone interested in producing a working solution, I think it is useful.  The method is sufficiently general to allow Mathematica to call any executable file, which makes it useful for calling code written in a language that is not easy to integrate with C.

One language that is not very easy to call from C is Lisp.  To be clear, it is possible to use a foreign function interface to call Lisp from C.  However it is very difficult.  Limiting myself to the open source Lisp compilers I found no easy option.  I did find the following quote from the Steel Bank Common Lisp documentation,

"Calling Lisp functions from C is sometimes possible, but is extremely hackish and poorly supported"

Thus I was searching for an easy way that gave me results.  I also felt that if one were to work out how to write some particular program in Lisp, that C could call, then the problem would not be completely solved.  More complicated data structures, or any modest change to the Lisp program, could require more hacking to make the C calls work okay.  My method, whilst less elegant and a little slower, does not suffer from these problems.

First of all I will run through an example program, written in Lisp, that can be called by C as a system call.  Then, using MathLink, I will show how to make Mathematica call the C code and return the output from the Lisp program to the Mathematica program.  I wish to emphasise that this method is not in any way particular to Lisp; it will work for any executable file.  After I have given an example I will briefly discuss the efficiency of the program.  I will attempt to argue that the solution I propose is quick enough for many applications.

Below is some Lisp code for a factorial function,

(defun factorial (x)
  (if (<= x 1) 1 (* x (factorial (1- x)))))

(defun main (argv)
  (with-open-file (s "$Path/output.dat" :direction :output :if-exists :supersede)
          (format s "~A" (factorial (read-from-string (first (rest argv)))))))

This code performs two basic tasks.  Firstly it takes the argument supplied at run time, found in the list argv, and, secondly, it writes the factorial of that argument to the file located at $Path/output.dat.  I will not explain the complete details of the Lisp code.  What is important is that the code computes functions of variables supplied at run time and writes the result(s) to a file.  If such an arrangement can be realised in any language then this tutorial shows how to run such code from Mathematica.  One important note is the location of the output file, here called output.dat.  When the program is run from Mathematica the working directory will generally not be the one you expect.  Therefore it is a good idea to specify all paths in full in all files/code.  This avoids bugs where files cannot be found when the code is run from Mathematica.

Now I will take a short detour to explain how to obtain executable files from Lisp.  If you are not interested in Lisp then please skip on to the C code.  My solution to creating Lisp executables was to use the program BuildApp.  You can obtain BuildApp from its website.  There are other options available to create executable files from Lisp.  After installing BuildApp you need to type in to a terminal,

terminal $ buildapp --entry main --output factorial --load factorial.lisp

What this does is to compile the file factorial.lisp into an executable called factorial.  The flag --entry main is a technical point that tells the program how the executable will be run, here it is as a top-level function.  The effect of doing this is that, using the Lisp code given, one can now type in to a terminal,

terminal $ ./factorial 5

and the output, of 120, will be written to a file.  This is the first goal for any language, Lisp or otherwise (that can not be embedded into C): producing an executable that can receive arguments from a terminal, as shown above.  How to do this depends on the compiler and language used.

Next some C code is required that MathLink can talk to.  From previous blogs we know that what is required is to write a C function that takes the arguments and returns the relevant output.  I wrote the code,

#include <stdio.h>
#include <stdlib.h>
#include "mathlink.h"

int f(int x) {
    FILE *fp;
    char fstr[100];
    int i = 0;

    /* Build the command line; snprintf cannot overflow fstr. */
    snprintf(fstr, sizeof fstr, "$Path/factorial %d", x);
    if (system(fstr) != 0)
        return 0;   /* the executable did not run successfully */

    fp = fopen("$Path/output.dat", "r");
    if (fp == NULL)
        return 0;   /* the output file was not found */

    if (fscanf(fp, "%d", &i) != 1)
        i = 0;      /* the file did not contain an integer */
    fclose(fp);

    return i;
}


int main(int argc, char *argv[]) {
    return MLMain(argc, argv);
}

As before you should recognise the MathLink code at the very bottom and the mathlink.h header file.  In the function f the integer x is converted to text and appended to the path of the executable, here called factorial, by snprintf(fstr, sizeof fstr, "$Path/factorial %d", x);.  The call system(fstr); then hands the string fstr to the shell, which will (try to) execute it.  A return value of 0 from f signals that something went wrong, since no factorial is 0.  In short, the C code runs the command,

terminal $ $Path/factorial x

where $Path is the complete path to the executable factorial and x is the variable whose factorial we are computing.  The output file now holds the factorial of x.  The C function, f, reads the file output.dat and returns the number it finds.  I admit this is a crude implementation but its simplicity means that it does not matter how the executable was written.  It could be a terminal script, or anything.  This means that the solution I have written really is very powerful.
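A variation on the same idea, for executables that print their answer rather than write it to a file, is to read the program's standard output directly.  The following is only a minimal sketch under that assumption, and it is not the method used above.  It relies on the POSIX functions popen and pclose, and the helper name run_and_read_long is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: run a shell command and parse one integer from
   whatever it prints.  Returns 0 on success and -1 on any failure. */
int run_and_read_long(const char *command, long *out)
{
    assert(command != NULL && out != NULL);

    /* popen starts the command through the shell and gives us a stream
       connected to its standard output. */
    FILE *pipe = popen(command, "r");
    if (pipe == NULL)
        return -1;

    int fields = fscanf(pipe, "%ld", out);   /* 1 if an integer was read */
    int status = pclose(pipe);               /* 0 if the command succeeded */

    return (fields == 1 && status == 0) ? 0 : -1;
}
```

With a helper like this, the body of f would reduce to building the command string and calling run_and_read_long on it, with no output file to locate.  The price is that the executable must be changed to print its result.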

There is now the formality of writing a template and compiling the template and C code together.  Here we can use the script mcc.  The template for the function f is,

:Begin:
:Function:      f
:Pattern:       f[x_Integer]
:Arguments:     {x}
:ArgumentTypes: {Integer}
:ReturnType:    Integer
:End:

Saving this file as f.tm it is just a task of using mcc.  In to a terminal type,

terminal $ mcc f.tm f.c -o result

Installing this to Mathematica gives the ability to call the Lisp program.

I'd like to close with some general comments on efficiency and the power of this solution.  Firstly the solution is crude and slow.  However the speed is lost in starting the executable and in writing and reading the file.  These tasks are performed only once per call.  The cost will only be a small fraction of a second.  So in a program that takes a minute to run, for example, one would not mind wasting a quarter of a second to be able to code in any language whatsoever.  Such freedom might well allow one to use a program that is many hours faster than an equivalent Mathematica program.  Or perhaps the ability to code in some particular language, as compared to C, might offer a solution to the problem in days rather than weeks.  As a result I do not consider the efficiency loss as an issue for most situations.  I also consider the code I have given to be extremely flexible and powerful to a range of users.
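If you would rather measure the overhead than take my word for it, something like the following sketch could be used.  It times a single system call with the POSIX function clock_gettime.  The helper name time_command_seconds is made up for illustration, and the figure you get will depend entirely on your machine and on the executable being run.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: return the wall-clock time, in seconds, taken by
   system(command), or -1.0 if the command could not be started. */
double time_command_seconds(const char *command)
{
    assert(command != NULL);

    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    int status = system(command);
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (status == -1)
        return -1.0;

    /* Combine the whole seconds and the nanosecond remainder. */
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling time_command_seconds with the same string that f passes to system gives the cost of one round trip, apart from reading the output file.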

Friday 16 September 2011

Calling FORTRAN Code with MathLink

This post will explain how to use MathLink to call FORTRAN code.  I foresee three reasons this may be interesting.  Firstly, people still use FORTRAN code!  Secondly, it will demonstrate how to interface Mathematica, through MathLink, with any programming language that C can call.  Thirdly, the post will explain more about how MathLink works and will allow users to accomplish more sophisticated implementations.  I will rely on ideas developed in previous blog entries.

Here I will outline calling a simple program, adding two numbers together.  A FORTRAN (95) subroutine that achieves this is,

subroutine addtwof(i,j,k)
implicit none
integer::i,j,k
k=i+j
end

This subroutine is saved in a text file, which I have called addtwof.f95.  It is possible to write C code that calls this subroutine.  Notice the FORTRAN code should not be a complete program, but needs to be a subroutine.  The C code to call a FORTRAN subroutine is compiler dependent.  However there are two common naming conventions.  Using the GNU compilers the syntax is,

addtwof_(&i, &j, &result);

Omitting the underscore, so just addtwof(&i, &j, &result);, is the other common convention.  Information on which to use should be available in your compiler's documentation.  One technical difference between C and FORTRAN is how arguments are passed.  C passes arguments by value, so a function receives a copy of each number.  FORTRAN passes arguments by reference, so a subroutine receives the memory location of each variable.  As a result the ampersand (&) is used to make C send FORTRAN the address of the variable, not a copy of its value.  This is why you will see &i, not just i, and it is also how the subroutine is able to store its answer in result.  There is a second important difference.  FORTRAN is not a case-sensitive language whilst C is.  Be very careful about using mixed-case names; better yet, only use lower-case letters.  The important lesson here is that calling one language from another requires thinking carefully about the differences between the languages and treading carefully around them.
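To see the role of the ampersands without needing a FORTRAN compiler, here is a small sketch in which the subroutine is replaced by a stand-in written in C.  The stand-in is an assumption made purely for illustration; it has the same shape that a C program sees when it calls the real addtwof_, namely three addresses.  The wrapper name add_via_subroutine is also made up.

```c
#include <assert.h>

/* Stand-in, written in C, for the FORTRAN subroutine: it receives the
   addresses of i, j and k and stores the sum at the address of k. */
void addtwof_(int *i, int *j, int *k)
{
    assert(i != 0 && j != 0 && k != 0);
    *k = *i + *j;
}

/* Hypothetical wrapper: pass addresses with &, as FORTRAN expects, and
   read the answer back from the variable the subroutine filled in. */
int add_via_subroutine(int i, int j)
{
    int result = 0;
    addtwof_(&i, &j, &result);
    return result;
}
```

Writing addtwof_(i, j, result) instead would hand over the numbers themselves where addresses are expected, which a C compiler will reject or warn about once the subroutine has been declared.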

Now it is possible to write a simple C program that is usable by MathLink.  The code I will use is,

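#include "mathlink.h"

/* The FORTRAN subroutine, declared so that C knows its argument types. */
void addtwof_(int *i, int *j, int *k);

int f(int i, int j) {
    int result;
    addtwof_(&i, &j, &result);
    return result;
}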

int main(int argc, char *argv[]) {
    return MLMain(argc, argv);
}

From my previous blog you should recognise the mathlink header file and the MLMain call at the bottom.  In the function f one can see the FORTRAN call, complete with ampersands. 

In general, if a language can be called by C, and you can make it all happen, then you will just need an analogous program to the one here and MathLink will work.  Whilst the example here is dealing with FORTRAN there are many other languages that C can call directly.

There is one more file that is needed, a template file.  As the program adds two integers the template is the same as the C only implementation.  For completeness here it is again,

:Begin:
:Function:      f
:Pattern:       f[x_Integer, y_Integer]
:Arguments:     {x, y}
:ArgumentTypes: {Integer, Integer}
:ReturnType:    Integer
:End:

I have called this file f.tm.  (Again, the .tm extension is important.)

At this point the three files, addtwof.f95, f.tm and the C program, f.c, must be compiled and linked.  Previously the program mcc was used.  It is not possible to use mcc here because we need to link the FORTRAN code to the C.

Linking FORTRAN with C, or any language with C for that matter, is done through object files.  All the files in a project are converted to object files, with a compiler, and are usually given .o extensions.  I will follow this convention.  A C compiler then links all the individual object files together into one executable.  Firstly I will outline how to generate the object files, using the GNU compilers.

In the case of the FORTRAN subroutine, type into a terminal,

terminal $ gfortran -c addtwof.f95 -o addtwof.o

This will compile the FORTRAN code in to an object file.  It is the flag -c which ensures this.  I have opted to name the output addtwof.o.  Next the C program must be converted to an object file.  Type in to a terminal,

terminal $ gcc -c f.c -o f.o

All of the same comments for the gfortran call apply.  The -c flag provides an object file, here named f.o.  Now the tricky object file, the one from the template file.  To do this requires the Mathematica program mprep.  This will convert the .tm template file into C code.  The C code is passed to gcc to give an object file and we have three object files.

I found the mprep program in the SystemFiles directory of my Mathematica installation.  Specifically it is in the .../SystemFiles/Links/MathLink/DeveloperKit/architecture/CompilerAdditions directory, where architecture is either the 32-bit or 64-bit sub-directory, depending on your installation and OS.  In the folder you will find several useful things: the program mprep, the script mcc (I have referred to it as a program before but it actually is a script) and some libraries.


To use mprep you have several options.  One is to type in to a terminal the location of mprep and the location of the template file.  This would look something like,

terminal $ /usr/local/Wolfram/.../CompilerAdditions/mprep /home/user/.../f.tm -o f.tm.c

which would convert the .tm template file in to a f.tm.c C code file.  Or, if this is too unwieldy, copy mprep and the libraries into the location where the template file is.  Then one can just type in to a terminal,

terminal $ ./mprep f.tm -o f.tm.c

Finally you could copy the template file to the mprep directory but that could get messy if many MathLink projects are undertaken.  Now we can use gcc to generate an object file from the template C code, f.tm.c.  This is done like before,

terminal $ gcc -c f.tm.c -o f.tm.o

Now we have three object files, one from the FORTRAN code, one from the C code and one from the template file.  To link them all together use gcc again but with a specific set of libraries.  Into a terminal type,


terminal $ gcc f.tm.o f.o addtwof.o libML64i3.a -lm -lpthread -lrt -lstdc++ -o program

This will output an executable, here called program.  The library arguments are quite complicated.  Of the 5 libraries, 4 are standard system libraries and are the ones beginning with -l.  The remaining one, libML64i3.a, is supplied by Mathematica and comes in two versions.  The above command uses the 64-bit version; there is also a 32-bit version called libML32i3.a.  Both are in the same location as the mprep file.  The command I have given assumes the libraries have been moved to the directory where all the object files are.  You will need to tweak the terminal commands or copy files around, as is appropriate.

The above procedure is what the mcc script does internally.  Passing the template file and C code to the mcc script causes several things to happen.  Firstly mcc works out where the mprep program and all the libraries it needs are, which architecture it is running on and which C compiler it has to use.  Then it starts mprep and the C compiler as we have done.  In fact, if you can not find certain files, you can open the mcc script (it is easier to find because it is also in a folder with the other Mathematica executables) and read where files are and generally how mcc works.

If you have got this far then it is easy to use your executable in Mathematica with the Install command as before.

By studying this tutorial and changing the relevant code it should be simple to have Mathematica call a C program that calls a very large range of languages.  It should be possible to produce very powerful tools using this method.

Thursday 15 September 2011

A Simple Example of MathLink

This post shows how to use MathLink in the simplest scenario.  It will show how to make Mathematica run a C program that takes two inputs and returns their sum, all using MathLink.

A MathLink project consists of a minimum of two files.  One is the C program itself and the second is a template file that tells Mathematica how to interact with the C program.  To join the files together there is a program called mprep, supplied by Mathematica, which converts the template to C code.  In this post I will not use mprep but instead use the program mcc which makes use of mprep and is simpler.  An explanation of how to use mprep will be given in a post about calling FORTRAN with MathLink.


Now let us begin with the code.  Begin by writing a C function that solves the required problem.  Here I will tackle the challenge of adding two integers, x and y.  An appropriate C function is,

int f(int x, int y) {
    return x+y;
}

To produce a complete C program there are two further components, both of which are standard for any MathLink exercise.  One is a header file "mathlink.h" and the second is the main body of the program which is supplied by Mathematica.  It is,

int main(int argc, char *argv[]) {
    return MLMain(argc, argv);
}

Putting all three components together gives the complete C program,

#include "mathlink.h"

int f(int x, int y) {
    return x+y;
}

int main(int argc, char *argv[]) {
    return MLMain(argc, argv);
}

Next for the template file.  There is a guide to writing the template file in the Mathematica documentation.  That guide also runs through the example solved here.  The template file is quite self-explanatory.  It explains that the function is called f, requires two integer inputs, x and y, and will return an integer.  Here it is,

:Begin:
:Function:      f
:Pattern:       f[x_Integer, y_Integer]
:Arguments:     {x, y}
:ArgumentTypes: {Integer, Integer}
:ReturnType:    Integer
:End:

Place this code in to a file which must have a .tm extension.  I will be using f.tm.  My C program will be called f.c.  To compile the code, open a terminal and type,

terminal $ mcc f.c f.tm -o output

The -o flag is optional and just controls the name of the resulting executable file as usual.

It is now possible to use the function f in Mathematica.  After opening Mathematica, one needs the command Install[argument] where the argument is the path to the executable, here called output, that has been generated.  After the command is executed Mathematica should be able to use f.  To try it one can type f[2,3] and Mathematica should reply 5!

What MathLink does is to supply the C function MLMain(argc, argv).  The arguments argc and argv are the usual command-line arguments of a C program; MLMain uses them to set up the connection to Mathematica and then waits for calls.  The name on the :Function: line of the template must match the name of the function in the C code, because the code generated from the template calls that function.  The name used inside Mathematica is set by the :Pattern: line.  (So one can name the C code file and the template file as one wishes.)  When the function is installed in Mathematica, and then called, Mathematica sends the arguments of the function over the link.  All of this is unravelled in the C program to call the corresponding function.
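To give a feel for the kind of glue code a template stands for, here is a purely conceptual sketch.  It is not the code that mprep generates: the table, the struct and the helper call_by_index are all made up for illustration, and the arguments arrive in a plain array rather than over a link.  It only shows the general idea of a table that pairs each template entry with the C function to call.

```c
#include <assert.h>
#include <stddef.h>

/* The function being exported, as in the complete C program above. */
int f(int x, int y)
{
    return x + y;
}

/* One table entry per template: the pattern text and the C function. */
struct entry {
    const char *pattern;
    int (*fn)(int, int);
};

static const struct entry table[] = {
    { "f[x_Integer, y_Integer]", f },
};

/* Hypothetical dispatcher: look up entry number index, unpack the two
   integer arguments and call the function.  Returns 0 on success and
   -1 if there is no such entry. */
int call_by_index(size_t index, const int args[2], int *result)
{
    assert(args != NULL && result != NULL);
    if (index >= sizeof table / sizeof table[0])
        return -1;
    *result = table[index].fn(args[0], args[1]);
    return 0;
}
```

In this picture the table is why the function name in the C code and in the template have to agree: the glue can only call a function that really exists under that name.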

Adventures in MathLink

Recently I became interested in MathLink, a feature in Mathematica allowing for external calls to C programs.  There is also a J/Link project allowing Java to interface with Mathematica.

In simple terms, MathLink allows Mathematica to pass some arguments to a C program, run the program and return the results.  An example use of such a feature might be a numerical project.  Suppose one is interested in a numerical problem and would like to develop tools to perform simulations.  Typically Mathematica is slow at large-scale numerical work, or perhaps a particular C library would be very useful.  Either way, one decides C code would be useful.  Using MathLink one could develop a toolkit of functions combining the efficiency of C code with the analytic tools and user interface of Mathematica.  Clearly this would be very useful to some people.

It is also possible to use MathLink in reverse, that is to run Mathematica as a subroutine inside a C program.  I did not explore this option.

To learn about MathLink I set myself three projects.  Firstly writing a simple proof-of-concept program in C that Mathematica could call.  Secondly writing a C program that itself calls a FORTRAN program.  Mathematica would then be able to call the C program, which would use a FORTRAN program to perform a computation, and return the results to Mathematica.  This may be useful to remaining FORTRAN users and it is a simple language for C code to call.  Thirdly, and most ambitiously, I wanted to produce a MathLink solution that allowed Mathematica to call Lisp code, again through a C program.

Each of the three tasks will make up further blog entries.