People tracking along will know that I am in the process of writing an LLVM backend for Moscow ML. There are several ways in which one can do this, so one has to be decided upon and tried out. It is however clear we need a way to manipulate and output LLVMs IR, so I set upon writing an set of bindings.
These bindings are not using an FFI to call into the C++ LLVM code. Rather they produce the IR assembly from an internal SML-like data structure. This approach has the disadvantage that it is more work to maintain. On the other hand, the OCaml bindings passes around llvalues all over the place so very few bugs will be captured at compile time (we could do oh so much better had we had dependent types).
I have written LLVM as a datatype in SML and then written about 40 percent of a type checker for it. The process is very simple as soon as one has converted LLVM rules into SML datatypes. Here is a part of the Type module
datatype t =
T_Integer of int
| T_Real of float_ty
| T_Fun of {return: t, params: t list}
| T_Struct of {elements: t list, packed: bool}
| T_Array of {ty: t, length: int}
| T_Pointer of t
| T_Vector of {ty: t, length: int}
| T_Opaque
| T_Void
| T_Label
| T_Top
which is a straightforward adaptation of the Language Reference of LLVM. I could simplify it more, but at the moment there is little need for that.
Type checking is carried out by a set of combinators. They probably form a monad (Haskell guys can relate this to a variant of the Either monad with a Writer on it as well. It should be straightforward to write it down as one or use a transformer stack to achieve the same. I have not given it much thought though).
The checking basis is the datatype
datatype t_check = Ok | Wrong of string list
Immediately, a helper is introduced
fun type_fail str = Wrong [str]
And a function for running a checker
fun run checker_fun ty =
case checker_fun ty of
Ok => ()
| Wrong errors =>
<<Print out type checking errors>>
Note that run will have type (ty -> t_check) -> ty -> unit. The question is then how to produce type checkers. This is done with combinators but of course:
(* Logical or for types, consider monadizing *)
fun or checker1 checker2 ty =
case checker1 ty of
Ok => Ok
| Wrong err1 => case checker2 ty of
Ok => Ok
| Wrong err2 => Wrong (List.concat [err1, err2])
The 'or' checker tries the first checker and if it fail it tries the second, picking up any errors on the way. It is also easy to assert that a given type must be the base type of a vector type:
fun vectorized checker1 vty =
case vty of
T_Vector {ty, ...} => checker1 ty
| _ => type_fail "Checked type is not a vector type"
Perhaps one could be giving a better error message here by capturing the eventual error output from checker1, but this has not been done yet.
Primitive checks are just doint what they are supposed to do. The assertion for an integer looks like the following:
fun assert_int ty =
case ty of
T_Integer _ => Ok
| _ => type_fail "Type is not of integer type."
Usage
An addition in LLVM is either an integer, a float or a vector of these. Aha:
val assert_int_float = or assert_int assert_float
val assert_add = or assert_int_float (vectorized assert_int_float)
Hopefully this shows the succinctness of the approach.
EDIT: Layout was bad. Fixed
