The topics below are possible student projects for a coursework masters (MIT program) or MRes degree. Contact the relevant academic (Anthony Sloane, Matt Roberts or Dominic Verity as noted) if you would like to discuss the possibility of undertaking one of these projects. Also contact us if you are interested in a programming languages project but do not see one here that is of interest.
Kiama and Scala-related projects
Kiama is a Scala-based library for software language processing that has been under development at Macquarie since 2008. The aim of Kiama is to make it easier to build software such as compilers, interpreters, static analysis tools, or other software that processes structured information.
These projects would suit students with an interest in languages and programming. Good skills in object-oriented programming and background in language implementation (for example, from COMP332 Programming Languages) will be useful. Scala and Kiama are the implementation platform used in COMP332.
Abstract State Machines (Sloane)
Abstract State Machines (http://www.eecs.umich.edu/gasm/) are a formal method for specifying computation and have been applied to a wide range of problem areas. From our perspective one of the most interesting applications is to the specification of the Java language and the Java virtual machine, published in a landmark 2001 book by Stark, Schmid, and Borger (http://www.eecs.umich.edu/gasm/papers/jbook.html). This project will use a Scala embedding of abstract state machines that we have developed for Kiama to encode all or some of the Java/JVM specification.
Bottom-up Rewriting (Sloane)
There is a quite a bit of theory and some nice tools built around the idea of bottom-up rewriting of trees (sometimes called tree parsing). The idea is to match patterns against a tree in order to "cover" it in the least possible way (relative to some cost measure). One application of this is to select assembly language instructions in a compiler, where the tree is a representation of an expression and the patterns encode the instructions that are available. The cost measure in this case might by the estimated execution time of the chosen instructions. This project will develop and evaluate an extension to the Kiama library that supports bottom-up rewriting.
Concrete object syntax for Scala (Sloane)
When processing abstract syntax trees in Kiama, we must use standard Scala notations for building trees and pattern matching against them to extract components. While the Scala notations are quite terse and convenient, for some applications it is desirable to use concrete syntax to express these trees. For example, instead of saying
Add (Var ("v"), Num (1)) to express an integer addition node in the tree, we might prefer to say something like
v + 1. This kind of notation is called concrete object syntax because it allows us to use the concrete syntax for some language (in this case, the language of arithmetic expressions) to express objects in another language (in this case, Scala). The brackets
... are used in this case to indicate that the code enclosed is in the expression language. Alternative notations are possible. This project will develop an approach and implementation for concrete object syntax support for Scala in Kiama and evaluate it using typical Kiama examples.Most likely the best approach will involve new features of Scala including macros and string interpolation.
Declarative Debugging with Kiama Profiling (Roberts)
Debugging declarative computations (such as pure functional, attribute grammars or term rewriting) is difficult because evaluation order is implicit. It can be difficult, or even impossible, for the developer to determine the actual flow of information in a particular execution. Kiama has sophisticated profiling capabilities which can be used to show the execution flow. In this project, you would expand the capabilities of Kiama's profiling to include
- animations of run-time behaviour
- break-point debugging of run-time behaviour
- different types of static representations of run time behaviour.
These tools will greatly enhance the ability of term-rewriters and attribute grammar authors to debug their code, and would be best-of-breed debugging tools for declarative representations of computation.
Execution Monitoring for Kiama
Execution monitoring is a general term for processes that watch the execution of a program and report about that execution. Included in the term are debugging and profiling. This project will design and build a modern execution monitoring system for the Kiama library. The aim would be to present information about the execution of Kiama-based programs in domain-specific terms, i.e., at the level of the specifications such as grammars and tree structure that a Kiama programmer uses, rather than at the level of basic Scala data and code.
We have previous support with building this sort of system for the Eli language processing system in the Noosa tool. Initially the plan would be to analyse that previous system and decide which ideas can transfer to the Kiama setting. Then we would extend the ideas as necessary to develop a full spectrum tool. Since we use Kiama in our undergraduate unit COMP332 Programming Languages we can use that unit as a testbed for the ideas developed by this project.
Indentation Sensitive Packrat Parsers (Roberts)
Michael Adams has shown how to better integrate indentation sensitivity into a range of parsers. His latest work shows how to do this for PEG parsers. However, this only applies to PEG parsers without memorisation. In this project the student will seek to extend Adams' work to include such "packrat" parsers. The student will implement their work in the Kiama language processing library.
Polymorphic Recursion and Higher Ranked types for Generic Programming (Roberts)
We believe that polymorphic recursion and higher ranked types are necessary to achieve generic programming in statically typed functional programming languages, but no one has conclusively shown this to be so. This project would survey existing approached to generic programming and attempt to discern where both these features are used. If we get a 100% success rate for the chosen approaches we will have created some evidence of our hypothesis.
Scala-based dependently-typed programming language (Sloane)
Interest in so-called dependently-typed programming languages is rising (see also projects under the Software Verification heading). These languages allow many program constraints to be expressed in powerful types so that programs can be statically verified to obey them.
At the moment dependent typing is still a niche concept but languages that support it (e.g., Agda and Idris) are gaining increasing exposure. Dependent typing is also at the core of many proof assistants including Coq and Isabelle. To date, the majority of dependently-typed languages have had a functional programming background and their syntax and semantics is often based on an existing functional language such as Haskell or ML. This project will investigate bringing these ideas more into the mainstream by developing a dependently-typed language based on the ideas of the popular Scala programming language. At least aspects of the Scala syntax would be adopted but each of the main semantic features of Scala would be considered for use in the new language where appropriate. Of particular interest would be an investigation of the utility of an object-oriented view of data as seen in Scala for a dependently-typed system.
Software Development Practices and Tools
Code formatting via Machine Learning: Experiences in Industry (Roberts)
Previous work has claimed that decent code formatters can be generated by machine learning from a corpus of examples. This project will test this assertion by applying the published technique to one or more large industrial code-bases. Full Details...
Lightweight Integrated Development Environments (Sloane)
Integrated development environments (IDEs) are widely used in software development. Unfortunately, it is non-trivial to extend an IDE to add support for new languages, analyses, or code generators. Usually such extensions can only be done by large groups with significant resources or with special access to the internals of the IDE framework such as Eclipse, Netbeans or Intellij IDEA. It is common for IDE support for one language to be largely unusable in the IDE support for another language without significant effort, even if the languages are similar.
This project will develop techniques for building lightweight IDEs where the key components of the IDE are easily replaceable and adapted instead of being part of a large, complex infrastructure. The aim is to make it easy for developers to add their own IDE operations that cooperate with others without having to invest large amounts of time becoming experts in a large IDE infrastructure. A prototype will be built to enable our Kiama language processing library to be used to help build these IDE components.
Formal Verification of Programs and Languages
Formal verification involves using discrete mathematical notations to specify what software is supposed to do. Tools such as proof assistant, checkers or theorem provers are then used to formally check that the software specifications have desired properties and establish a connection between the specification and an implementation.
These projects aim to conduct real verification of non-trivial software. Some prior experience with software verification of some kind is useful. For example, students wishing to undertake these projects should strongly consider taking the MRes unit current called COMP796 Advanced Topics in Theory and Practice of Software which teaches the Coq proof assistant.
Domain Specific Verified Programming Languages (Roberts)
Domain specific languages aim to make programming more accessible by providing specialised languages that are "smaller" and more specific than otherwise general purpose languages. Languages for which algorithms can be verified are even harder to use than general purpose languages, so its really important that we can write such domain specific verified languages. The project is the first part of our ongoing work to ensure Kiama is capable of building verified languages. In this project you will re-implement Cogent in Kiama (lets call it Kogent) to highlight any existing deficiencies.
Verification of Software Language Processing (Sloane)
Many systems and tools exist to build software language processors such as formatter, compilers and static analysis tools from high-level notations. E.g., we can use techniques such as attribute grammars to describe semantic analyses such as name and type analysis. We are developing our Kiama library to implement these kind of notation (see above).
But how do we know that our attribute grammars etc are doing the right thing? We can test, but can we prove that they are correct ? What does it mean for them to be correct? What is the specification? For name analysis we might specify that various static properties of name scoping would be obeyed such as constraints on the way in which nested scopes are handled.
This topic will investigate this area by developing specification techniques for language processing tasks. We will connect these specifications to a formal proof system such as Coq to enable us to write language processing implementations that can be proved to be correct. Ideally we would obtain a way to formally prove that Kiama-based components were doing the correct things.
Evaluating Dependently Typed Programming in AGDA (Verity)
On the one hand, AGDA is a dependently typed functional programming language, which owes much of its design to popular mainstream strongly typed languages such as ML and Haskell. On the other, AGDA is a proof assistant in which users may formulate mathematical propositions and construct machine verified proofs.
In the past dependently typed languages have mainly been used to construct formal mathematical proofs. For example the Coq language language has even been used (by Georges Gonthiers of Microsoft Research) to provide a completely machine checked proof of the infamous four colour theorem. Consequently, they have tended to be designed with the needs of mathematicians rather than programmers in mind. The AGDA designers, however, have consciously taken a different approach and have designed their language to meet the needs of functional programmers first and foremost. As such, it is claimed that AGDA provides a more comfortable and flexible environment for the development of formally verified programs.
Now this is an important claim, since formal program verification is known to be a very difficult art to master while being a vitally important component of mission critical applications development. After all, who wants to live next to a nuclear reactor whose software has not been engineered to the very highest standard or trust their life to an aircraft whose fly-by-wire systems have not undergone the most rigorous formal analysis?
We would like to examine the extent to which AGDA lives up to its design goals. In particular, we propose to put AGDA through its paces by using it to develop some simple verified software examples drawn from a range of different application areas.
The outcome of such a project would be a fully documented and verified piece of example software written in AGDA. Beyond the development of this code, you will be required to document your expected outcomes, comment fully upon any engineering decisions you made or pitfalls you may have encountered and describe the extent to which your code met your initial expectations. Examples of the kinds of example developments that we are thinking of may be found at http://web.student.chalmers.se/groups/datx02-dtp and at http://web.student.chalmers.se/~stevan/types.
If you would like to find out more about AGDA itself before picking this project, you might like to download its implementation from http://wiki.portal.chalmers.se/agda/pmwiki.php?n=Main.HomePage and have a go at working through a little of Ulf Norell's tutorial at http://www.cse.chalmers.se/~ulfn/papers/afp08/tutorial.pdf.
The Hidden cost of Pattern Synonyms
It is generally understood that pattern synonyms can be implemented in GHC with zero cost as long as suitably aggressive inlining and the case-of-case optimisation is used. We think there are some edge cases where this is not true. In this project we will identify these edge cases and quantify the run-time cost of pattern synonyms in this case. It is hoped this research will shed light on advanced pattern matching algorithms in modern FP and OO-FP hybrid languages such as Haskell, ML, Scala and F#.
Do Pattern Synonyms Subsume Scala's
Scala's pattern matching is not compiled, it is interpreted. Haskell with pattern synonyms is compiled. If pattern synonyms subsume enough of Scala's pattern matching, we have a path to compiled patterns for Scala (or for dotty).