Research Overview


My research interests concentrate on the theoretical foundations and effective engineering of Enterprise Systems: large-scale, distributed, business-critical systems that extend across and between organisations. This has led to research contributions in Cloud computing (University of Melbourne); open, scalable infrastructure for the National Health Service (Oxford University); service-oriented architectures and Web service choreography (School of Informatics, University of Edinburgh); and optimising data movement in terabyte-scale repositories (National e-Science Centre). Complementing my academic experience, I have worked on projects at two industrial research laboratories: extended transactions at Hewlett Packard and formal methods at BAE Systems. My future research agenda addresses the challenges of engineering autonomic and on-demand Cloud computing technologies.


I have the following current research interests:


Research Projects


My systems research agenda has been pursued through the following academic and industrial projects:


1) Cloud Computing

Research Fellow, University of Melbourne: October 2009 - Current


In order to meet the increasing data storage and computing demands, applications are beginning to rely on cloud computing services: large-scale out-sourced data and compute resources made available to end-users and applications. Clouds are primarily optimised for very specific functionality, which means that data-intensive applications must increasingly interact with and utilise multiple specialised stand-alone clouds. My research at the University of Melbourne focuses on infrastructure to assist engineers in building and optimising applications which are constructed from multiple interoperating cloud services. This work is partnered with Microsoft.



2) Scalable Infrastructure for the National Health Service

Research Associate, University of Oxford: August 2008 - August 2009


At the University of Oxford I worked on an EPSRC/MRC grand challenge health-informatics project, partnered with the National Health Service (NHS), the University of Edinburgh and Imperial College, London. The premise: the sheer quantity and complexity of medical information, even within a single speciality, is beyond the power of one person to comprehend. Moreover, the specific pieces of information most relevant to a particular clinical decision will typically be scattered over a wide range of databases, applications, journals and written notes. Centralisation of clinical knowledge is rapidly becoming less practical as the volume of data increases. My research at Oxford focused on designing, building and verifying open distributed systems to address the data explosion in medical informatics.


Key publications:


3) Orchestrating Data-Centric Workflows

Research Associate, National e-Science Centre, University of Edinburgh: March 2007 - August 2008


When orchestrating data-centric workflows, the centralised servers common to standard workflow systems (e.g. implementations of BPEL) can become a performance bottleneck. My research at the National e-Science Centre delivered the Circulate architecture, a light-weight hybrid workflow model which maintains the robustness and simplicity of centralised orchestration, but facilitates decentralised choreography by allowing services to exchange data directly with one another. An open-source, Web services based implementation serves as a live deployment platform. Performance evaluation conducted on the PlanetLab framework concludes that a substantial reduction in communication overhead results in a 2-4 fold performance benefit across common workflow patterns. As the complexity of a workflow grows (i.e. workflow patterns are used in combination with one another), the advantage of using the hybrid architecture increases.
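The source of the reduced communication overhead can be illustrated with a toy model. The sketch below is not the Circulate implementation: the names `ENGINE`, `centralised_hops` and `hybrid_hops` are hypothetical, it considers only a simple pipeline pattern, and it counts data transfers rather than measuring performance. It assumes that under centralised orchestration every intermediate result returns to the engine, whereas under the hybrid model the engine still drives the workflow but each service passes its output directly to the next.

```python
# Toy model only: counts data transfers for a pipeline workflow.
# All names here are illustrative assumptions, not part of Circulate.

ENGINE = "engine"


def centralised_hops(services):
    """Every service gets its input from the engine and returns its output to it."""
    hops = []
    for service in services:
        hops.append((ENGINE, service))   # engine ships input data to the service
        hops.append((service, ENGINE))   # service ships its result back
    return hops


def hybrid_hops(services):
    """The engine supplies the first input; services then forward data directly."""
    path = [ENGINE, *services, ENGINE]   # final result returns to the engine
    return list(zip(path, path[1:]))


if __name__ == "__main__":
    pipeline = ["s1", "s2", "s3"]
    print(len(centralised_hops(pipeline)), "transfers when centralised")
    print(len(hybrid_hops(pipeline)), "transfers with direct exchange")
```

In this model a pipeline of n services needs 2n transfers when centralised and n + 1 with direct exchange, so the gap widens as the pipeline lengthens. The model says nothing about latency or bandwidth, which is what the PlanetLab evaluation measured.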


This post-doctoral research was conducted at the National e-Science Centre (NeSC) in collaboration with Jon Weissman from the University of Minnesota, and Malcolm Atkinson and Jano van Hemert from the University of Edinburgh.


Key publications:


4) Web Service Choreography

School of Informatics, University of Edinburgh


My doctoral research at the University of Edinburgh focused on the design and implementation of a decentralised service choreography language with formal semantics. My thesis made four key contributions to knowledge. The first is the directly executable MultiAgent Service Choreography language (MASC). Secondly, MASC allows peers to coordinate in open systems: specifications of choreography are disseminated at runtime, which allows peers to coordinate without any hard-coded or prior knowledge of the interaction pattern. Furthermore, services do not have to be altered prior to enactment (as is the case with current solutions, e.g. the W3C Web Services Choreography Description Language, WS-CDL); this offers greater flexibility and is a less invasive solution. Thirdly, sections of the choreography specification can be spliced in at runtime (e.g. the concrete services to call), allowing peers to operate in scenarios where it is not possible to explicitly define the pattern of interaction at design-time. Finally, as a practical contribution, MASC is implemented as a Java-based Web service choreography toolkit.


My doctoral work was primarily supervised by David Robertson, Christopher Walton and Austin Tate from the Centre for Intelligent Systems and their Applications (CISA) in the School of Informatics, University of Edinburgh.


Key publications:



Grid Computing: Supporting Large-Scale Science


I have been employed on a number of research projects in the UK e-Science programme, focusing on the application of distributed computing techniques to solve problems in automating large-scale science. I attended the International Summer School on Grid Computing 2007.


5) Gene Expression Studies in Early Human Development


Human embryonic material is extremely scarce world-wide and currently only two laboratories are licensed in the UK to collect and analyse human embryonic tissue. Dissemination of the results to the wider community is therefore vital for the progression of research. Developmental Gene Expression Map (DGEMap) was an EU-funded Design Study, which accelerated an integrated European approach to gene expression in early human development. Working closely with Susan Lindsay from the Institute of Human Genetics at Newcastle University and Richard Baldock from the MRC Human Genetics Unit at the University of Edinburgh, I took responsibility for the Informatics research and development which supports the integration of distributed gene expression data.


Key publication:


6) Virtual Observatories: Automating Astronomy


The concepts of workflow have recently been applied to automating large-scale science (or e-Science), coining the term scientific workflow. Business workflow tools look more like traditional programming languages and are, in general, pitched at the wrong level of abstraction for scientists to take advantage of. During my doctoral research I worked closely with Robert Mann, an astronomer from the Royal Observatory, on a number of Virtual Observatory projects: AstroGrid and the Large Synoptic Survey Telescope (LSST). As a result, my doctoral thesis evolved a number of concrete use-case scenarios and identified a set of core requirements for scientific workflow systems. My open-source Multi Agent Service Choreography (MASC) framework was successfully applied directly to AstroGrid and LSST in order to demonstrate proof of concept and evolve the software.


Key publications:


Industrial Research and Engineering


Complementing my academic research, I have worked at two industrial research laboratories, focusing on engineering industrial-strength systems which have immediate application.


7) Distributed Extended Transactions: Hewlett Packard


Traditional ACID (Atomicity, Consistency, Isolation, Durability) transactions work well in tightly coupled homogeneous environments, where they are typically short lived and message delivery can be guaranteed. However, these tightly coupled atomic semantics do not fit in well with the architecture of the Internet, where message delivery cannot be guaranteed and transactions may be long running. The OASIS Business Transaction Protocol (BTP) is an extended transaction model that allows coordination of resources which are exposed by multiple autonomous organisations. This model relaxes the traditional ACID properties and removes the exclusive locking of a resource by a transaction. A research project conducted at Hewlett Packard Arjuna resulted in an implementation of the protocol and demonstrated that the CORBA Activity Service was a sufficiently generic framework to support this complex extended transaction model. This work was in collaboration with Mark Little, now CTO at JBoss.
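The idea of coordinating autonomous resources without exclusive locks can be sketched in a few lines. The code below is a simplified illustration, not the BTP message set and not the Hewlett Packard Arjuna implementation: the `Participant` class, the `run_atom` function and the stock-reservation scenario are all hypothetical. It assumes that each participant answers a prepare request by holding a provisional reservation, which is later confirmed or cancelled, so that several transactions can be prepared against the same resource at once.

```python
# Simplified sketch of a prepare / confirm / cancel exchange.
# Names and scenario are illustrative assumptions, not the BTP specification.


class Participant:
    """An autonomous resource that reserves provisionally instead of locking."""

    def __init__(self, name, stock):
        self.name = name
        self.stock = stock
        self.reserved = {}  # transaction id -> quantity held provisionally

    def prepare(self, tx_id, quantity):
        """Hold the quantity if it is still available; never blocks other callers."""
        available = self.stock - sum(self.reserved.values())
        if quantity > available:
            return False
        self.reserved[tx_id] = quantity
        return True

    def confirm(self, tx_id):
        """Make the provisional reservation permanent."""
        self.stock -= self.reserved.pop(tx_id)

    def cancel(self, tx_id):
        """Release the provisional reservation, if any."""
        self.reserved.pop(tx_id, None)


def run_atom(tx_id, requests):
    """All-or-nothing outcome: confirm everywhere, or cancel everywhere."""
    prepared = []
    for participant, quantity in requests:
        if participant.prepare(tx_id, quantity):
            prepared.append(participant)
        else:
            for p in prepared:
                p.cancel(tx_id)
            return False
    for p in prepared:
        p.confirm(tx_id)
    return True
```

Because a prepared participant only records a reservation, a second transaction can prepare against the same participant while the first is still undecided, which is the property that suits long-running interactions between organisations. A fuller treatment would also need timeouts and failure recovery, which this sketch omits.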


8) Modelling Real-time Systems: BAE Systems


As a joint project between the University of Newcastle and British Aerospace Engineering (BAE Systems), my BSc thesis involved designing and implementing a modelling tool for analysing the real-time formal models used to design aviation software. Implementation involved the use of formal methods (VDM-SL) and Java. This work was supervised by John Fitzgerald at Newcastle University and Stephen Paynter at BAE Systems, Bristol.