Introducing EMF Diff/Merge – chat with Olivier Constant, project lead
With the arrival of Eclipse Kepler this week, our sister publication Eclipse Magazin managed to secure interviews with some of the project leads. Diana Kupfer speaks to Olivier Constant, project lead for the EMF Diff/Merge Eclipse project and asks what it brings to the wider community.
Eclipse Magazin: The first thing that will likely come to mind when hearing “Diff/Merge” for the first time is “version control”. But surely there are other interesting use cases – can you briefly outline a few examples?
Olivier Constant: A diff/merge operation is required every time alternative variants of a model, or a subset of it, have to be reconciled into one. Besides version control and teamwork with models, many other use cases arise in model-based system design. One is related to the evolution and maintenance of large models. Tool support is needed to semi-automatically execute modeling tasks that would otherwise be repetitive and time-consuming. For that purpose, the ability to compare and merge subsets of models, including subsets of the same model, is fundamental.
Another, certainly better-known, use case is incremental model transformation. Consider, for example, a workflow where a system design model is exported to a model for a specialty engineering discipline in order to be completed and evaluated by specialty engineers. Whenever the design model evolves, the specialty model needs to be updated accordingly while preserving local modifications, which implies a merge.
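To make the incremental-transformation scenario concrete, here is a minimal sketch of an update that preserves local modifications. It is not EMF Diff/Merge code: models are reduced to plain dictionaries, and the function name `three_way_update` and all keys and values are hypothetical.

```python
def three_way_update(previous_export, new_export, specialty):
    """Apply design changes to the specialty model, keeping local edits.

    All three arguments are hypothetical flat models: dicts mapping a
    property name to a value. A missing key means "absent".
    """
    result = dict(specialty)
    for key in previous_export.keys() | new_export.keys():
        old = previous_export.get(key)
        new = new_export.get(key)
        if old == new:
            continue  # the design model did not change this entry
        if specialty.get(key) == old:
            # Not modified locally, so the design change can be applied.
            if new is None:
                result.pop(key, None)
            else:
                result[key] = new
        # Otherwise the entry was modified locally and is kept as-is;
        # a real tool would probably report a conflict here.
    return result


previous_export = {"pump.flow": 10, "valve.size": 3}
new_export = {"pump.flow": 12, "valve.size": 4, "sensor.rate": 50}
specialty = {"pump.flow": 10, "valve.size": 5, "valve.mtbf": 9000}

updated = three_way_update(previous_export, new_export, specialty)
print(updated)
```

In this sketch the specialty engineers' change to `valve.size` and their added `valve.mtbf` survive the update, while the untouched `pump.flow` and the new `sensor.rate` follow the design model.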
What are the main challenges in merging models as compared to merging code?
No matter how incorrectly you merge code, it can always be opened afterwards in your editor of choice. This does not hold true for models. While code is only assumed to be a sequence of characters, models are complex data structures (graphs of model elements holding data) which are expected to satisfy certain consistency assumptions. These assumptions are not the same for every model: they are defined by a metamodel and, sometimes, additional consistency rules. Some assumptions are automatically enforced by model infrastructures, but not all. If a model violates an assumption, the usual modeling tool may consider it corrupted and be unable to open it. Preserving consistency while merging models is thus essential.
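The point about consistency can be illustrated with a toy example. The sketch below is an assumption-laden simplification, not EMF: a model is a dictionary mapping each element to its container, and the helper names `naive_merge` and `dangling_containers` are invented for illustration.

```python
# Hypothetical toy model: each element id maps to the id of its container
# (None for the root). This is not EMF; it only illustrates the idea.

def dangling_containers(model):
    """Return the ids of elements whose container is missing from the model."""
    return sorted(
        elem for elem, container in model.items()
        if container is not None and container not in model
    )


def naive_merge(target, source, elements_to_copy):
    """Copy the selected elements from source into target without any checks."""
    merged = dict(target)
    for elem in elements_to_copy:
        merged[elem] = source[elem]
    return merged


source = {"system": None, "engine": "system", "pump": "engine"}
target = {"system": None}

# Copying only "pump" ignores the fact that it lives inside "engine".
merged = naive_merge(target, source, ["pump"])
print(dangling_containers(merged))  # ['pump']
```

The merged result is still a valid dictionary, just as badly merged code is still valid text, but it violates the assumption that every container exists. A consistency-preserving merge would have to bring `engine` along or refuse the operation.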
The Diff/Merge engine was developed by Thales. How did the idea emerge?
Thales, as a large industrial group, is a place where a great variety of model-based processes and related situations are encountered. Over the years, it became clear that merging models, whatever the context or the purpose, was a recurrent need, and that strong guarantees about consistency preservation and flexibility were needed. Beyond prototyping, what really allowed maturation efforts and open-sourcing to be carried out was the French AGeSys collaborative project. In this project we explore, among other things, teamwork with models.
In what kind of industry projects has it been used so far?
EMF Diff/Merge is now embedded into the Thales internal toolsuite for model-based design. It is integrated as a front-end tool for version control, but also as a back-end engine for a variety of other tools and features for which it was a real development booster.
The toolsuite is being used in a wide range of operational projects within the activity domains of Thales: aerospace and transport, defense and security. Given the interactions we have with system design teams, the domain where EMF Diff/Merge is used the most is probably aeronautics. However, like every other Eclipse open-source project, EMF Diff/Merge may be used in domains we are not aware of.
Isn’t the merging requirement already taken care of by other projects such as EMF Compare?
While EMF Compare is an integrated solution, EMF Diff/Merge is a component for building tools. In the EMF Diff/Merge vision, merging is a primitive operation in the field of model manipulation, transformation and evolution.
This is why EMF Diff/Merge is not tied to the way models are persisted and structured (EMF resources and containment trees). EMF Diff/Merge operates on “model scopes” which are arbitrarily-defined sets of model elements. In particular, what it means for a model element to be added to or removed from a scope may vary according to specific use cases. For example, we are working on an infrastructure that supports modeling patterns. In this context, model scopes are subsets of models defined by specific filters.
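A rough sketch of the idea of a filter-defined scope follows. The class `FilteredScope`, the `Element` type, and the sample data are all hypothetical and do not reflect the actual EMF Diff/Merge API; the sketch only shows that a comparison can run on subsets chosen by a predicate rather than on whole files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    kind: str


class FilteredScope:
    """Hypothetical scope: the elements of a model accepted by a predicate."""

    def __init__(self, model, accepts):
        self._model = model
        self._accepts = accepts

    def elements(self):
        return {e for e in self._model if self._accepts(e)}


model_a = {
    Element("Pump", "Component"),
    Element("Valve", "Component"),
    Element("Leak", "Requirement"),
}
model_b = {
    Element("Pump", "Component"),
    Element("Sensor", "Component"),
    Element("Noise", "Requirement"),
}


def only_components(element):
    return element.kind == "Component"


scope_a = FilteredScope(model_a, only_components)
scope_b = FilteredScope(model_b, only_components)

# Requirements are outside both scopes, so they never show up as differences.
only_in_a = sorted(e.name for e in scope_a.elements() - scope_b.elements())
only_in_b = sorted(e.name for e in scope_b.elements() - scope_a.elements())
print(only_in_a, only_in_b)  # ['Valve'] ['Sensor']
```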
Obviously EDM is a great choice for collaborative modeling environments. But how large can such an environment be – is there a limit or an ideal project size?
This is a fundamental issue for all modeling technologies. We are starting to see more and more projects that involve large models which, for memory reasons, cannot be worked on without switching from a 32-bit to a 64-bit Windows OS. As far as merging is concerned, whatever the way models are persisted (files, databases), EMF Diff/Merge can take advantage of model scopes to operate on subsets of models which are as small as possible for the targeted use case, and as large as available memory allows.
So far, the level of abstraction at which EDM operates is a low, strictly technical one. Are there any plans to take the project to a higher level or will you be focusing more on improving the interoperability with higher-level comparison frameworks?
Differences between models can be represented at different levels of abstraction. At a low, technical level, a difference is simply an atomic piece of information which is present in one model and not in the other. At a high level, it may correspond to an editing action made by a user via a tool palette in a diagram, which typically encompasses several low-level differences.
The EMF Diff/Merge engine operates at a low level: that yields a very simple framework for merging, which is a good thing for soundness and consistency. By contrast, in a use case where a user has to understand differences and decide which ones to merge, it is better to expose high-level differences. In EMF Diff/Merge, these two objectives are allocated to different components: the engine for soundness, the default GUI for raising abstraction. Improving the GUI or bridging with higher-level frameworks are two non-exclusive possibilities we may explore.
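The two levels of abstraction can be sketched as follows. This is an illustrative simplification, not the engine's data model: a model is assumed to be a set of (element, feature, value) facts, and grouping by element stands in for whatever a GUI might do to raise abstraction. The function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical models: sets of atomic (element, feature, value) facts.
left = {
    ("Pump", "name", "Pump"),
    ("Pump", "flow", "10"),
    ("Valve", "name", "Valve"),
}
right = {
    ("Pump", "name", "Pump"),
    ("Pump", "flow", "12"),
}


def low_level_diff(a, b):
    """Atomic facts present in one model and not in the other."""
    return a - b, b - a


def group_by_element(facts):
    """A coarser view: collect the atomic differences of each element."""
    grouped = defaultdict(list)
    for element, feature, value in sorted(facts):
        grouped[element].append((feature, value))
    return dict(grouped)


only_left, only_right = low_level_diff(left, right)
print(group_by_element(only_left))
print(group_by_element(only_right))
```

Merging then amounts to adding or removing individual facts, which keeps the core operation simple, while the grouped view is closer to what a user would want to review.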
Speaking of which: what are currently the strongest/most important ties to existing modeling projects at Eclipse?
The core part of EMF Diff/Merge directly depends on EMF, the core Eclipse modeling technology. The default GUI of EMF Diff/Merge integrates with the Compare framework of Eclipse. In addition, default model scopes for GMF, the diagramming technology, are provided. We are currently working on integration into / support for UML/SysML, CDO and the new Sirius project.
What’s on the roadmap for EDM?
In addition to all that I mentioned, we are going to improve the flexibility of the tool. The UI will be extended to support the easy definition of model scopes. The mechanism for matching model elements, which is based on a general notion of identifier that can be mapped to EMF identifiers, qualified names or anything else, will be used for providing alternative matching algorithms.
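As a sketch of what identifier-based matching could look like, the example below pairs elements through a pluggable identifier function. Everything here is assumed for illustration (the `match` function, the `uid` and `qname` fields, the sample data); it is not the project's matching API.

```python
def match(left, right, identifier):
    """Pair elements of two models whose identifiers are equal.

    `identifier` is any function mapping an element to a hashable key,
    for instance a technical id or a qualified name.
    """
    left_by_id = {identifier(e): e for e in left}
    right_by_id = {identifier(e): e for e in right}
    common = sorted(left_by_id.keys() & right_by_id.keys())
    matched = [(left_by_id[k], right_by_id[k]) for k in common]
    unmatched_left = [left_by_id[k] for k in sorted(left_by_id.keys() - right_by_id.keys())]
    unmatched_right = [right_by_id[k] for k in sorted(right_by_id.keys() - left_by_id.keys())]
    return matched, unmatched_left, unmatched_right


left = [
    {"uid": "1", "qname": "sys.Pump"},
    {"uid": "2", "qname": "sys.Valve"},
]
right = [
    {"uid": "1", "qname": "sys.MainPump"},
    {"uid": "3", "qname": "sys.Valve"},
]

by_uid = match(left, right, lambda e: e["uid"])
by_qname = match(left, right, lambda e: e["qname"])

print(by_uid[0])    # element 1 is matched and seen as renamed
print(by_qname[0])  # the two Valve elements are matched despite different uids
```

The same pair of models yields different matches depending on the identifier chosen, which is why making the matching policy replaceable matters.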
In the longer term, there are ideas we would like to explore: improving impact analysis during merge, making the engine reactive to model changes, or better exploiting the notion of delta.