Create A New Module
In this tutorial we detail how to extend NeoEMF with a custom backend implementation.
We define a simple mapping on a database named "Paprika", and we show how it can be integrated in NeoEMF and used by end users. Of course, all occurrences of "Paprika" must be replaced by the name of your backend.
Several modules already exist; you can use them as a reference.
The first step before starting to implement your own backend connector is to clone the NeoEMF repository:
git clone -b master --single-branch https://github.com/atlanmod/NeoEMF.git
WARNING:
We recommend using the command above instead of a plain git clone, because the development and gh-pages branches are heavy and not required.
Once your local repository has been cloned, you can create a local branch that will be used later to submit a pull request:
git checkout -b paprika
Now you have everything setup to start implementing your own backend connector!
NeoEMF is designed around a modular architecture based on three main components:
- The core component, containing the main features of the framework. Users typically access NeoEMF through the core API.
- The io component, used to interoperate between the different modules and the external world.
- The data components, containing backend-related implementations, such as dedicated configuration, data mapping strategy, URI builder, etc.
This architecture provides a clear separation between shared code and backend-specific code.
Supported backends are developed as independent Maven projects that extend neoemf-data.
To start our new implementation, we create a new folder paprika
that will contain all the code related to our Paprika-based model mapping.
Then create a new Maven project using the following command, or your IDE Maven environment:
mvn archetype:generate -DgroupId=fr.inria.atlanmod.neoemf -DartifactId=neoemf-data-paprika -DinteractiveMode=false
Then update the created pom.xml
file with the following information:
[...]
<parent>
<groupId>fr.inria.atlanmod.neoemf</groupId>
<artifactId>neoemf-data</artifactId>
<version>1.0.2-SNAPSHOT</version>
</parent>
<artifactId>neoemf-data-paprika</artifactId>
<packaging>bundle</packaging>
<name>NeoEMF Data Paprika</name>
<description>An in-memory backend using Paprika to represent models</description>
<properties>
<!-- A good place to declare dependency versions -->
</properties>
<dependencies>
<!-- Consider using a dependencyManagement section -->
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.felix</groupId>
<artifactId>maven-bundle-plugin</artifactId>
<configuration>
<instructions>
<Bundle-SymbolicName>${project.groupId}.data.paprika</Bundle-SymbolicName>
<Export-Package>
fr.inria.atlanmod.neoemf.data.paprika.*
</Export-Package>
<Require-Bundle>
${project.groupId}.core
</Require-Bundle>
</instructions>
</configuration>
</plugin>
</plugins>
</build>
[...]
This pom.xml specifies that the neoemf-data-paprika project is a sub-project of neoemf-data, inheriting all its dependencies, which include:
- The core component of NeoEMF
- The common component for Atlanmod projects
- The JSR-305 implementation, for common annotations
- The common component of EMF (for URI uses)
- JUnit5 and AssertJ for testing
All the backend implementations have a similar root pom.xml file.
The build
section of the pom.xml
file tells Maven to generate an Eclipse bundle, and sets the generated MANIFEST.MF
information such as the bundle name, the exported packages, and the other bundles that are required to use the generated one.
Every module follows a simple structure to organize its classes.
In a package named fr.inria.atlanmod.neoemf.data.paprika
(or use the base package of your organization), you should have the following file structure:
.
+-- config
| |-- PaprikaConfig.java
+-- util
| |-- PaprikaUri.java
+-- PaprikaBackend.java
+-- PaprikaBackendFactory.java
+-- DefaultPaprikaBackend.java
If you need more packages, feel free to add them.
And don't forget to follow the developer rules.
All of the following steps can be performed in any order: all classes are related to each other. If a class is missing in your project but present in the example, don't panic: it should appear in the next steps.
Every database used in NeoEMF is associated with a dedicated URI scheme. This allows the framework to determine, from a given resource URI, which connector should be used to access the model.
Example: if the provided URI is neo-paprika:/path/to/my/resource/resource.paprika, the framework parses the scheme neo-paprika and associates the provided folder resource.paprika with the PaprikaBackendFactory.
The URI scheme is automatically created according to the name of the BackendFactory
identified by the @FactoryBinding
annotation.
By default, URI schemes are prefixed by neo-
to avoid clashes, followed by the BackendFactory#name()
.
The @FactoryBinding
annotation is mandatory: it is used to bind a UriBuilder
to a BackendFactory
.
This is used by the binding engine to retrieve a BackendFactory
from a URI scheme, and vice-versa.
The code below shows the class PaprikaUri
that extends the core class AbstractUriBuilder
.
The AbstractUriBuilder class defines all methods related to URI creation; you don't need to re-implement them.
It uses supportsFile()
and supportsServer()
to ensure that the created URI
is supported by the BackendFactory
.
IMPORTANT:
Each implementation must have a static method named builder()
, used to create a new instance.
@FactoryBinding(PaprikaBackendFactory.class)
@ParametersAreNonnullByDefault
public class PaprikaUri extends AbstractUriBuilder {
/**
* Constructs a new {@code PaprikaUri}.
*/
protected PaprikaUri() {
}
/**
* Creates a new {@link UriBuilder} with the pre-configured scheme.
*
* @return a new builder
*/
@Nonnull
public static UriBuilder builder() {
return new PaprikaUri();
}
@Override
protected boolean supportsFile() {
// Does your backend use a file-based database? Return true or false accordingly.
}
@Override
protected boolean supportsServer() {
// Does your backend use a distributed database, located by a URL? Return true or false accordingly.
}
}
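As an illustration, building a URI for a local database could then look like the line below. This is a minimal sketch: the path is purely illustrative, and we assume here that fromFile() accepts a plain file path (only the builder() entry point and the neo-paprika scheme come from the class above).
// Produces a URI with the "neo-paprika" scheme, pointing to an illustrative local path
URI uri = PaprikaUri.builder().fromFile("/path/to/my/resource/resource.paprika");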
The configuration defines the database behavior, the data mapping strategy, etc. Everything that can be customized by the user must be declared there. Because this configuration can be saved in a file to keep the state of the backend across executions, it can also contain internal parameters.
All required options must be initialized in the constructor.
A Config subclass should respect the builder pattern, and each method has to return the current configuration.
The protected me() method can be used for this: it returns the current configuration with the right type, and avoids a class cast in abstractions and sub-implementations.
As for the UriBuilder, a Config implementation should be annotated with @FactoryBinding.
This allows retrieving it from the name of a BackendFactory by using reflection.
Note that the data mapping strategy is defined by giving the name of its class (see PaprikaConfig#withDefault()); this allows saving the mapping in a configuration file and retrieving it in future executions. We will see later how to process it.
All methods related to mapping strategies must be prefixed by with.
If your module contains only a single mapping, this method can be protected and called from the constructor.
Tip:
You can use the createKey()
method to create and assemble a composed key.
IMPORTANT:
Each implementation must have a static method named newConfig()
, used to create a new instance.
@FactoryBinding(PaprikaBackendFactory.class)
@ParametersAreNonnullByDefault
public class PaprikaConfig extends BaseConfig<PaprikaConfig> {
/**
* Constructs a new {@code PaprikaConfig}.
*/
protected PaprikaConfig() {
// Initialize the default values of this configuration
withDefault();
}
/**
* Constructs a new {@code PaprikaConfig} instance with default settings.
*
* @return a new configuration
*/
@Nonnull
public static PaprikaConfig newConfig() {
return new PaprikaConfig();
}
/**
* Defines the default mapping to use for the created {@link PaprikaBackend}.
*
* @return this configuration (for chaining)
*/
@Nonnull
protected PaprikaConfig withDefault() {
// Because the mapping is a read-only option, always use `#setMappingWithCheck(***, false)` to avoid conflicts
return setMappingWithCheck("fr.inria.atlanmod.neoemf.data.paprika.DefaultPaprikaBackend", false);
}
// Add other mappings (withLists, withMaps,...)
// [...]
@Nonnull
@Override
protected Predicate<String> isPersistentKey() {
// Add some keys that have to be saved in a configuration file
return super.isPersistentKey().or(s -> /* Check the configuration key */);
}
@Nonnull
@Override
protected Predicate<String> isReadOnlyKey() {
// Add some keys that cannot be changed after their first definition
return super.isReadOnlyKey().or(s -> /* Check the configuration key */);
}
// Add custom options (addNativeOption,...)
// Several methods are available in `BaseConfig` to easily add options
// [...]
}
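If your module provides a second mapping, the corresponding option method could follow the same pattern as withDefault(). The sketch below is hypothetical: ListsPaprikaBackend does not exist and stands for whatever alternative implementation your module actually offers.
/**
 * Defines a list-based mapping for the created {@link PaprikaBackend}.
 * NOTE: `ListsPaprikaBackend` is a hypothetical implementation, used only as an example.
 *
 * @return this configuration (for chaining)
 */
@Nonnull
public PaprikaConfig withLists() {
    // As for `withDefault()`, the mapping is a read-only option
    return setMappingWithCheck("fr.inria.atlanmod.neoemf.data.paprika.ListsPaprikaBackend", false);
}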
Then comes the more interesting part: the data mapping!
NeoEMF internally translates EMF methods into NeoEMF operations, represented as atomic queries that distinguish between attributes and references between elements, and that use key/value representations instead of complex objects. This allows defining a common behavior for all modules, and eases the integration with databases.
When you create your mapping strategy, you will encounter several beans:
Class | Description | Key | Value |
---|---|---|---|
`Id` | Represents the identifier of an element, with a 64-bit representation (long or hexadecimal string). Used as key for operations on containers and meta-classes, and as value for all operations related to references between elements. | X | X |
`SingleFeatureBean` | Represents a single-valued feature (attribute or reference) of an element. It's a composed bean, with an `Id` and the identifier of the feature as an `int`. Used as key for operations on single-valued features, and as value for operations related to containers. | X | X |
`ManyFeatureBean` | Represents a multi-valued feature of an element. It's a composed bean similar to `SingleFeatureBean`, with the position of the feature. Always used as key for operations on multi-valued features. | X | |
`ClassBean` | Represents a meta-class of an element. It contains some methods to retrieve information about the real instance. It's a composed bean, with the name of the meta-class and its associated URI. Always used as value for operations on meta-classes. | | X |
To manipulate these beans, several classes are provided in the core component:
- `**.data.bean.serializer.BeanSerializerFactory`: a factory that creates optimized `Serializer`s for each bean, if you need to use a binary representation of beans
- `**.core.IdConverters`: a static class that creates `Converter`s to transform an `Id` into its raw representation (`Id` to `long` for example, and vice-versa); see the short sketch after this list
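For instance, given an existing Id instance named id, converting it to its raw long representation and back could look like the following sketch (we assume here that the Converter exposes convert() and revert() for the two directions, as used later in AbstractPaprikaBackend):
Converter<Id, Long> converter = IdConverters.withLong();
Long raw = converter.convert(id);   // Id -> long, e.g. to build a database key
Id sameId = converter.revert(raw);  // long -> Id, e.g. when reading a stored reference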
The data mapping strategies are used to translate NeoEMF operations into database operations.
They contain a set of queries to access, store and manipulate a model.
These operations take the form of atomic methods, such as valueOf, valueFor, allReferencesOf, etc.
All these methods are referenced in several interfaces, where each one has its own responsibilities:
Class | Responsibility | Multiplicity |
---|---|---|
`ContainerMapper` | container of elements | one-to-one |
`ClassMapper` | meta-class of elements | many-to-one |
`ValueMapper` | single-valued attributes of elements | one-to-one |
`ReferenceMapper` | single-valued references between elements | one-to-one |
`ManyValueMapper` | multi-valued attributes of elements | one-to-many |
`ManyReferenceMapper` | multi-valued references between elements | one-to-many |
To ease the integration, they are grouped into a single interface: DataMapper, implemented by the Backend interface that you will use.
NeoEMF allows using several data mapping strategies for the same module. The different mapping strategies don't have to be compatible with each other: the chosen strategy is saved in the configuration file next to the database (only for file-based backends), so mapping compatibility is ensured across executions, and the user will not be able to use a mapping different from the one previously defined.
NOTE: The values used with ValueMapper
and ManyValueMapper
are only primitives (int
, String
, boolean
,...). Complex objects are converted before any call to these classes.
The only exception concerns arrays and lists if you want to use a predefined mapping strategy (see next section). Make sure your database supports them before using them.
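To make this key/value translation more concrete, here is a minimal, self-contained sketch of how the single-valued attribute operations could be backed by a plain in-memory map. It deliberately does not implement the real ValueMapper interface: the method names only mirror the operations described above, and the SingleFeatureBean key is replaced by a simple string key, so treat it as an illustration of the idea rather than of the actual API.
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative key/value store: a single-valued feature is identified by
// "<elementId>:<featureId>" and mapped to its (primitive) value.
class InMemoryValueStore {

    private final Map<String, Object> values = new HashMap<>();

    // Mirrors `valueOf(key)`: retrieves the current value of a single-valued attribute
    Optional<Object> valueOf(long elementId, int featureId) {
        return Optional.ofNullable(values.get(key(elementId, featureId)));
    }

    // Mirrors `valueFor(key, value)`: stores a new value and returns the previous one, if any
    Optional<Object> valueFor(long elementId, int featureId, Object value) {
        return Optional.ofNullable(values.put(key(elementId, featureId), value));
    }

    // Mirrors `removeValue(key)`: deletes the attribute
    void removeValue(long elementId, int featureId) {
        values.remove(key(elementId, featureId));
    }

    private String key(long elementId, int featureId) {
        return elementId + ":" + featureId;
    }
}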
Some common data mapping strategies can be used to simplify your development, but they are optional.
The first set corresponds to reference redirection, where references are processed as values after a conversion to/from the desired type (Id ↔ <T>) with a Converter.
This is useful if you don't plan to use a different mapping for attributes and references.
The IdConverters
class in the core component might be useful in this case.
Class | Description | Redirection |
---|---|---|
`ReferenceAs<T>` | Redirects all calls related to single-valued references | `ReferenceMapper` → `ValueMapper` |
`ManyReferenceAs<T>` | Similar to `ReferenceAs<T>`, but for multi-valued references | `ManyReferenceMapper` → `ManyValueMapper` |
`AllReferenceAs<T>` | A combination of `ReferenceAs<T>` and `ManyReferenceAs<T>` | -- |
`ManyReferenceMergedAs<T>` | Merges a set of multi-valued references into a single entity of type `<T>`, then processes the result as a single-valued attribute. For example, you can use a string representation of a `List<Id>` | `ManyReferenceMapper` → `ValueMapper` |
The second set corresponds to multi-valued attributes redirection.
Class | Description |
---|---|
`ManyValueWithIndices` | Each multi-valued attribute is processed separately. It's a shortcut that uses `valueOf()` and a variant of `valueFor()` directly to manipulate the database, and avoids implementing `addValue()`, `removeValue()`, etc. |
`ManyValueWithArrays` | Groups a set of multi-valued attributes into an array, then processes the result as a single-valued attribute. The position of a multi-valued feature is defined by its position in the array. |
`ManyValueWithLists` | Similar to `ManyValueWithArrays`, but using `List`s instead of arrays. |
Their behavior is not definitive, and can be re-implemented to fit your needs.
They are provided as interfaces, and can be combined with others as long as they don't conflict (using both ManyValueWithLists and ManyValueWithArrays together is never a good idea).
First, create the base interface of your module.
For now, it only contains some methods defining the nature of your backend, but it could contain more methods in the future.
@ParametersAreNonnullByDefault
public interface PaprikaBackend extends Backend {
@Override
default boolean isPersistent() {
// Is your backend persistent? Return true or false accordingly.
}
@Override
default boolean isDistributed() {
// Is your backend distributed? Return true or false accordingly.
}
}
Then, create the base class of your module.
Its goal is to provide a base for all data mapping strategies related to your module, so it should contain methods for database initialization and native operations.
It can also contain methods common to all your mappings, such as save(), close(), copyTo(), or the mapping of containers and meta-classes.
@ParametersAreNonnullByDefault
abstract class AbstractPaprikaBackend extends AbstractBackend implements PaprikaBackend, AllReferenceAs<Long> {
/**
* Constructs a new {@code AbstractPaprikaBackend}.
*/
protected AbstractPaprikaBackend() {
// Initialize the database
}
@Override
public void save() {
// Save the last modifications
}
@Override
protected void innerClose() {
// Cleanly close the database and release all associated resources
}
@Override
protected void innerCopyTo(DataMapper target) {
// This method is called only when `this.getClass() == target.getClass()`
AbstractPaprikaBackend to = AbstractPaprikaBackend.class.cast(target);
// Copy the database of this backend to the database of the target
}
@Nonnull
@Override
public Converter<Id, Long> referenceConverter() {
return IdConverters.withLong();
}
}
Finally, create the data mapping by implementing all inherited methods.
@ParametersAreNonnullByDefault
class DefaultPaprikaBackend extends AbstractPaprikaBackend implements ManyValueWithIndices {
/**
* Constructs a new {@code DefaultPaprikaBackend}.
*/
protected DefaultPaprikaBackend() {
super();
// Initialize more...
}
// Implements all methods
// [...]
}
To ease your development, you can find utility classes and methods in the fr.inria.atlanmod.commons
module:
- Some preconditions, based on Guava
- An asynchronous logger
- An efficient in-memory cache, on top of a Caffeine cache
- Some lazy objects that load their value on demand
- Efficient hashers, on top of Zero Allocation Hashing, including the Murmur3, xxHash, CityHash and FarmHash algorithms
- Several methods related to concurrency, collections, arrays, streams, primitives, etc.
All backends of a module are created in a single place: the BackendFactory. It's the core of a module.
It's a simple class that processes the URI built with the PaprikaUri and the PaprikaConfig -- given as parameters when using Resource#load() or Resource#save() -- in order to create a PaprikaBackend.
The URI is used to locate the database, while the configuration is used to define the expected behavior of the backend.
As a reminder, the URI scheme is built from the factory's name, so the name of a BackendFactory must be unique.
See the reserved schemes to check that the name of your factory is not already used.
The code below shows a common usage of a BackendFactory
, with URI/configuration analysis.
To create the Backend
instances, we use reflection: the mapping is defined and stored in the configuration as the fully-qualified name of the Backend
class.
To instantiate it, you have to use the createMapper() method: the arguments correspond to the mapping defined in the PaprikaConfig, followed by the constructor parameters.
These depend on your implementation.
IMPORTANT:
Each implementation must have a static method named getInstance()
, used to create a new instance.
@ParametersAreNonnullByDefault
public final class PaprikaBackendFactory extends AbstractBackendFactory {
/**
* The literal description of the factory.
*/
public static final String NAME = "paprika";
/**
* Constructs a new {@code PaprikaBackendFactory}.
*/
protected PaprikaBackendFactory() {
}
/**
* Returns the instance of this class.
*
* @return the instance of this class
*/
@Nonnull
public static BackendFactory getInstance() {
return Holder.INSTANCE;
}
@Override
public String name() {
return NAME;
}
@Override
public Backend createBackend(URI uri, ImmutableConfig baseConfig) {
// The `uri` can be used to locate the database, it corresponds to the `Resource#getUri()`
// The `baseConfig` contains all options defined when using `Resource#load()` or `Resource#save()`
PaprikaBackend backend;
try {
// Converts the `uri` to a file-based path
Path baseDirectory = Paths.get(uri.toFileString());
// Loads an existing configuration, merge and check conflicts
ImmutableConfig mergedConfig = Config.load(baseDirectory)
.orElseGet(PaprikaConfig::newConfig)
.merge(baseConfig);
// Retrieve the mapping defined in the configuration
String mapping = baseConfig.getMapping();
// Has the read-only mode been configured?
boolean isReadOnly = mergedConfig.isReadOnly();
// Initialize the database
// [...]
// Create the mapping on top of the created database with its arguments
backend = createMapper(mapping, arg1, arg2,...);
// Save the configuration to ensure the compatibility in future executions
// It is also used when loading a resource from EMF
mergedConfig.save(baseDirectory);
}
catch (Exception e) {
throw new InvalidBackendException("Unable to create the Paprika database", e);
}
return backend;
}
/**
* The initialization-on-demand holder of the singleton of this class.
*/
@Static
private static final class Holder {
/**
* The instance of the outer class.
*/
static final BackendFactory INSTANCE = new PaprikaBackendFactory();
}
}
Now, you can test your new backend by creating a NeoEMF resource using the classes we defined before.
We create a new default PaprikaConfig
, without any additional parameter, then locate the resource by using the PaprikaUri
to identify our PaprikaBackendFactory
.
You can then test your implementation by adding elements, saving, loading, traversing the resource, or whatever you want.
ImmutableConfig config = PaprikaConfig.newConfig();
URI uri = PaprikaUri.builder().fromFile(***);
ResourceSet resourceSet = new ResourceSetImpl();
Resource resource = resourceSet.createResource(uri);
// Only for an existing resource
//resource.load(config.asMap());
//resource.getContents()
// Do something on the resource
// [...]
resource.save(config.asMap());
resource.unload();
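If you want the saved resource to actually contain something, add an EObject before calling save(). The snippet below uses a plain Ecore EPackage purely for illustration; any EObject from your own metamodel works the same way.
// Illustrative content only: replace with instances of your own metamodel
EPackage p = EcoreFactory.eINSTANCE.createEPackage();
p.setName("sample");
resource.getContents().add(p);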
NeoEMF comes with a set of unit tests and integration tests used to ensure the correct behavior of a backend with or without EMF.
All tests are based on a Context, which is a simple class that defines the behavior of your module.
It includes several methods to initialize (useful when using a distributed database), identify, or create objects related to a module.
A Context can be used for several data mapping strategies, each defined by the Context#config() method.
For example, the method PaprikaContext#getDefault() below returns a context whose configuration creates a DefaultPaprikaBackend. You should have as many similar methods as there are backends in your module.
In tests, create a fr.inria.atlanmod.neoemf.data.paprika.context
package, then add the following class.
@ParametersAreNonnullByDefault
public abstract class PaprikaContext extends AbstractContext {
/**
* Creates a new {@code PaprikaContext} with a mapping with indices.
*
* @return a new context.
*/
@Nonnull
public static Context getDefault() {
return new PaprikaContext() {
@Nonnull
@Override
public Config config() {
return PaprikaConfig.newConfig();
}
};
}
// Add all other mappings as before
// [...]
@Nonnull
@Override
public String name() {
// The display name of your module
// Re-implement it in `PaprikaContext` subclasses if you use several mappings
return "Paprika";
}
@Nonnull
@Override
public BackendFactory factory() {
return PaprikaBackendFactory.getInstance();
}
// Re-implement default methods if necessary
// [...]
}
Then create the following unit tests, by extending the existing ones.
NOTE:
If you have to create some tests that don't inherit from an existing one, use AbstractTest
as base class.
You can also inherit from AbstractUnitTest
if you need a Context
, or AbstractFileBasedTest
if you need a temporary file.
Most test-cases don't require any additional test, but don't hesitate to add some if you wish.
If you want to disable an inherited test, simply override it and annotate it with @Disabled, giving the reason.
The following test-case ensures the creation of a URI with your UriBuilder.
@ParametersAreNonnullByDefault
class PaprikaUriTest extends AbstractUriTest {
@Nonnull
@Override
protected Context context() {
return PaprikaContext.getDefault();
}
}
The following test-case ensures that the correct backend is created from a given Config.
@ParametersAreNonnullByDefault
class PaprikaBackendFactoryTest extends AbstractBackendFactoryTest {
@Nonnull
@Override
protected Context context() {
return PaprikaContext.getDefault();
}
@Nonnull
@Override
protected Stream<Arguments> allMappings() {
return Stream.of(
Arguments.of(PaprikaConfig.newConfig(), DefaultPaprikaBackend.class)
// Add all other backends of your module with their corresponding configuration
// [...]
);
}
}
Then, the most important test: the data management test! The following test-case ensures data integrity when using your module by checking every method at a low level.
IMPORTANT: You have to create as many classes as there are backends in your module.
@ParametersAreNonnullByDefault
class DefaultPaprikaBackendTest extends AbstractDataMapperTest {
@Nonnull
@Override
protected Context context() {
return PaprikaContext.getDefault();
}
}
Finally, the last step: you need to include your module in the integration tests, which are based on EMF resources.
Define the dependencies in the neoemf-tests module, by including your module and its associated test-jar variant.
<dependencies>
<!-- [...] -->
<dependency>
<groupId>fr.inria.atlanmod.neoemf</groupId>
<artifactId>neoemf-data-paprika</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>fr.inria.atlanmod.neoemf</groupId>
<artifactId>neoemf-data-paprika</artifactId>
<version>${project.version}</version>
<type>test-jar</type>
<scope>test</scope>
</dependency>
<!-- [...] -->
</dependencies>
Then, simply add all PaprikaContext
implementations in the fr.inria.atlanmod.neoemf.tests.provider.ContextProvider#allContexts()
method.
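Assuming allContexts() simply aggregates the contexts of every module into a Stream (an assumption; check its actual signature in the neoemf-tests module), the addition could look like the hypothetical sketch below:
// Hypothetical sketch of ContextProvider#allContexts(): one entry per mapping of every module
public static Stream<Context> allContexts() {
    return Stream.of(
            // [...] contexts of the other modules
            PaprikaContext.getDefault()
    );
}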
That's all, your module is ready. Congratulations!
For now, skip this part. You can use the following tips if you're brave, and imitate the existing modules.
-- TODO
All following paths are based on the plugins/eclipse directory:
- Add the Eclipse feature: create a new directory features/fr.inria.atlanmod.neoemf.data.paprika.feature that contains:
  - build.properties: only contains bin.includes=feature.xml
  - feature.xml: the configuration of the Eclipse feature
  - pom.xml: the configuration of the Maven module, built as an eclipse-feature, with its dependencies
- Update features/pom.xml with the previously created Eclipse feature (under modules)
- Update the update-site generation in update:
  - pom.xml: add the dependency of the previously created Eclipse feature
  - category.xml: add a feature in the backend category
- Update examples
For now, skip this part. You can use the following tips if you're brave, and imitate the existing modules.
-- TODO
- Create a new class PaprikaAdapter extending fr.inria.atlanmod.neoemf.benchmarks.adapter.AbstractNeoAdapter (create an inner subclass for each backend)
- Update fr.inria.atlanmod.neoemf.benchmarks.runner.state.RunnerState#ADAPTERS with your module
Once your new implementation is ready and tested, you can submit it as a pull request to include it in the next release of the tool! Integrating new backends into NeoEMF is designed to be easy, and the contributed code will benefit from future release improvements.
If you have any questions or suggestions, don't hesitate to contact us at [email protected]