Don’t generate (everything). Or: Framework beats Generator

Posted on Jan 7, 2016. Updated on Jan 20, 2019

When we apply Model-Driven Software Development (MDSD), we write a generator that produces code from a model. The promise is, among other things, that we can reduce boilerplate code and accelerate development. However, MDSD is not a cure-all and should be applied with sound judgment. In this post I cover some drawbacks of the generator approach, describe anti-patterns and present an alternative to generators: frameworks.

TL;DR

Whether writing your own generator makes sense depends on your use case and domain. There is no universal answer. If you are using your own generator and you are happy with it – perfect! Carry on!
- Especially in cases where domain experts are using a DSL, your own generator can make perfect sense. But when a DSL is used by software developers to model entities, the benefit is questionable.
- Besides, the advantage of a generator depends on how regular and uniform your domain and application are and how much flexibility you need. In my experience, reality is rarely uniform.
Think twice before you start writing your own generator for modeled domain entities.
- Be aware of the huge impact this will have on the build and development infrastructure. Also take the long-term commitment to the model-driven approach and the technology stack into account.
- Analyze the savings you gain from the generated code. Is the loss of flexibility (which you would have if you hand-wrote the code) justified? Can the same reduction of boilerplate code be achieved by using a framework or writing some reusable classes?
- Consider writing your own lightweight framework/library or using powerful existing frameworks as an alternative to generators. Today’s frameworks reduce the added value of a generator.
If you still want to generate, try to keep the generated layer as thin as possible. This simplifies the generator and model. Moreover, you can still benefit from the flexibility of hand-written code.
Especially generating standard GUIs can become fairly painful, because they rarely fulfill all requirements.

Scope of this Post

This post covers only my personal opinion, which is based on the experience I gained in my last projects.
Code generation in general is no problem at all (e.g. using Swagger to generate a client API or QueryDsl to generate QObjects). In these cases you are using existing libraries and don’t write a generator on your own. I focus on model-driven software development (MDSD) here, where you code your own generator.
I exclusively refer to domain models describing the entities and their properties. Here’s an example (adapted from the xtext 15 minutes tutorial):

entity Blog {
  title: String
  many posts: Post
}

entity Post {
  title: String
  content: String
  author: String
}

Let’s assume you want to create a typical web application with a GUI, business logic and database.
There are other use cases where you can benefit from a DSL and your own generator. For example, when you want to enable domain experts (non-programmers) to “program” on their own by using an abstracting DSL (e.g. describing domain rules). The domain experts get actively involved in the development process. In this case you can benefit a lot from MDSD. However, in this post I refer to cases where your DSL is used by software developers to model domain entities.

MDSD has a Huge Impact on Development and Build Infrastructure

In my experience, integrating a generator into the development and build infrastructure can be hard and cumbersome. Sure, it’s a no-brainer for simple projects: just copy and paste from the getting started tutorial. But the reality is rarely simple and full of special cases. The devil’s in the details. I worked with Eclipse xtext for a while, which can be a powerful tool if used in the right place. But I ran into some challenges:

We have to add the generation step into the build. That’s not only a matter of configuration effort, but it can also slow down the build.
We have to align the Eclipse and Maven build for the xtext language, so that they behave the same way and produce the same results. This is sometimes not easy at all.
There is always trouble with the clumsy and slow p2 repositories/update sites (Eclipse PDE). You’ll probably need an artifact repository (like Nexus or Artifactory) serving as a proxy, or you have to copy the remote p2 repositories. Moreover, it takes a lot of trial and error until you have all (transitive) dependencies in the correct version you need.
Tycho (a tool for building Eclipse artifacts) can be a nightmare, especially aligning Maven dependencies with the Eclipse target platform.
You need to maintain a separate project for your DSL, which produces the Maven generator and the Eclipse Plugins. Building and especially releasing this DSL project is different from normal Maven projects and requires extra work.
You need to take care of synchronizing the generator version in the Maven POM files with the one that is installed in the Eclipse IDEs of your developers. This requires a coordinated approach and tools to distribute new plugin versions consistently (like Oomph). Hence, changing and distributing the generator is laborious.
You need to configure your tools for static code analysis (Findbugs, Checkstyle, PMD, SonarQube, JaCoCo) to skip the generated sources.
And all of this stuff needs to run on your workstation and on your CI server (like Jenkins). In my experience, there is always something that doesn’t work on Jenkins out of the box.

All in all, the complexity of your build and development infrastructure increases. You have to introduce a big tool chain to solve problems you wouldn’t have without MDSD. Analyze carefully whether this complexity is justified in your case.

Tied to Tool Chain and Technology Stack

The decision for a model-driven approach is a long-term commitment determining big parts of your tool chain and technology stack. Sure, a decision for a certain framework is also something that can’t be changed easily. But as the model-driven approach also impacts the build and development infrastructure the commitment is higher.

For instance, if you go with xtext you’ll probably have to use Eclipse as your IDE. But the golden times of Eclipse as the only real Java IDE are over. Fortunately, IntelliJ support has been available for some months, which is absolutely great, but it is still not as mature as the Eclipse support. But what about NetBeans users? I know projects that have rejected xtext because they didn’t want to be locked into the Eclipse ecosystem.

Don’t Try to Generate Everything

We can easily run into trouble when we try to generate everything out of our domain model:

Trying to generate as much as we can out of the domain model.

Is it possible to fulfill every requirement with the generated generic DAOs and views? Are my domain and use cases really that uniform? Not in my experience. Just think about specific functionality that goes beyond CRUD, or about performance-optimized queries. The reality is often much more varied and diverse.

Not sufficient. Especially the data-centered universal CRUD GUIs are barely satisfying. Often we want to add special behavior or create views that are optimized for a certain domain workflow or user experience. Those views can’t be generated in a generic way.
Complex code base. The generator and the model become very complex and cumbersome, because we need to code many special cases into them. It will be hard to maintain.
Complex APIs. Some requirements (like access control) can make the APIs (of the DAOs) complex. This might be ok in cases where you need the control, but can be a pain when you don’t need it.
Polluted Model. I experienced that the one-model-for-everything approach leads to a polluted model. I saw cases where Java code (business logic) or optimized SQL queries had been written directly into the model to fulfill certain requirements. This is the point where things get very ugly. It makes tracing and editing functionality very difficult (no compiler check, no tooling, laborious workflow to change the code).

What can we do instead? I have had very good experiences with a thin generated layer, which simplifies the model and the generator. To still keep development fast, I propose investing in frameworks that make it easy to develop functionality (like DAOs or GUIs). Consider the following alternative:

Generate only a small layer and use own libraries to accelerate development of specific DAOs and GUIs.

We only generate the POJOs and the QObjects for QueryDsl. QObjects contain the metadata for each entity table and allow us to write type-safe queries in Java. Moreover, they allow us to easily write generic CRUD DAOs which can be reused by our specific hand-written DAOs. Besides, we invest in a little GUI framework (which is based on our third-party GUI framework of choice; Vaadin for example) and create reusable classes that allow an accelerated development of views while still giving us flexibility, because we have direct access to the underlying third-party framework. Equipped with this small “framework” we can handle upcoming requirements much better.
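
To make the idea of a thin generated layer more tangible, here is a minimal sketch in plain Java. It deliberately does not use QueryDsl: `Property`, `QPost` and `PostQueries` are hypothetical stand-ins that only imitate what generated metadata makes possible, and an in-memory list takes the place of the database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;
import java.util.function.Predicate;

// Generated part 1: a plain POJO for the "Post" entity of the example model.
class Post {
    private final String title;
    private final String author;

    Post(String title, String author) {
        this.title = title;
        this.author = author;
    }

    String getTitle() { return title; }
    String getAuthor() { return author; }
}

// Hypothetical stand-in for generated metadata (NOT the QueryDsl API):
// a typed handle for a single property of an entity.
final class Property<E, V> {
    private final String name;
    private final Function<E, V> getter;

    Property(String name, Function<E, V> getter) {
        this.name = name;
        this.getter = getter;
    }

    String name() { return name; }

    // The compiler rejects a value whose type does not match the property.
    Predicate<E> eq(V value) {
        return entity -> Objects.equals(getter.apply(entity), value);
    }
}

// Generated part 2: one metadata class per entity (hypothetical name).
final class QPost {
    static final Property<Post, String> title = new Property<>("title", Post::getTitle);
    static final Property<Post, String> author = new Property<>("author", Post::getAuthor);

    private QPost() { }
}

// Hand-written part: queries that build on the generated metadata.
final class PostQueries {
    static List<Post> filter(List<Post> posts, Predicate<Post> condition) {
        List<Post> result = new ArrayList<>();
        for (Post post : posts) {
            if (condition.test(post)) {
                result.add(post);
            }
        }
        return result;
    }

    private PostQueries() { }
}
```

A call like `PostQueries.filter(posts, QPost.author.eq("alice"))` is checked by the compiler, while everything beyond the POJO and the metadata stays hand-written.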

Powerful Frameworks and Libraries reduce Added Value of a Generator

The term “framework” subsumes two things for me:

Your internal frameworks and libraries
Third-party frameworks and libraries

One could claim that we only move the complexity from the generator to the framework. Now we have to design and maintain our own framework or learn a (maybe complex) third-party framework. Is this better? Yes, it is, because the framework is just another project dependency that has no impact on the build and development infrastructure (see above).

Internal Frameworks and Libraries

Your internal “framework”: It’s almost never a good idea to write a whole ORM or GUI framework on your own. I mean simpler things: some base classes, injectable classes or an additional layer between your application-specific code and the third-party frameworks can significantly speed up development.

Just think about an abstract DAO class providing basic CRUD methods. We can extend it and add domain-specific methods which perfectly fit our business requirements.
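
A minimal sketch of such a base class could look like this. All names (`AbstractDao`, `ArticleDao`, `findWhere`) are made up for illustration, and an in-memory map stands in for the real database access, so treat it as an outline of the idea rather than a finished persistence layer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical entity, hand-written for this sketch.
class Article {
    final String id;
    final String name;
    final String category;

    Article(String id, String name, String category) {
        this.id = id;
        this.name = name;
        this.category = category;
    }
}

// Reusable base class: the generic CRUD logic is written exactly once.
// The map is an assumption that replaces the real database.
abstract class AbstractDao<E> {
    private final Map<String, E> store = new LinkedHashMap<>();

    // Subclasses tell the base class how to obtain the ID of an entity.
    protected abstract String idOf(E entity);

    public void save(E entity) {
        store.put(idOf(entity), entity);
    }

    public Optional<E> findById(String id) {
        return Optional.ofNullable(store.get(id));
    }

    public List<E> findAll() {
        return new ArrayList<>(store.values());
    }

    public boolean delete(String id) {
        return store.remove(id) != null;
    }

    // Building block for domain-specific finders in subclasses.
    protected List<E> findWhere(Predicate<E> condition) {
        List<E> result = new ArrayList<>();
        for (E entity : store.values()) {
            if (condition.test(entity)) {
                result.add(entity);
            }
        }
        return result;
    }
}

// Specific, hand-written DAO: inherits CRUD and adds a domain-specific method.
class ArticleDao extends AbstractDao<Article> {
    @Override
    protected String idOf(Article article) {
        return article.id;
    }

    public List<Article> findByCategory(String category) {
        return findWhere(article -> article.category.equals(category));
    }
}
```

`ArticleDao` gets `save`, `findById`, `findAll` and `delete` for free and only adds what is specific to articles.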

The same approach can be applied to the GUI. Let’s assume that we are using Vaadin as a GUI framework and many of our views are classical master-details-views. In this case we can create an abstract class MasterDetailsView which already contains a master table and a details field group. We can benefit from this class and still have access to the whole functionality of Vaadin. Moreover, we can create configuration objects allowing us to configure the base functionality provided by the abstract class.

public class View extends MasterDetailsView {

    @Override
    public MyConfiguration createConfiguration(){
        return MyConfiguration.createForEntity(entity)
            .setFilter(...)
            .setSorting(...)
            .allowModification(true)
            .allowDeletion(true)
            .sortDetails(...)
            .property(entity.property1).readOnly(true);
    }

    @Override
    public Component doInit(Component basicContent){
        //add view-specific components...
        return basicContent; //...and return the (extended) content
    }
}

As we still hand-write our Java GUI code, we can easily add view-specific behavior for special cases. This provides us flexibility.
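
How the base class behind such a view could work is sketched below without any Vaadin dependency. `MasterDetailsViewSketch`, `ViewConfiguration` and `ArticleView` are hypothetical names, and plain strings stand in for real GUI components; the point is only the interplay of a configuration object and an overridable hook.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical configuration object with a small fluent API.
final class ViewConfiguration {
    private boolean modificationAllowed;
    private boolean deletionAllowed;

    ViewConfiguration allowModification(boolean allowed) {
        this.modificationAllowed = allowed;
        return this;
    }

    ViewConfiguration allowDeletion(boolean allowed) {
        this.deletionAllowed = allowed;
        return this;
    }

    boolean isModificationAllowed() { return modificationAllowed; }
    boolean isDeletionAllowed() { return deletionAllowed; }
}

// Reusable base class. Strings stand in for real GUI components here.
abstract class MasterDetailsViewSketch {

    // Template method: builds the standard content, then lets the subclass extend it.
    public final List<String> init() {
        ViewConfiguration config = createConfiguration();
        List<String> components = new ArrayList<>();
        components.add("masterTable");
        components.add("detailsForm");
        if (config.isModificationAllowed()) {
            components.add("saveButton");
        }
        if (config.isDeletionAllowed()) {
            components.add("deleteButton");
        }
        return doInit(components);
    }

    protected abstract ViewConfiguration createConfiguration();

    // Hook for view-specific additions; by default the content stays unchanged.
    protected List<String> doInit(List<String> basicContent) {
        return basicContent;
    }
}

// Concrete view: configures the standard behavior and adds one special component.
class ArticleView extends MasterDetailsViewSketch {
    @Override
    protected ViewConfiguration createConfiguration() {
        return new ViewConfiguration().allowModification(true).allowDeletion(false);
    }

    @Override
    protected List<String> doInit(List<String> basicContent) {
        basicContent.add("exportButton");
        return basicContent;
    }
}
```

The standard cases are covered by configuration, and the special cases go into ordinary hand-written code in the subclass.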

Third-Party Frameworks and Libraries

There are really powerful frameworks out there. We need to write less and get more. I’d like to give some examples of improved framework support that reduces the need for generated code.

Data Access Layer

A long time ago, coding the persistence layer was laborious. Code generation was used to generate the data access layer for your entities (typically by doing some object-relational mapping). And even if you already used an ORM framework like Hibernate, you could at least generate the annoying XML files and the POJOs. But annotations have been available for many years now. Just add an @Entity annotation to your hand-written POJO and you can pass it to your JPA/Hibernate EntityManager. That’s all. Hibernate knows how to deal with your object.

@Entity
public class Article {
    @Id
    @GeneratedValue
    private int id;
    private String name;
    //...
}

// usage:
EntityManager entityManager = ...
entityManager.persist(new Article());

Sure, for more sophisticated functionality (relationships, conversion, lazy/eager loading) you need more annotations. But when you code it manually, you have the full JPA functionality to play with (no restrictions) and thanks to annotations there is less boilerplate code. The added value of a generation layer is questionable. Moreover, JPA can also set up a database for you, so generated SQL scripts are often unnecessary.

The same is valid for the non-relational world. For instance, when it comes to MongoDB there are really nice ODMs available. MongoJack provides a nice programming experience and requires only little code. Consider the following minimal POJO for MongoJack:

public class Product {
    private String name;

    public Product(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }
}

//usage:
JacksonDBCollection<Product, String> productCollection = ...
productCollection.insert(product);

With Spring Data you can easily create whole CRUD-Repositories for various databases. For instance, with Spring Data MongoDB you only need an annotated POJO and the interface for your repository. Spring Data creates an implementation for you by analyzing the signature and the method names of your repository interface. See here for details.

import org.springframework.data.annotation.Id;

public class Article {
    @Id
    private String id;
    private String name;
    private String category;
    //...
}

public interface ArticleRepository extends MongoRepository<Article, String> {
    Article findByName(String name);
    List<Article> findByCategory(String category);
}
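
To give a rough feeling for how an implementation can be derived from nothing but an interface, here is a toy sketch based on the JDK’s `java.lang.reflect.Proxy`. It is explicitly not how Spring Data is implemented; `ToyRepositoryFactory`, `Book` and `BookRepository` are hypothetical, only `findBy<Property>` methods with a single argument are supported, and an in-memory list replaces the database.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical entity for this sketch.
class Book {
    String title;
    String genre;

    Book(String title, String genre) {
        this.title = title;
        this.genre = genre;
    }
}

// Only the interface is written by hand; there is no implementation class.
interface BookRepository {
    List<Book> findByGenre(String genre);
    List<Book> findByTitle(String title);
}

// Toy factory (an assumption, not a Spring API): creates a repository
// implementation at runtime by interpreting the method names.
final class ToyRepositoryFactory {

    @SuppressWarnings("unchecked")
    static <R> R create(Class<R> repositoryType, List<?> data) {
        return (R) Proxy.newProxyInstance(
                repositoryType.getClassLoader(),
                new Class<?>[] { repositoryType },
                (proxy, method, args) -> {
                    String name = method.getName();
                    boolean supported = name.startsWith("findBy") && name.length() > 6
                            && args != null && args.length == 1;
                    if (!supported) {
                        throw new UnsupportedOperationException(name);
                    }
                    // "findByGenre" -> "genre"
                    String property = Character.toLowerCase(name.charAt(6)) + name.substring(7);
                    List<Object> result = new ArrayList<>();
                    for (Object entity : data) {
                        Field field = entity.getClass().getDeclaredField(property);
                        field.setAccessible(true);
                        if (args[0].equals(field.get(entity))) {
                            result.add(entity);
                        }
                    }
                    return result;
                });
    }

    private ToyRepositoryFactory() { }
}
```

With `ToyRepositoryFactory.create(BookRepository.class, books)` you get a working `BookRepository` although nobody wrote or generated a class for it. A real framework does far more (query parsing, paging, type conversion), but the principle of replacing generated source code with runtime interpretation is the same.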

Service Layer

Spring Data REST simplifies the creation of HATEOAS-compliant RESTful CRUD services that are backed by a database. You only have to add some annotations to the repository interface above. See here for details.

@RepositoryRestResource(collectionResourceRel = "article", path = "article")
public interface ArticleRepository extends MongoRepository<Article, String> {
    List<Article> findByName(@Param("name") String name);
    List<Article> findByCategory(@Param("category") String category);
}

However, the Spring libraries work best if you fully commit to the Spring stack. Using them in isolation can be cumbersome. Besides, Spring is sometimes magic, but magic can be tempting. ;-)

Other Examples

Dealing with the Java boilerplate (getters, setters, constructor, toString(), equals(), hashCode())
- Languages like Kotlin address the verbosity of Java. With Kotlin’s data classes you don’t have to write the usual boilerplate. Check out my blog post about Kotlin if you want to learn more. The bottom line is that choosing the right language may remove the need for a generator, because there is no boilerplate at all.
- Take a look at Project Lombok. With this dependency you only have to add some annotations to your class. Lombok hooks into the compilation process and adds the usual boilerplate to the bytecode. No need to write your own (source) generator.
- If you’d like to stick to Java: IDEs like IntelliJ or Eclipse generate getters, setters, toString(), equals() and hashCode() for the fields of your class. The boilerplate code still exists, but you don’t have to write it manually, so you definitely don’t need a dedicated generator for it. However, you still have to maintain the methods manually.
Dropwizard and Spring Boot make it very comfortable to create a production-ready (RESTful) service without much boilerplate code. Micro web frameworks like Spark, Jodd, Ninja or Jooby are even more lightweight and require minimal code, but provide less functionality.
Jackson serializes Java objects to JSON and vice versa. You only need a simple POJO with fields, getters and setters. That’s all; even annotations are not always necessary.
Swagger-inflector makes creating a JAX-RS-based server for a given Swagger spec easy, because there is no JAX-RS stub generation anymore. It takes care of the wiring and redirects requests to your business logic at runtime.