Automating your SQL migrations in your Java (web) Application with ActiveJDBC’s DB Migrator

Regardless of your language or framework: if you write a (web) app that reads and writes data to a SQL database you manage, you’re bound to have changes to that database at some point.

Usually this is tackled by writing SQL migration scripts: SQL files you execute as part of your release procedure. You might pass these SQL files along with a ‘release document’ to a separate (ops) department.

But what if we could automate this? Wouldn’t it be much nicer if your application could detect which SQL files to execute, and then execute them? And when things break, stop?

This blog post covers one of the many ways to do this. I start from a more general process and move to a more concrete Java implementation.

An automated deploy process

In this case I assume the following:

  1. You are not allowed to run any build process on the environment you deploy to. I.e. there is no Maven on your production environment (nor on acceptance or test).
  2. Up until now you have executed SQL migrations manually: SQL snippets that you run with a SQL editor (or on the command line).
  3. You want to automate these steps. (of course you do!)

So let’s make the manual process a bit more formal:

  1. Write your SQL script (you execute only the bits you need, or you write it in such a way that it can safely be executed in full every time)
  2. You stop your (web)app
  3. You open your SQL editor / prompt and execute your SQL script
  4. On failure, you roll back and reboot your app (and figure out what went wrong)
  5. On success you boot your (web)app

And what would an automated SQL migration process look like?

  1. It should use SQL files that can be executed multiple times with the same result (i.e. the 2nd or 3rd run should not result in different behaviour; in other words: idempotent. A sketch follows after this list.)
  2. When your (web) app boots it should check if there are any SQL files left to execute
  3. The SQL files are executed in a specific order (from oldest to newest)
  4. It remembers which migrations have been executed
  5. Upon failure, stop executing migrations and stop booting


Notice the similarities!

Of course, you should have created a backup before doing any migration, though that is out of scope for this post. It usually boils down to:

  • create backup
  • execute release procedure
    • upon failure, report back, restore the backup, and boot the old app.
    • upon success, good for you!

Using ActiveJDBC’s DB Migrator

I have worked on a few projects using Ruby On Rails (abbreviated as RoR), which uses rake as its build tool (kinda like Maven). With that you can easily create new database migrations, execute them, etc. I was charmed by the simplicity.

In the land of Java, where I have mostly used Spring and Hibernate, I found ActiveJDBC (which is part of Javalite), which seems to be inspired by the approach RoR has taken.

ActiveJDBC also comes with a migrator, which reminds me of RoR’s rake db:migrate approach. It is a Maven plugin, so you call mvn db-migrator:<goal> instead (for example mvn db-migrator:new). It is very convenient for creating new migrations and for running them.

Now the running part is the problem. While on your local machine you can run the migrator through Maven, I would not suggest doing that on a production (or any other) environment. So we need a way to run the migrator without Maven.

Fortunately this is easy to fix.

Let’s get into some code.


Setting up ActiveJDBC in your Maven project

If you follow ActiveJDBC’s documentation it is easy to set up within your project. I take that as a starting point here and expand upon it.

In your pom.xml you add the following as a plugin within the plugins section:

<!-- for creating db migrations with maven -->
<plugin>
  <groupId>org.javalite</groupId>
  <artifactId>db-migrator-maven-plugin</artifactId>
  <version>${activejdbc.version}</version>
  <configuration>
    <migrationsPath>src/main/resources/sql/db-migrations</migrationsPath>
    <environments>MY ENVIRONMENT</environments>
  </configuration>
  <dependencies>
    <dependency>
      <groupId>${jdbc.groupId}</groupId>
      <artifactId>${jdbc.artifactId}</artifactId>
      <version>${jdbc.version}</version>
    </dependency>
  </dependencies>
</plugin>

A few details:

jdbc.groupId, jdbc.artifactId and jdbc.version are the coordinates of your JDBC driver, for instance MySQL’s. You probably already have this dependency defined in your project, so take those coordinates and put them here.

For activejdbc.version I currently use 2.0.

One other thing to note is that I have deviated a bit from the default migrationsPath; in my case the SQL files go into src/main/resources/sql/db-migrations.

If you set this up, you can now create a migration. Test it by executing:

mvn db-migrator:new -Dname=<your name for your migration script>

The naming of migration scripts does not really matter, although it is good to have some structure in it. Usually I write them as <action>_<type>_<name of column/table>_extra_info, which results in something like: create_table_Books.

Suppose you took the last example as input:

mvn db-migrator:new -Dname=create_table_Books

It will result in:

/src/main/resources/sql/db-migrations/20180711170817_create_table_Books.sql

As you can see, a timestamp is prepended to the name you have given.

The timestamp is used by ActiveJDBC to determine whether a SQL script still needs to be executed. It keeps a record of the migrations it has already executed and diffs that against the SQL scripts it finds. Any newer scripts are executed in the correct order (oldest to newest). In other words: do not touch the timestamp.

I’d suggest you play a bit with the plugin first, before going to the next step.


Executing the SQL Migrations from your (web) application

Now the real fun begins.

Include db-migrator as a dependency in your project

So not only as a plugin, but also as a dependency. Like so:

<!-- DB Migrations -->
<!-- yes we use it as a plugin but also as a dependency! -->
<dependency>
  <groupId>org.javalite</groupId>
  <artifactId>db-migrator-maven-plugin</artifactId>
  <version>${activejdbc.version}</version>
  <exclusions>
    <exclusion>
      <groupId>org.sonatype.sisu</groupId>
      <artifactId>sisu-guava</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<!-- DB Migrations -->

**NOTE** I exclude sisu-guava because it is an older version than Guava 20.0 and it will cause RuntimeExceptions when using Preconditions. Since I use version 25.x or newer in my own projects, it clashes with the old version. So if you use a newer version of Guava yourself, make sure you exclude the old one like above.

Using it as a dependency makes it reachable from our code, while at the same time we can still use the mvn db-migrator goals alongside it.


Making sure your migrations run before your ORM is loaded

Usually web apps use an ORM (like Hibernate), and it requires the state of the DB to match the objects within the code. So your migrations (which alter the state of the DB) should be executed before the ORM is booted.

So let’s start with that. You do this by adding a listener to your web.xml and placing it at the top of all listeners, making sure it is executed first. If you do not have any listeners yet: in my case they are positioned right after the filters but before the servlet definitions.

<!-- order is important, the database migrator must run before everything else -->
<listener>
  <listener-class>com.fundynamic.webapp.listener.DatabaseMigratorListener</listener-class>
</listener>

Now we have to make sure the class exists. Create the class, and let it implement ServletContextListener like so:

package com.fundynamic.webapp.listener;

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

public class DatabaseMigratorListener implements ServletContextListener {

  // Public constructor is required by servlet spec
  public DatabaseMigratorListener() {
  }

  // -------------------------------------------------------
  // ServletContextListener implementation
  // -------------------------------------------------------
  public void contextInitialized(ServletContextEvent sce) {
  }

  public void contextDestroyed(ServletContextEvent sce) {
      /* This method is invoked when the Servlet Context 
         (the Web application) is undeployed or 
         Application Server shuts down.
      */
  }
}

I’d suggest you put some System.out or logging there and see if it is executed in your application in the correct order before going to the next step.

Executing the migrations

Here we get into the meat of the problem.

Let’s start with the code:

package com.fundynamic.webapp.listener;

import org.javalite.db_migrator.maven.MigrateMojo;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

public class DatabaseMigratorListener implements ServletContextListener {

  private static final Logger log = LoggerFactory.getLogger(DatabaseMigratorListener.class);

  // Public constructor is required by servlet spec
  public DatabaseMigratorListener() {
  }

  // -------------------------------------------------------
  // ServletContextListener implementation
  // -------------------------------------------------------
  public void contextInitialized(ServletContextEvent sce) {
    String configFilePath = sce.getServletContext().getRealPath("/WEB-INF/classes/dbmigrator.properties");
    String migrationsPath = sce.getServletContext().getRealPath("/WEB-INF/classes/sql/db-migrations");

    try {
      MigrateMojo mojo = new MigrateMojo();

      mojo.setConfigFile(configFilePath);
      mojo.setMigrationsPath(migrationsPath);

      // we do not use environments like this, but we process the properties file in our build process with a maven profile
      mojo.setEnvironments("myenv");

      mojo.execute();
    } catch (Exception e) {
      log.error("Exception when executing migrator: ", e);
      System.exit(-1); // stop further execution of the webapp with a non zero error code
    }
  }

  public void contextDestroyed(ServletContextEvent sce) {
      /* This method is invoked when the Servlet Context
         (the Web application) is undeployed or
         Application Server shuts down.
      */
  }
}


Basically we do 2 things:

  1. We point to a dbmigrator.properties file, which provides the properties for ActiveJDBC’s database migrator
  2. We set the directory where the SQL migration files are found (the same folder we used in the plugin configuration)

You can also see I set an environment named myenv. In this case I always use the same environment name, because I use Maven profiles which interpolate the environment-specific values for me. Another way would be to provide an OS environment variable, so you can have one file but with different property prefixes per environment.

Last but not least, upon any exception we do a System.exit(-1) to signify that we should stop whatever we are doing.

Within the dbmigrator.properties file you will need at least the credentials of the user that executes the migrations (I suggest you create a separate account for that in each environment). The file might look something like this:

myenv.driver=${jdbc.driverClassName}
myenv.username=${db.migrator.username}
myenv.password=${db.migrator.password}
myenv.url=${jdbc.url}

Here I also use properties that are filtered by Maven. Do note that the variables all start with myenv, which is the environment configured in the listener. The ${jdbc.driverClassName} and such are defined in my pom.xml.

Making execution optional

There are cases where you do not want to execute the migrations. For example, you want to execute migrations automatically only on specific environments (i.e. up to and including acceptance, but not yet on production). In that case, add this to your web.xml:

<!-- put this here somewhere on top of your web.xml file -->
<context-param>
  <param-name>executeDatabaseMigrationsOnBoot</param-name>
  <param-value>${db.migrator.execute}</param-value>
</context-param>

Again, the param-value is a property filtered by Maven. If you want to know more about that, see the maven-war-plugin filtering options.

Either way, you want to have the context-param so you can read out its value from your listener and do some logic with it.

I use a BoxingUtil class, which is nothing more than a couple of simple helper methods that make parsing a String into a boolean easier. The implementation of canParseAsBoolean is:

public static boolean canParseAsBoolean(String value) {
  return Boolean.TRUE.toString().equalsIgnoreCase(value)
      || Boolean.FALSE.toString().equalsIgnoreCase(value);
}

and parseBoolean does exactly what its name says, except that it also makes sure the value is not null. It is not of much importance here.

Reading the property and acting on it then looks like this:

String value = sce.getServletContext().getInitParameter("executeDatabaseMigrationsOnBoot");
if (!BoxingUtil.canParseAsBoolean(value)) {
  log.error("Cannot parse value given for context-param (in web.xml) 'executeDatabaseMigrationsOnBoot', value is [" + value + "], but expected either [true] or [false]");
  System.exit(-1); // stop further execution of the webapp with a non zero error code
}

boolean executeMigrations = BoxingUtil.parseBoolean(value);

Now you can execute the rest of the listener code only when this value is true.


The listener, complete code example

For completeness’ sake, here is the full listener, with a bit of logging and with the flag that determines whether to execute the migrations.

package com.fundynamic.webapp.listener;

import com.fundynamic.util.BoxingUtil;
import org.javalite.db_migrator.maven.MigrateMojo;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

public class DatabaseMigratorListener implements ServletContextListener {

  private static final Logger log = LoggerFactory.getLogger(DatabaseMigratorListener.class);

  // Public constructor is required by servlet spec
  public DatabaseMigratorListener() {
  }

  // -------------------------------------------------------
  // ServletContextListener implementation
  // -------------------------------------------------------
  public void contextInitialized(ServletContextEvent sce) {
    log.info("--------------------------------------------------------------");
    log.info("--------------------------------------------------------------");
    log.info("---              DATABASE MIGRATOR STARTING                 --");
    log.info("--------------------------------------------------------------");
    log.info("--------------------------------------------------------------");
    String value = sce.getServletContext().getInitParameter("executeDatabaseMigrationsOnBoot");
    if (!BoxingUtil.canParseAsBoolean(value)) {
      log.error("Cannot parse value given for context-param (in web.xml) 'executeDatabaseMigrationsOnBoot', value is [" + value + "], but expected either [true] or [false]");
      System.exit(-1); // stop further execution of the webapp with a non zero error code
    }

    boolean executeMigrations = BoxingUtil.parseBoolean(value);

    log.info("executeMigrations flag is [" + executeMigrations + "]");

    if (executeMigrations) {
      executeMigrations(sce);
    } else {
      log.info("Execute migrations was disabled, skipping executing migrations automatically!");
    }

    log.info("--------------------------------------------------------------");
    log.info("--------------------------------------------------------------");
    log.info("---              DATABASE MIGRATOR FINISHED                 --");
    log.info("--------------------------------------------------------------");
    log.info("--------------------------------------------------------------");
  }

  private void executeMigrations(ServletContextEvent sce) {
    String configFilePath = sce.getServletContext().getRealPath("/WEB-INF/classes/dbmigrator.properties");
    String migrationsPath = sce.getServletContext().getRealPath("/WEB-INF/classes/sql/db-migrations");

    try {
      MigrateMojo mojo = new MigrateMojo();

      mojo.setConfigFile(configFilePath);
      mojo.setMigrationsPath(migrationsPath);

      // we do not use environments like this, but we process the properties file in our build process with a maven profile
      mojo.setEnvironments("myenv");

      mojo.execute();
    } catch (Exception e) {
      log.error("Exception when executing migrator: ", e);
      System.exit(-1); // stop further execution of the webapp with a non zero error code
    }
  }

  public void contextDestroyed(ServletContextEvent sce) {
      /* This method is invoked when the Servlet Context
         (the Web application) is undeployed or
         Application Server shuts down.
      */
  }
}


What if I want to roll back?

So there are 2 scenarios for this.

  1. you’re releasing, and something went wrong.
  2. your release went fine, but you want to revert some changes in the next release

In scenario #1, roll back to the backup you created before executing the release and its migrations.

In scenario #2 your release went fine, so you should create a new migration that rolls back the changes. There is no reverting of SQL scripts, just a continuation (roll-forward).

Conclusion

Automating your SQL migrations is pretty straightforward. If you write idempotent SQL scripts and configure your (web) app to execute them automatically with ActiveJDBC’s db-migrator, you get convenience for free. With this you are a step closer to automating your deploy process, which is a very powerful tool for both developers and non-developers.

You can use either Maven profiles or ActiveJDBC’s environments to differentiate properties. Which one you use is up to you.

If you found this helpful or have any questions, let me know.

My experience with (un)certainty about estimates in relation to technical debt

Not too long ago, Martin Fowler pointed out a nice blog post by Jay Fields. In it, Jay Fields refers to a talk he gave about accidental complexity and essential complexity and how these impact your estimates. He found that not all developers consider the accidental complexity and therefore give lower estimates.

I found this a very interesting thought. It got me thinking about how I estimate and how far off I am. I found that, especially with larger solutions, I am underestimating most of the time. Even with more complex things, and after adding some ‘unforeseen complexity percentage’, I am still underestimating most of the time. However, I have also had better experiences on other projects. Especially on the latest project I am working on, the estimates of fixes and rework are not as far off. How is this possible?

I find myself labeling this phenomenon as “lack of overview”. If you read the definition of Accidental Complexity, it is described as “…accidental complexity is caused by the approach chosen to solve the problem.” I believe this ‘approach chosen to solve the problem’ is the design of the code. This is different from Essential Complexity, which I believe is much like Cyclomatic Complexity.

I made mistakes with my estimates, even when I knew the code well. Often it was due to a dependency that ‘got in the way’, or worse, the lack of dependencies: all functionality was in one class! Adding similar behaviour required me to duplicate code. I consider this a bad practice, so I had to extract code from the other class. I was untangling the code. Whenever I had to untangle that code (i.e. separate concerns), I had a hard time doing so, because untangling a tangled (tightly coupled) piece of code forced me to untangle other pieces of code as well. I had to stop somewhere. Like someone once said to me: the devil is in the details. (This is one of the reasons I encourage my co-developers to talk to interfaces, and not implementations.)

But why are estimates off anyway? Is it because of (a lack of) experience with the code? Even with code I had worked with for years I still made bad estimates, and I could not find a way to make them better. The newer project was way easier for me to estimate, and I already knew why it was going better:

My mental model of the code matched the actual code better. Was it because I had worked on it lately and knew exactly how it worked in detail? No, not at all! The technical debt is much lower on this project. One of the principles that played a huge role was the Single Responsibility Principle (PDF). When I had to make a change, it was often in one place. When I had to add code, I could easily move code out of the class and separate responsibilities. The code was less tangled and less tightly coupled.

This phenomenon of untangling code, separating concerns and having a hard time maintaining code is clearly a sign of paying serious interest on technical debt. And I clearly see that as a result of a ‘chosen approach to solve the problem’.

Therefore I believe technical debt is linked to essential and accidental complexity, and even more (what about readability?). Accidental Complexity is something that is very hard to grasp. I think this ‘uncertainty’ needs to be clarified and added to each initial estimate in order to get a ‘more realistic’ estimate.

I would recommend estimating while looking at the code itself, rather than using just your mental model of the code.
Finally, repaying the interest on the technical debt should be prioritized in order to be able to maintain the system, and to prevent ending up with an angry customer who gets ever fewer features that take ever more time to make.

A Software Quality Model (Part II) – Translating customer language into metrics, scoring quality

In my previous post I explained the context of my thesis and the various software quality models that were evaluated.

For my thesis I have extended the software quality model of Willmer. Although it is not an exact implementation of that model, it is inspired by it, and the influence of the customer is incorporated into it. The goal of this model is to translate customer desires into metrics, in order to calculate the total quality.

When talking about software quality, it is either very concrete (I want a red car!) or very abstract (it has to be reliable!). Customers tend to describe their ‘experiences of quality’ in sentences. This is the first step of the model: try to get a few (eight at most) of these sentences. They must be distinctive. (Don’t have four sentences preach about security…)

Translating these sentences into concrete, measurable ‘things’ for developers is another story. But before doing that, ask your customer what the relative importance is of all the sentences you have just written down. Imagine the situation where you have to go live, but there is an issue that needs to be tackled: one blocking issue for each aspect (sentence). If you could pick only one, which would you pick? Would you tackle the first sentence, the second, etc.?

Of course, your customer will tell you in his own domain language what the most important thing is. Try to map that onto these sentences (and confirm the mapping). Score them and try to get an ‘order of importance’. After you have done that, you have reached your first important goal: you know the relative importance of each quality sentence (aspect from now on).

So, what now? The next step is to map these aspects/sentences to Software Quality Attributes (also known as non-functional requirements). You either need a Software Engineer to do that or, even better, try to do it with your team. Before mapping these, first make a selection of the Quality Attributes that are most relevant to you; for instance, use Boehm’s quality tree or ISO 9126 as a reference. Within my thesis I used 9 Quality Attributes; some Quality Attributes are ‘sub-quality attributes’ of others. Examples of Quality Attributes: Understandability, Reliability, Security, Availability, Complexity, etc.

The result of this mapping is that you get several quality attributes for each aspect. Not all quality attributes are applicable to an aspect, so try to figure out how applicable each quality attribute is to an aspect. Do this by asking your team members, and for each member that selects a quality attribute, score it. This way you can calculate, in relative terms, how your team thinks the mapping should be. This is important, because the eventual result of your measurement (see below) should be the product of both the customer and the team that works on the product.

So, you have a few aspects, and each aspect has a few quality attributes. All that is left is to map metrics to quality attributes. Mapping these is fairly easy; there are quite a bunch of metrics out there, and each metric tries to measure (a piece of) a quality attribute. Some are easy: complexity (the quality attribute) can be measured by, although it is not limited to, the Cyclomatic Complexity metric by McCabe.

So basically you end up with this:

Aspect(n..8) –> Quality Attribute(n) –> Metrics(n)

Where:

  • the total quality of the system is the combination of all aspects (together, the aspects’ relative weights make up 100%)
  • you should keep to eight aspects at most (believe me, more will only make it harder to distinguish between them and make decisions)
  • you should attach quality attributes to each aspect, and determine their relative applicability to this aspect
  • you should attach metrics to quality attributes

So in the end, how do you score quality? Is it possible with this model? Yes, certainly it is.

Once you have found metrics and attached them to quality attributes, you should formulate ‘scoring rules’. This means you need to write down how you will interpret the results of a metric and translate them onto a scale of 1 to 10. A scoring rule could be:

“Lines of Code (LOC) may be at most 1000 for a file. Every 100 lines more subtracts one point. (Giving 10 points at 1000 lines or fewer, and a 1 at 2000 lines.)”

This means a LOC of 1000 will score a 10, a LOC of 1500 scores a 5, and a LOC of 2000 or higher scores a 1.

Do this for all metrics, and eventually you will be able to calculate the total quality of the system.

To make this more concrete, here is an example of such a calculation:

Total Quality Score = sum of scores of each aspect

Aspect score = (Sum of all relative scores of all applicable attributes) * relative importance

Attribute score = (Sum of all relative scores of all applicable metrics)

Example (for the sake of this example, the attributes have been scored already):

Aspect #1 (30% importance, relative to Aspect #2):

  • Attribute A -> applicable for 40%
  • Attribute B -> applicable for 60%

Aspect #2 (70% importance, relative to Aspect #1):

  • Attribute C -> applicable for 70%
  • Attribute B -> applicable for 30%

Scoring:

  • Attribute A scores 7 (out of 10)
  • Attribute B scores 5 (out of 10)
  • Attribute C scores 8 (out of 10)

Total quality calculation:

Aspect #1: A = 7 * 40% = 2.8 and B = 5 * 60% = 3.0, so its absolute score is 2.8 + 3.0 = 5.8

Aspect #2: C = 8 * 70% = 5.6 and B = 5 * 30% = 1.5, so its absolute score is 5.6 + 1.5 = 7.1

Total quality:

Aspect #1 –> 5.8 * 30% (importance) = 1.74

Aspect #2 –> 7.1 * 70% (importance) = 4.97

Total quality score is 1.74 + 4.97 = 6.71

That is a 6.71 on a scale of 1 to 10 (1 being worst, 10 being best).