The Builder Pattern

The Builder pattern is useful when we want to create objects that have many properties to set while keeping them immutable.
Let’s illustrate this concept. Lately, I have been cooking marinated meat. But as you can imagine, it is far better to have all the ingredients mixed before marinating the meat.

We could simply use the constructor, but it is lengthy, and possibly difficult to call if not all the ingredients are ready by the time we want to invoke it.

public class Marinade {
  private Ingredient soySauce;
  private Ingredient garlic;
  private Ingredient gingembre;
  private Ingredient brownSugar;
  private Ingredient onion;

  public Marinade(Ingredient soySauce, Ingredient garlic, Ingredient gingembre, Ingredient brownSugar, Ingredient onion) {
    this.soySauce = soySauce;
    this.garlic = garlic;
    this.gingembre = gingembre;
    this.brownSugar = brownSugar;
    this.onion = onion;
  }
}

The other possibility is to use setters.

public class Marinade {
  private Ingredient soySauce;
  private Ingredient garlic;
  private Ingredient gingembre;
  private Ingredient brownSugar;
  private Ingredient onion;

  public Marinade() {}

  public void setSoySauce(Ingredient soySauce) {
    this.soySauce = soySauce;
  }

  public void setGarlic(Ingredient garlic) {
    this.garlic = garlic;
  }
  …
}

The problem with setters is that the properties are mutable. In our case, we do not want such a thing. Once our marinade is made, it is too late! Here comes the Builder Pattern.

public class Marinade {
  private final Ingredient soySauce;
  private final Ingredient garlic;
  private final Ingredient gingembre;
  private final Ingredient brownSugar;
  private final Ingredient onion;

  // the constructor is private: a Marinade can only be created through its Builder
  private Marinade(Builder builder) {
    this.soySauce = builder.soySauce;
    this.garlic = builder.garlic;
    …
  }

  public static class Builder {
    private Ingredient soySauce;
    private Ingredient garlic;
    private Ingredient gingembre;
    private Ingredient brownSugar;
    private Ingredient onion;

    public Builder() {}

    public Marinade build() {
      return new Marinade(this);
    }

    public Builder soySauce(Ingredient soySauce) {
      this.soySauce = soySauce;
      return this; // returning the builder lets us chain the calls
    }

    public Builder garlic(Ingredient garlic) {
      this.garlic = garlic;
      return this;
    }
    …
  }
}

In our case, the builder is an inner class. To build a marinade, we simply need:

Marinade marinade = new Marinade.Builder()
    .soySauce(soySauce)
    .garlic(garlic)
    …
    .build();

Simple and elegant… and the properties are immutable.
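
If the ingredients ever need to be read back, the Marinade class can expose getters only (and no setters), so it stays immutable once built. A minimal sketch of what that could look like inside Marinade:

  public Ingredient getSoySauce() {
    return soySauce;
  }

  public Ingredient getGarlic() {
    return garlic;
  }
  … and so on for the other ingredients.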

The Observer Pattern (or Publisher-Subscriber)

The “Gang of Four” defines it as “Define a one-to-many dependency between objects so that when one object changes state, all its dependents are notified and updated automatically” (1).

Why would I need such a thing? Let me illustrate the answer.

When I was a teenager, mobile phones were not yet available. My friends and I decided to spend our Saturday afternoon at the mall. We called each other to set up a time and a place to meet. When I got there, I saw no one. After waiting for a while, I decided to leave. Once home, I called one of my friends, who told me they had cancelled. Someone had called someone else, who called someone else, while another called another person, and so on. Unfortunately, by the time the message was supposed to reach me, I was already on the city bus.

It would have been great to have a Facebook group that we had all joined, and to get the cancellation message quickly. You see, the problem was that the message somehow got lost between friends. Maybe some got busy with other things. Maybe others forgot to update me, or thought someone else had done it before, etc.

That’s where the Observer pattern is useful. My illustration is not a perfect fit, but let’s say the person who cancelled was the leader of the group. He is the one in command, and he decides what to do. In the Observer pattern, he is usually called the “subject,” and each member of the group an “observer.”

Each observer decides to get notifications from the subject.

public interface Observer {
  void update(String message);
}

public class GroupLeader {
  private final List<Observer> observers = new ArrayList<>();

  public void attach(Observer obs) {
    // add to the list
    observers.add(obs);
  }

  public void detach(Observer obs) {
    // remove from the list
    observers.remove(obs);
  }

  public void notifyObservers(String message) {
    // go through the list and notify each observer of a change
    for (Observer obs : observers) {
      obs.update(message);
    }
  }
}

public class GroupMember implements Observer {
  public void update(String message) {
    // do something with the message
  }
}
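
To tie it together, here is a short usage sketch (using the classes above; the message is just an example):

GroupLeader leader = new GroupLeader();
GroupMember me = new GroupMember();
GroupMember friend = new GroupMember();

leader.attach(me);
leader.attach(friend);

// every member in the list gets the cancellation at the same time
leader.notifyObservers("Saturday at the mall is cancelled");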

The advantages are:

  1. Each member receives the same message.
  2. Each member does not need to worry about the others.
  3. Each member can be responsible for their own task.

This pattern is a top-down behavior: there is only one object that can send messages. A better design pattern for group messaging may be the Mediator pattern, where multiple objects can send messages to the others. Which one should you choose? It really depends on what you are trying to accomplish.

(1) Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley Professional, 1994, p. 293.

Constructors vs Static Factory Method

I have wondered if this topic was relevant. I have often seen both ways to create an object. Why bother choosing between a static factory method and a constructor? It seemed to me that they do more or less the same thing, and the advantages of the static factory method were not so obvious. Why not use the constructor to do what it was invented for: constructing objects?! The choice did and still does feel a bit artsy (a matter of preference). But let me list a few pros for the static method, and you may change your mind.

Advantage #1: name it!

How the heck is this more advantageous? Well, clarity. You can name your static method in a way that says what it does. For example, if you want to make a salad from your garden’s lettuce, you can write:

Salad salad = Salad.from(lettuce);

instead of

Salad salad = new Salad(lettuce);

Hmm?? Was it better? Ok, fine. Not that much. But if you need arguments to create your object, you may want factory methods with different names. It can be particularly useful when the parameters are of the same type. For example, I want to make a salad:

public Salad(Green lettuce, Green spinach) {
// do something
}

Now if I want to make a salad with a different mix of greens, assuming it requires a different behavior, it will look like:

public Salad(Green lettuce, Green roman) {
// do something else. Note: this will not compile, since the signature is identical to the previous constructor
}

The signature is the same, so this does not compile. Of course, we could figure out something else. With static factory methods, it looks more elegant:

Salad darkMix = Salad.makeDarkGreenMix(lettuce, spinach);
Salad lightMix = Salad.makeLightGreenMix(lettuce, roman);

Well, this example was not very elegant…
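
For reference, here is a rough sketch of what those factory methods could look like behind the scenes (the private constructor and the list of greens are my assumptions, not part of the original example):

import java.util.List;

public class Salad {
  private final List<Green> greens;

  private Salad(List<Green> greens) {
    this.greens = greens;
  }

  // the method names tell the caller which mix they get, even though
  // both methods take the same parameter types
  public static Salad makeDarkGreenMix(Green lettuce, Green spinach) {
    return new Salad(List.of(lettuce, spinach));
  }

  public static Salad makeLightGreenMix(Green lettuce, Green roman) {
    return new Salad(List.of(lettuce, roman));
  }
}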

Advantage #2: save resources!

A static factory method does not have to create a new object. The common example is the getInstance() method Java developers often use to create singletons.

public class SaladChef {
  private static SaladChef chef; // the single shared instance

  public static SaladChef getInstance() {
    if (chef == null) { // create the instance only on the first call
      chef = new SaladChef();
    }
    return chef;
  }
}

Another example would be a pool of connections. It can help avoid wasting resources.

Advantage #3: return a subclass!

A static factory method can return a subclass, which a constructor cannot. It can help to manage the type of object returned depending on the parameters.

public static Macbook buy(double amount) {
  if (amount < 1000) {
    return new MacbookAir();
  } else {
    return new MacbookPro();
  }
}
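
A quick usage sketch, assuming buy() lives in the Macbook class and that MacbookAir and MacbookPro both extend Macbook:

Macbook cheaper = Macbook.buy(899);   // returns a MacbookAir
Macbook pricier = Macbook.buy(1299);  // returns a MacbookPro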

But there are two cons: classes without public or protected constructors cannot be subclassed, and static factory methods are harder to find than constructors.

Honestly, I still question the systematic use of static factory methods. It is a nice way to code, but it must be used with caution. And I think it is still a matter of art.

The Visitor Design Pattern

Design patterns are the fun part of Object-Oriented Programming. They are the artsy side of OOP! In this post, I am illustrating the Visitor pattern to understand what it is and what it is meant for.

When I moved to the US, I asked if doctors visited their patients. I was told it was only for emergencies. But in France, it is a common practice. A doctor may have scheduled days to visit patients. His secretary would set up the appointments and let him know where and when.

In the Visitor pattern, the secretary is what the pattern calls the “client,” or the dispatcher. She has the list (the object structure) of patients (the “elements,” which are the data objects) for the day, and calls the patients to make sure they “accept” the doctor (the “visitor,” who performs the operation) when he comes for the visit.

The doctor performs a different operation depending on the patient. For example, Fred is a patient who has the flu, and Bob has a broken wrist. When the doctor visits Fred, he will prescribe antibiotics, but for Bob, he will ask him to go to the clinic to get a cast.

In pseudo code, the visitor pattern looks like:

for (patient in the list) {
  secretary.calls(patient.accept(doctor))
}

class patient {
  function accept(doctor) {
    doctor.performVisit(self)
  }
}

// sub classes
class patientWithFlu {
  function accept(doctor) {
    getMouthMask()
    doctor.performVisit(self)
  }
}
class patientWithBrokenBone {
  function accept(doctor) {
    getTylenol();
    doctor.performVisit(self)
  }
}

class doctor {
  function performVisit(patientWithFlu) {
    checkIfHasFever(patientWithFlu)
    prescribe(patientWithFlu, antibiotics)
  }

  function performVisit(patientWithBrokenBone) {
    checkBone(patientWithBrokenBone)
    prescribe(patientWithBrokenBone, painMedicine)
    putCast(patientWithBrokenBone)
  }
}

Obviously, one of the disadvantages is the complexity it brings, but the advantage is the flexibility to change the operations without affecting the data objects. For example, let’s say the regulations change tomorrow and casts are required to be done in clinics. Our traveling doctor cannot do it anymore during his visit. The doctor class is the only one that needs to be changed. The patientWithBrokenBone class stays the same.

function performVisit(patientWithBrokenBone) {
  checkBone(patientWithBrokenBone)
  prescribe(patientWithBrokenBone, painMedicine)
  giveContactInfo(patientWithBrokenBone, clinicA)
}

In Summary:
Client
-> calls each element of a list to accept the visitor

Element (each element requires a different type of operation)
-> accepts the visitor and lets him do what he is supposed to do

Visitor
-> performs the operation with the element
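
For those who prefer Java to pseudo code, here is a minimal sketch of the same structure (the class and method names are illustrative, not a definitive implementation):

interface Patient {            // the "element"
  void accept(Doctor doctor);
}

class PatientWithFlu implements Patient {
  public void accept(Doctor doctor) {
    doctor.visit(this);        // resolves to the flu-specific overload
  }
}

class PatientWithBrokenBone implements Patient {
  public void accept(Doctor doctor) {
    doctor.visit(this);        // resolves to the broken-bone overload
  }
}

class Doctor {                 // the "visitor"
  public void visit(PatientWithFlu patient) {
    // check for fever, prescribe antibiotics...
  }

  public void visit(PatientWithBrokenBone patient) {
    // check the bone, prescribe pain medicine, give the clinic's contact info...
  }
}

The client (the secretary) then simply walks through the list and lets each patient accept the doctor:

for (Patient patient : patients) {
  patient.accept(doctor);
}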

Elasticsearch, Logstash and Kibana (ELK) v7.5 … running on Docker Swarm

Docker is a great concept. The idea is to create containers that we can simply download and run on servers without going through the installation process again and again… But (you were expecting it!) it adds its own complexity. We need to learn how Docker works, which seems to be endless madness. There are quite a few commands to learn and configurations to write. And lastly, we have to hope that the software has some sort of support for Docker.

Creating Docker images

I created customized images of ELK 7.5 to embed the configuration and, in Logstash’s case, to update the RabbitMQ output plugin. For each component, we need a Dockerfile with its configuration.

Elasticsearch

FROM docker.elastic.co/elasticsearch/elasticsearch:7.5.0

# Change ownership of the data files
USER root
RUN chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/data
USER elasticsearch

# copy configuration files from local to container
COPY config/elasticsearch.yml /usr/share/elasticsearch/config/elasticsearch.yml

Kibana

FROM docker.elastic.co/kibana/kibana:7.5.0

# Add your kibana plugins setup here
# Example: RUN kibana-plugin install 
COPY config /usr/share/kibana/config

Logstash

FROM docker.elastic.co/logstash/logstash:7.5.0

# Copy configuration files
COPY config /usr/share/logstash/config
COPY pipelines /usr/share/logstash/pipelines

# Change ownership of the data files
USER root
RUN chown -R logstash:logstash /usr/share/logstash/data
USER logstash

# needed to upgrade the rabbitmq plugin: the 7.0.0 version packaged in the original image does not work
RUN ["logstash-plugin", "update", "logstash-integration-rabbitmq"]

Elasticsearch Configuration

Configuration is where the difficulties start. I won’t detail Kibana’s since it is pretty straightforward, but Elasticsearch and Logstash need more explanation. In my case, I am deploying the containers as a Docker Swarm, which implies clustering. The problem with Elasticsearch is that it has its own clustering system. It is not a simple replication of one container.

For Elasticsearch/Kibana (since Kibana stores its data as Elasticsearch indices) to work with multiple nodes, you need to configure the following parameters. ES revamped the way clustering works in 7.x. I tried to configure ES as a single service replicated by Docker Swarm, but it just does not work at this point. The problem resides in the way ES manages the list of masters and the host names.

My configuration looked like:

discovery.seed_hosts=elasticsearch
# list of the master nodes
cluster.initial_master_nodes=my-master-node-1,my-master-node-2,etc. 
node.name={{.Node.Hostname}}

I even tried to update the node name with my-es-node.{{.Task.Slot}} based on https://github.com/deviantony/docker-elk/issues/410, but the problem is that ES automatically generates a string appended to the node name. The only solution was to create a separate ES configuration for each node.

Warning: this configuration does not guarantee that an instance of the Elasticsearch container will run on your master node(s). That is why we decided to duplicate the Elasticsearch configuration for each node, even though it diminishes the benefits of using Docker Swarm. For example:

elasticsearch-01:
  image: localhost:5000/elasticsearch-logging:1.0
  deploy:
    ...
  volumes:
    ...
  environment:
    ...
    - discovery.seed_hosts=elasticsearch-02,elasticsearch-03
    - cluster.initial_master_nodes=elasticsearch-01,elasticsearch-02,elasticsearch-03
elasticsearch-02:
  image: localhost:5000/elasticsearch-logging:1.0
  deploy:
    ...
  volumes:
    ...
  environment:
    ...
    - discovery.seed_hosts=elasticsearch-01,elasticsearch-03
    - cluster.initial_master_nodes=elasticsearch-01,elasticsearch-02,elasticsearch-03
elasticsearch-XX:
  ...

This solution is not ideal. The configuration is duplicated, but worse, we would lose the “load balancing” benefit of Docker Swarm: Kibana and Logstash would have to point to one specific node. That is why I created a network alias shared by all the ES masters, so that Kibana and Logstash can address them as “elasticsearch” instead of elasticsearch-01 or 02:

networks:
  my-network:
    aliases:
      - elasticsearch

Elasticsearch 7 also requires a distinct data folder for each node. In each node’s service definition, map a dedicated data volume:

volumes:
  # data path must be unique to each node: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html#max-local-storage-nodes
  # folder must be created on each host before starting the swarm: https://docs.docker.com/engine/swarm/services/#data-volumes
  - /my/node/data/folder:/usr/share/elasticsearch/data:rw

Logstash Configuration

Logstash requires volume configuration for persistence, and network configuration if you need to reach a container on an external network. In my case, Logstash is used to send data to a RabbitMQ exchange.

  volumes:
    - /my/logstash/data:/usr/share/logstash/data:rw
    - /my/logstash/pipelines:/usr/share/logstash/pipelines:ro
  networks:
    - my-logstash-net
    - rabbitmq-net

networks:
  my-logstash-net:
    external: true # depends on your configuration
  rabbitmq-net:
    external: true

Now you need to configure the pipelines, which connect an input with an output. One useful setting is the auto-reload of the pipeline configuration:

config.reload.automatic: true # to add in logstash.yml

In this case, I am using Elasticsearch as an input to send data to a RabbitMQ exchange. First, we need to define the pipeline in pipelines.yml:

- pipeline.id: my-pipe-id
  path.config: /usr/share/logstash/pipelines/my-pipe.conf # define where the configuration file is

I created my-pipe.conf with the following configuration. It is worth noting the need to schedule the elasticsearch input, as well as to set the index to use.

# Configuration to send the error logs to notification modules
input {
  elasticsearch {
    hosts => ['elasticsearch:9200']
    schedule => "*/15 * * * *" # every 15 minutes. If not scheduled, the pipeline does not work: the logs said the connection to RabbitMQ "opened" and then "closed" a couple of seconds later
    # the regexp matches everything containing "exception" or "error"; the range filter only keeps
    # data logged in the past 15 minutes, since the pipeline is triggered every 15 minutes
    query => '{
      "query": {
        "bool": {
          "must": {
            "regexp": { "log": ".*exception.*|.*error.*" }
          },
          "filter": {
            "range": { "@timestamp": { "gte": "now-15m" } }
          }
        }
      }
    }'
    index => "my-custom-index-*" # !! by default, the index is not *, but logstash-* which is an issue if you configured Elasticsearch indices with a different prefix!
  }
}

filter {
  mutate {
    add_field => {
      "log" => "Added by the filter: %{message}" # update as needed
    }
  }
}

output {
  rabbitmq {
    id => "my_id"
    host => "rabbitmq"
    port => "5672"
    user => "my_rabbitmq_user"
    password => "my_rabbitmq_pwd" # Unfortunately, using a file or docker secret is not possible for now...
    connect_retry_interval => 10
    exchange => "my-exchange-name"
    exchange_type => "topic" # if "topic" is your exchange type
    codec => json
    key => "my-routing-key"
  }
}

In the case of a file input, there is no schedule, but this setting is necessary to feed the pipeline:

start_position => "beginning"

Then build the Docker images and push them to the registry with the following commands:

docker build -t my-registry/my-image:my-tag .
docker push my-registry/my-image:my-tag

Deployment on Docker Swarm

When deploying on the Swarm, there are a few differences to account for. First, you need to create the volume folders for Elasticsearch and Logstash: docker stack does not automatically create missing folders, contrary to docker-compose. Second, change the permissions on the mapped folders so the elasticsearch and logstash users are able to write data, e.g. chmod 777 /mnt/my/data/folder (on the host, not /usr/share/… in the containers).

Then deploy with:

docker stack deploy -c my-docker-stack.yml my-cluster-name

If you see Kibana erroring in the logs, or if you are not able to create and retrieve an index pattern, then you probably messed up the Elasticsearch clustering configuration. Here is an example of the errors I saw in the logs:

Error: [resource_already_exists_exception] index [.kibana_1/ZeFEi9gkSESfQYSOkZwoFA] already exists, with { index_uuid=\"ZeFEi9gkSESfQYSOkZwoFA\" & index=\".kibana_1\" }"}

Good Luck!! Hours and hours can be spent on Docker…

Mockk framework: Could not initialize class io.mockk.impl.JvmMockKGateway

If you are looking for a mocking framework for Kotlin tests, Mockk is definitely first on the list. I wrote my first test following a simple example from the official website (https://mockk.io/#dsl-examples). In my case, I am testing an exporter class that uses a parser to deserialize a file into an object:

@Test
fun `test export excluding sub-folders`() {
    // given
    val path = "my/new/path"
    val parser = mockk<Parser>() // mockk() needs the mocked type; "Parser" here stands for whatever parser class is being mocked
    every { parser.buildObject("$path/data.xml") } returns MyObject("first name", "last name", "title", "manager")

    // when
    val myObject = exporter.parse(
            parser,
            getResourceFile(path),
            mutableListOf()
        )

    // then
    verify { parser.buildObject("$path/data.xml") }
    confirmVerified(parser) // verifies all the verify calls were made
    assertEquals("first name", myObject.firstName)
    assertEquals("last name", myObject.lastName)
    assertEquals("title", myObject.title)
    assertEquals("manager", myObject.manager)

}

Wunderbar! I ran my test… and it failed:

java.lang.NoClassDefFoundError: Could not initialize class io.mockk.impl.JvmMockKGateway

What in the world is that?! I checked my pom file and verified I had the latest version:

<dependency>
  <groupId>io.mockk</groupId>
  <artifactId>mockk</artifactId>
  <version>1.9.3</version>
  <scope>test</scope>
</dependency>

After a while, I figured out that the issue resides in the IDE (https://github.com/mockk/mockk/issues/254). Mockk has issues when running on JDK 11. I went to IntelliJ IDEA -> File -> Project Structure, and changed the Project SDK to JDK 8 (which I use for my current project). You can probably find another version that works for your own project.

Defying Windows’ 260 character paths!

Windows is known for limiting paths to 260 characters, which can be an issue when you deal with Java packages. You cannot write, rename or delete files once the limit is reached. However, there are a few tricks that can help you work around this issue if you cannot shorten your paths.

Since Windows 10 Build 14352, it is possible to enable long paths… but it only works with PowerShell (!!). Windows Explorer and third-party applications still do not support them. (Please do not tell me this is weird…)

  1. Enable long paths on Windows
    1.1 Go to the Start menu and open the Group Policy Editor by searching for “gpedit”
    1.2 Go to Local Computer Policy → Administrative Templates → System → Filesystem and edit “Enable Win32 long paths”
    1.3 Enable this option

  2. Apply the policy change by running this command in a terminal:

gpupdate /target:computer /force

If you are using Git to clone a repository that contains long paths, Git has a setting to handle them. For that, open a new terminal as admin (I repeat, “as admin”), and run:

git config --system core.longpaths true

Hopefully, this helps… until Windows finds something more coherent. Why would you disable long paths by default? Unless it is because only PowerShell can use them 🙂