Part one

Spring Boot tutorial: Microservices architecture in PCF and Kubernetes

Michael Gruczel

Moving from the monolith to microservices has a lot of advantages. In part one of this tutorial, Michael Gruczel starts his step-by-step tutorial for developers who want to implement microservices architecture in Kubernetes and Pivotal Cloud Foundry with Spring Boot.

Breaking a monolith into microservices has a lot of advantages, such as faster builds, faster independent deployments, and the possibility to build domain-driven teams. This is the organizational foundation for a holistic microservice approach and DevOps. With the movement from a huge single monolith to hundreds or thousands of different apps, complexity moves from the management of code dependencies to the management of complex orchestrations.

In this article, I will explain some of the fundamental principles behind microservices and show some possibilities to implement them. Using simple examples, I will show what this can look like in practice in Kubernetes and Pivotal Cloud Foundry. I will not explain how to operate PCF and Kubernetes; instead, I will focus on the application side.

All code examples can be found on GitHub. If I refer to a folder such as source/chat-app in an example, it is relative to that repository. I assume you have cloned or downloaded the content of the GitHub repo if you want to execute the examples yourself in Kubernetes or PCF.


Kubernetes and Pivotal Cloud Foundry

I decided to use Kubernetes and PCF for the demonstration because I wanted to showcase the concepts with both a PAAS and a CAAS solution. I believe companies should select a PAAS or CAAS solution over a self-made IAAS solution in most cases. Nonetheless, the same principles and frameworks can be applied on IAAS or bare metal, as you will see. This article is written so that you can execute the examples on Kubernetes or PCF step by step. Even if you only want to execute the examples on one of the platforms, I recommend reading both descriptions.

Cloud Foundry is one of the leading open source PAAS solutions, together with OpenShift. I use the paid version from Pivotal for my showcases because I need to write less code myself to demonstrate the examples. If you want to execute my examples yourself in PCF, I recommend signing up at Pivotal to get a temporary free account with some free credit.


Kubernetes is one of the most famous CAAS solutions and one of the most complete ones. If you want to execute my examples in Kubernetes, I recommend signing up for a Google Cloud account with the Container Engine offering, which comes with some free credit.

Both platforms will give you free credit large enough to execute all the examples shown here. Please make sure that you delete all your created services afterwards; I am not responsible for possible costs. Optionally, you can run both offerings in a reduced setup locally or install the solutions somewhere else, but that's not part of this article.

Explicitly declare and isolate dependencies

Some of the most important principles for creating a well-defined microservice architecture can be derived from the twelve-factor app manifest. One factor demands that microservices declare and isolate dependencies: a twelve-factor app never relies on the implicit existence of system-wide packages. This principle should be applied to non-microservice architectures as well. If you have just one type of software and one deployment of one monolith, you could possibly manage the dependencies for this single service externally.

However, with a large number of services, it's just not possible to manage that manually anymore. It no longer makes sense to deploy your application into an application server. These servers often need a huge amount of resources, often far more than a microservice needs.


The promise to build an app once and then run it in different application servers without adaptation never really worked out. It often took a lot of changes to port one app from one application server to another, or even just to another version. Having runtime dependencies in the application server classpath itself, overriding the libraries in your application, was more painful than useful. The executed classpath in your dev stage often differed from the production one, usually because the libraries on the classpath were different.

For various reasons (licensing, performance, or maintenance), the application server used in production often differed from the one used locally. This meant that developers had to wait several minutes to get their code into the appropriate application server for local testing with every software change.

Now, the paradigm has changed. Space is cheap, but development time is expensive. Paying expensive developers and making them wait for the deployment to an application server to test a code change shouldn’t be acceptable any more. Fast start up times, fast feedback, and reproducible results are more important than reducing memory footprint or disk space.

Today, an application should include as many of its dependencies as possible itself. Every dependency should be declared explicitly. "Runs on my machine" is a sentence you don't want to hear. Spring Boot supports this idea, up to a certain level. All dependencies can be managed by a pom file (Maven) or a Gradle file.
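As a sketch of what that looks like in Gradle — the exact plugin versions and dependency set in source/updatedemo may differ — a build file declares every dependency explicitly instead of relying on anything pre-installed on the host:

```groovy
// build.gradle (illustrative sketch, not the exact file from the repo)
apply plugin: 'java'
apply plugin: 'org.springframework.boot'

repositories {
    mavenCentral()
}

dependencies {
    // this single starter pulls in Spring MVC plus an embedded Tomcat;
    // nothing is expected to exist on the target machine except a JRE
    compile('org.springframework.boot:spring-boot-starter-web')
}
```

Because the build file is the single source of truth for dependencies, the classpath in development and production is identical by construction.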


More importantly, Spring Boot applications are self-running JARs. They include an embedded Tomcat in your application, although other options are available. The libraries are built to fit a servlet container and no longer need a full JEE profile. The app runs in production with the same libraries as on your local workstation. Startup times are in the range of seconds, not minutes or more. There is only one dependency which is not bundled and not explicitly defined: the Java runtime itself.

We will see how this is solved in PCF and Docker. You will find the source for a simple example in source/updatedemo. This is a Gradle app which will run in an embedded Tomcat. It will respond to requests like http://localhost:8080/version?name=mike. After creating the app skeleton using Spring, I just had to add a few lines of code to make a working REST service out of it (see Listing 1).

Listing 1

@RestController
public class VersionInfoController {

    private static final String template = "Hello, %s! this is version 0.0.1";

    @RequestMapping("/version")
    public String version(@RequestParam(value="name", defaultValue="unknown user") String name) {
        return String.format(template, name);
    }
}

The magic happens with a single annotation, @SpringBootApplication, on the main class.

With Listing 2, a Tomcat is started and all interfaces defined in the controllers are exposed.

Listing 2

package mgruc.article;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class UpdatedemoApplication {

    public static void main(String[] args) {, args);

The result of the build process is a jar which includes all dependencies.

Now, let’s build and start the example in Listing 3. (The code can be found in source/updatedemo.)

Listing 3

./gradlew build
java -jar build/libs/updatedemo-0.0.1.jar
# open http://localhost:8080/version 
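If you are curious whether the dependencies really ended up inside the jar, you can peek into it. (The internal layout depends on the Spring Boot version: recent 1.x versions place the bundled libraries under BOOT-INF/lib, older versions under /lib.)

```
jar tf build/libs/updatedemo-0.0.1.jar | grep '\.jar$' | head
```

Every library your application needs appears inside the artifact itself, which is exactly the isolation the twelve-factor principle demands.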

Explicitly declare and isolate dependencies in PCF

Let's deploy the jar to PCF. First we have to install the cf CLI client to connect to a PCF installation. You can build the jar yourself or just use the one I've built.

Listing 4

# go to the folder "artefacts" of my repo
# create an account at Pivotal first
cf login -a <api-endpoint>
cf push updatedemo -p updatedemo-0.0.1.jar --random-route

Cloud Foundry will automatically put your jar into a container and distribute it to a virtual machine. It will create internal DNS/routing as well. I selected --random-route in order to ensure that I will not run into conflicts. The created route has the form <app-name>-<random-suffix>.<app-domain>.

The app is automatically bundled with so-called buildpacks. Buildpacks can contain complex application servers or just a Java installation. Several buildpacks are publicly available and pre-installed. Every buildpack has a detection step to check whether it fits your app. You can either define which buildpack should be used, or trust that the right buildpack from the PCF installation will be detected, as in our example.


In our case, a slim Java buildpack will be selected automatically. Of course, you can write your own buildpacks, but that's not the topic of this article. You should be able to open the app now. You can retrieve the app info (and the randomly generated route) with cf app updatedemo. You can then get the version by calling curl on <random-route>/version.

Explicitly declare and isolate dependencies in Kubernetes

Kubernetes uses Docker containers as deployment artefacts. I won't go into detail about the basics of Docker. Essentially, a Docker container is a minimalistic operating system image bundled with only the libraries you really need for your application.

Our first example container will have a compressed size of about 88 MB, including OS and a Java app. A container is built from a description and stored in a Docker registry. A Docker container normally runs only one process and is isolated from the other containers running on the same host. Of course, that's a gross simplification, but it's good enough for these examples.

By using the isolation of containers and defining the full stack (OS, Java, and all packages) for each container, the dependencies can be well defined and controlled. A description of a Docker container could look like Listing 5.

Listing 5

FROM frolvlad/alpine-oraclejdk8:slim
ADD updatedemo-0.0.1.jar app.jar
RUN sh -c 'touch /app.jar'
ENTRYPOINT [ "sh", "-c", "java -jar /app.jar" ]

In this example, I am using the jar we built before (see artefacts/Dockerfile for the Dockerfile example). If a Docker daemon is installed, you can build and run it with a few commands, as in Listing 6.

Listing 6

$ docker build -t updatedemo .
$ docker run -d -p 8080:8080 updatedemo
# open http://localhost:8080/version 

You should be able to call it with ‘curl http://localhost:8080/version?name=Mike’. If you have created an account at Docker Hub, you can upload the container to the public Docker registry, for example as in Listing 7.

Listing 7

$ docker images
$ docker tag <id of the image> <your docker id>/updatedemo:v1
$ docker login
$ docker push <your username>/updatedemo:v1 

In a real scenario you will probably set up your own private registry. You can download and run a Docker container from an existing public or private registry. For example, if you want to use my test image (stored in the public registry), you can execute ‘docker run -d -p 8080:8080 mgruc/updatedemo:v1’. But we want to start a container in the Google cloud.

Kubernetes has a command line client (kubectl) to execute commands against a running Kubernetes installation. Since we are using Google, I will use the Google Cloud Shell (see the Google Cloud Shell documentation for examples). This console already includes an installation of the Kubernetes client, so we can skip the installation for this demo by using the console.

Let's start our simple Spring Boot app. Open the Google Cloud Shell and execute the commands from Listing 8.

Listing 8

gcloud config set compute/zone us-central1-b
gcloud config list
gcloud container clusters create example-cluster
gcloud auth application-default login 

We will create a deployment description like in Listing 9 (updatedemo-v1.yaml) to start our app and deploy it to Kubernetes in Listing 10.

Listing 9

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: updatedemo-deployment
spec:
  replicas: 1 # just 1 pod
  template:
    metadata:
      labels:
        app: updatedemo
    spec:
      containers:
      - name: updatedemo
        image: mgruc/updatedemo:v1
        ports:
        - containerPort: 8080

Now we deploy it to Kubernetes.

Listing 10

kubectl create -f updatedemo-v1.yaml
kubectl get deployment
kubectl describe deployment updatedemo-deployment
kubectl expose deployment updatedemo-deployment --type="LoadBalancer"
kubectl get pods -l app=updatedemo
kubectl get service updatedemo-deployment 

Wait until you get an external IP and check the content: ‘curl http://<external-ip>:8080/version?name=Mike’.

Scale out via the process model

This principle of the twelve-factor app manifest is one of the most essential requirements. There was a time when hardware was getting faster every year, even faster than the increase in demand. When our applications were too slow, we just deployed them on bigger machines. That was wasteful from the very beginning: we kept scaling up the entire monolithic app, even though only some of the processes within the app needed more resources, not the whole thing.

That time is over. Hardware no longer scales at the speed needed to serve a globalized world. Highly resilient and performant systems can be achieved only by scaling out horizontally. Instead of buying faster machines for a running process, we scale by running several instances of the same process in parallel. That means software must be designed so that you can start hundreds of instances of the same software at the same time if needed. Our underlying infrastructure needs to support this.

Scale out via the process model in PCF

Scaling in PCF is simple. If you want to run several instances of the same software, you can just tell PCF to do so. It uses the same source code and the same buildpack and starts several containerized instances distributed across several machines.

PCF has an internal routing system, which makes all instances available under the same URL and by default balances traffic equally (round robin) between them. Horizontal scaling of our example app to 3 instances in PCF and back to 1 instance can be done as in Listing 11.

Listing 11

cf scale updatedemo -i 3
cf scale updatedemo -i 1

Scale out via the process model in Kubernetes

Kubernetes offers horizontal scaling in a way comparable to PCF. It uses the same Docker container and starts several instances distributed across several machines. It also has an internal DNS and routing system. Horizontal scaling of instances in Kubernetes is handled by a replication controller (here, via the replica set behind our Deployment).

For example, you can scale our example app to 3 instances by executing the commands in Listing 12.

Listing 12

kubectl scale --replicas=3 deployment/updatedemo-deployment
kubectl get pods -l app=updatedemo 

Another option is to stick to the deployment-file concept we used in the first example. That means we create a deployment description that starts 4 instances of our app (updatedemo-v1-4-replicas.yaml) instead of one.

Listing 13

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: updatedemo-deployment
spec:
  replicas: 4 # 4 pods
  template:
    metadata:
      labels:
        app: updatedemo
    spec:
      containers:
      - name: updatedemo
        image: mgruc/updatedemo:v1
        ports:
        - containerPort: 8080

You can just use the file from the artefacts folder of my repo if you want to execute this example.

Listing 14 shows the commands to execute.

Listing 14

kubectl apply -f updatedemo-v1-4-replicas.yaml
kubectl get pods -l app=updatedemo 

Kubernetes will add 3 more so-called pods. We'll take a look at the auto-scaling possibilities of Kubernetes later. A pod is a group of one or more containers; all applications in one pod instance are co-located, co-scheduled, and share a context. When we start 4 pods, the containers are distributed and not co-located.

Smooth app updates

The two main reasons to split monoliths into microservices are to gain development speed and to increase release frequency. An update of a service must be executable within seconds and without any downtime. That has implications for the software design.

For example, since a database update might not be executed at exactly the same time and speed as the application update, your application must be designed so that a newer version can run with the old database schema, or the old application version must be able to work with the new version of the database. That's normally easy to implement. Your infrastructure must support the switch to a new version in a seamless fashion as well.

Let's use a simple example again. The apps updatedemo-0.0.1.jar and updatedemo-0.0.2.jar in the artefacts folder of my repo are nearly identical, apart from the version number. We will execute an update from version 0.0.1 to 0.0.2 at runtime in PCF and Kubernetes to see how the systems handle it.

Smooth app updates in PCF

Cloud Foundry normally stops all containers and starts new ones only after all old containers are stopped. The reason is that some applications may not be able to run in two different versions in parallel. Of course, being offline is not acceptable for a truly resilient microservice architecture. The easiest solution is a blue-green deployment, which means there are two stages (blue and green).

We deploy the new version on a stage which is not attached to the load balancer/routing. As soon as the new version is deployed, the traffic is switched from the old stage to the new one. An easy way to do this is to install a plugin. In order to do the deployment, we need to create a manifest file, manifest.yml, which describes the deployment.

Listing 15

applications:
- name: updatedemo
  memory: 1024MB
  random-route: true
  path: updatedemo-0.0.2.jar

Listing 16 shows how to install the plugin, how to execute the blue-green deployment, and how to delete the containers of the old stage afterwards. PCF will create new containers with the new version of our app and then route the traffic to them. The old ones are still available in case you have to roll back. In our case, we delete them.

Listing 16

cf add-plugin-repo CF-Community
cf install-plugin blue-green-deploy -r CF-Community
cf blue-green-deploy updatedemo
cf delete updatedemo-old 

The plugin just executes ordinary PCF commands, so it isn't strictly necessary. Honestly, you have better control over the deployment by doing it without the plugin.
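For illustration, a manual blue-green roll-out with plain cf commands could look roughly like this; the app names are the ones used above, but the domain is a placeholder, and you would verify the green app before deleting blue:

```
cf push updatedemo-green -p updatedemo-0.0.2.jar --no-route   # deploy new version, no traffic yet
cf map-route updatedemo-green <your-domain> -n updatedemo     # green starts receiving traffic
cf unmap-route updatedemo <your-domain> -n updatedemo         # drain the old version
cf delete updatedemo -f                                       # remove blue after verification
```

Doing it by hand like this lets you pause between any two steps, for example to smoke-test the green stage before switching the route.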

Smooth app updates in Kubernetes

We can deploy our new version of the updatedemo app by defining a deployment description (Listing 17, updatedemo-v2-4-replicas.yaml). After executing the commands shown in Listing 18, Kubernetes will execute a rolling update (see Image 1). That means both versions of the application will run at the same time.

Kubernetes replaces only some of the containers at a time until all containers run the new version, so the traffic migrates continuously from the old to the new version. Kubernetes allows you to define an update strategy, including, for example, the maximum number of containers with the new version deployed in parallel, or so-called readiness probes: a test to determine whether a container is healthy, for example a GET on a certain URL which has to return a 200 status code.

Without a defined update strategy, Kubernetes applies the default rules. In our case, it keeps at least 3 of the 4 containers up and running and creates up to 2 new containers at a time. You should now be able to get the new version: ‘curl http://<external-ip>:8080/version?name=Mike’.
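If you want more control than the defaults, the strategy and a readiness probe can be declared in the deployment itself. The values below are illustrative additions, not part of the original example files:

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: updatedemo-deployment
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most 1 pod down during the update
      maxSurge: 2         # at most 2 extra pods with the new version
  template:
    metadata:
      labels:
        app: updatedemo
    spec:
      containers:
      - name: updatedemo
        image: mgruc/updatedemo:v2
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /version   # traffic is routed only once this returns 200
            port: 8080
```

With a readiness probe in place, a new pod that starts but cannot serve requests never receives traffic, which is what makes the rolling update safe.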

Listing 17

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: updatedemo-deployment
spec:
  replicas: 4 # 4 pods
  template:
    metadata:
      labels:
        app: updatedemo
    spec:
      containers:
      - name: updatedemo
        image: mgruc/updatedemo:v2 # the only change: the new image version
        ports:
        - containerPort: 8080

Listing 18

kubectl apply -f updatedemo-v2-4-replicas.yaml
kubectl get pods -l app=updatedemo
kubectl rollout history deployment/updatedemo-deployment 



Image 1: Rolling update

Auto failover

Getting an alert if a machine fails or an application is not reachable any more is important, but if you have thousands of different services, you want a system that is able to repair itself. A machine failure or an application crash should be handled automatically, without any service interruption for the user. PCF and Kubernetes do this for you: if a machine fails, they start the application on a different host; if an application container crashes, new instances are spawned.
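In Kubernetes, this self-healing can be made more precise with a liveness probe. A container fragment like the following (the values are illustrative, not from the original example) tells the kubelet to restart the container whenever the check fails:

```yaml
        livenessProbe:
          httpGet:
            path: /version
            port: 8080
          initialDelaySeconds: 30   # give the JVM time to start
          periodSeconds: 10         # check every 10 seconds
```

Without a probe, Kubernetes only reacts to the process exiting; with one, a hung but still-running application is also restarted automatically.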

Auto scaling

Usually, you want to scale your application horizontally depending on the load. Maybe you have the luxury of paying operations personnel to monitor your monolith, but even then you run into problems. First, you have to scale the full monolith, even if only one process within the app is consuming the resources. Secondly, you have to be able to detect the problems and then spawn new instances within minutes.

If we have to do this with hundreds of apps, we need an automated solution. As soon as the traffic increases or heavy processes are running, we want new instances to be started automatically. As soon as the situation is over, we want to reduce the instances in order to save money. Luckily, PCF and Kubernetes offer basic application auto-scaling options.

Auto scaling in PCF

PCF offers application autoscaling as a service. You can configure the autoscaling behavior in a UI, based on pre-defined metrics (CPU utilization, HTTP latency, or HTTP throughput) and schedules. That means you could define, for example, that on weekends an application should be scaled up to 10 instances if the CPU load per instance exceeds a certain value.

Listing 19 shows how to bind our updatedemo app to an autoscaler, and Image 2 shows the autoscaling service UI.

Listing 19

cf create-service app-autoscaler standard autoscaler
cf bind-service updatedemo autoscaler
cf restart updatedemo 

Image 2: PCF autoscaling UI

Auto scaling in Kubernetes

In Kubernetes, you can autoscale based on CPU utilization as well. In our example, we scale between 1 and 3 instances, scaling up when CPU utilization is over 60%. Listing 20 does the job.

Listing 20

kubectl autoscale deployment updatedemo-deployment --min=1 --max=3 --cpu-percent=60
kubectl get hpa
kubectl describe hpa updatedemo-deployment

You can also define custom metrics provided by the application if you want. Details about the algorithm and about how to create custom criteria can be found in the Kubernetes documentation on horizontal pod autoscaling.

Before we continue with the tutorial, I recommend cleaning up a little by executing Listing 21.

Listing 21

kubectl delete service updatedemo-deployment
kubectl delete deployment updatedemo-deployment
kubectl delete hpa updatedemo-deployment 

Service Discovery

If services want to communicate with each other in a REST-based manner, and if you want to scale all services dynamically and fully automatically, you have two options.

  1. A combination of automated DNS and load balancing: Every time you deploy another instance of your service, it should be reachable under the same address as the other instances, and the traffic should be balanced equally between them. In order to reduce maintenance cost and increase flexibility, this must be automated, which often takes time to implement. Many load balancers offer only limited automation options, and you may need a solution to refresh your DNS caching. That's why PCF and Kubernetes ship their own fully automated solutions for this.
  2. Service discovery: Instead of using a load balancer, you can do the load balancing on the client itself. That means all services need to know the IPs of all other services they want to talk to. A simple way to implement this is a service discovery: every time a new instance of a service is deployed, it registers itself at a discovery service, and every time a service is stopped, it deregisters itself. To cope with crashes, additional timeouts or heartbeats are needed. If a service wants to talk to other service instances of a certain type, it asks the discovery service for all known instances and does the load balancing itself. This seems more complex, but it has several advantages. For example, you can easily deploy a different software version with the same interface under the same service definition/tag in order to do rolling updates or A/B tests. Apart from that, you move another part of your infrastructure into your code. Famous options for this are Consul and Eureka (from Netflix OSS).
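The client-side balancing described in option 2 can be sketched in a few lines. The class below is a hypothetical, minimal round-robin picker over instances fetched from a discovery service; it is illustrative only and not Ribbon's actual implementation:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of client-side round-robin load balancing, as a
// discovery-aware client might do it (NOT Ribbon's real code).
class RoundRobinBalancer {
    private final List<String> instances; // addresses fetched from the discovery service
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> instances) {
        this.instances = instances;
    }

    String nextInstance() {
        // rotate over the known instances; a real client would also refresh
        // this list from the registry and skip unhealthy entries
        int i = Math.floorMod(next.getAndIncrement(), instances.size());
        return instances.get(i);
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb = new RoundRobinBalancer(
                List.of("", ""));
        System.out.println(lb.nextInstance()); // prints
        System.out.println(lb.nextInstance()); // prints
        System.out.println(lb.nextInstance()); // prints
    }
}
```

A real client additionally has to refresh the instance list and handle failures, which is exactly what libraries like Ribbon take off your hands.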

For this demo I will use 2 applications:

  • The weather service returns weather data for a town.
  • The concert service returns concerts available in a town; it retrieves additional weather data for that town from the weather service.

I want to scale the instances, and everything should still work if machines crash and instances are re-started on different hosts, so we have to empower the apps to find each other without changing any configuration. That's called service discovery. There are two libraries which fit perfectly into the Spring Boot world.


Eureka is an open source service discovery solution from the Netflix open source software center. You can create a server which acts as the service discovery by annotating your application with @EnableEurekaServer. In order to connect your application to a Eureka server, you need the dependency and one line of configuration.
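Assuming the standard Spring Cloud Netflix properties, that one line of client configuration could look like this in application.yml (8761 is Eureka's default port; your server address may differ):

```yaml
eureka:
  client:
    serviceUrl:
      defaultZone: http://localhost:8761/eureka/
```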

Listing 22 is an example of a server and Listing 23 of a client.

Listing 22

@SpringBootApplication
@EnableEurekaServer
public class CfEurekaServerApplication {

    public static void main(String[] args) {, args);

Listing 23

// @EnableEurekaClient would work as well, but @EnableDiscoveryClient is more abstract
@SpringBootApplication
@EnableDiscoveryClient
public class Application {

    public static void main(String[] args) {, args);

In order to connect to a server which is registered at Eureka, you can just use the name of the service instead of the IP or the concrete URL, since the discovery client will resolve the concrete IPs. Instead of using a real URL (for example, http://random-route-abcde-weather-app), we will use http://weather-app. The discovery client will use the services registered under the name weather-app. In this example, the concert-app asks for the weather in order to add this information to the concert information.


In this code sample, we can already see the second important library, Ribbon. Ribbon does client-side load balancing and can be enabled on a RestTemplate, for example, with the @LoadBalanced annotation (see Listing 24). That means the service will balance requests between all known instances.

If you want to try it locally, you can use my samples. Just check out the code in /source and start Eureka, the weather-app, and the concert-app by executing ‘./gradlew bootRun’ in each project. The Eureka UI will be available at http://localhost:8761/, the weather service at http://localhost:8090/weather?place=springfield, and the concert service at http://localhost:8100/concerts?place=springfield. The weather-app and the concert-app will connect to Eureka, and the concert-app will connect to the weather-app by retrieving its address from Eureka.

Listing 24

@RestController
public class ConcertInfoController {

    @Bean
    @LoadBalanced
    RestTemplate restTemplate() {
        return new RestTemplate();
    }

    @Autowired
    RestTemplate restTemplate;

    @RequestMapping("/concerts")
    public ConcertInfo concerts(
            @RequestParam(value="place", defaultValue="") String place) {
        // retrieve weather data in order to add it to the concert info;
        // "weather-app" is the logical name registered at Eureka
        Weather weather = restTemplate.getForObject(
                "http://weather-app/weather?place=" + place, Weather.class);
        // ...

Service Discovery in PCF

PCF supports domain-based routing out of the box. That means if I push an app without using random routes, the application will be available under a predictable URL, and if I start more instances, PCF will automatically balance traffic between them. We don't need any service discovery, and we don't need to implement DNS and load balancing ourselves. If we pushed and scaled our demo without a random route, it would be reachable under a URL like updatedemo.<your-app-domain>. That solution is probably good enough for 90% of all use cases.

However, there might be reasons to use service discovery even though it is not strictly needed in PCF. For example, you may want to use random routes, which means you do not know the routes upfront. In this case, a combination of Eureka and Ribbon is very effective, and this solution works outside of PCF as well (in an IAAS solution, for example). We could run Eureka in Cloud Foundry ourselves, but PCF already offers it as an embedded service, so I will use that in Listing 25.

Listing 25

 # folder artefacts    
 cf create-service p-service-registry standard mgruc-service-registry
 cf push mgruc-pcf-weather-app -p weather-app-0.0.1.jar --random-route --no-start
 cf push mgruc-pcf-concert-app -p concert-app-0.0.1.jar --random-route --no-start
 cf bind-service mgruc-pcf-weather-app mgruc-service-registry
 cf bind-service mgruc-pcf-concert-app mgruc-service-registry
 cf start mgruc-pcf-weather-app
 cf start mgruc-pcf-concert-app
 cf app mgruc-pcf-concert-app 

If you now open the concert app, it will automatically detect all instances of the weather app and load-balance requests between them.
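To see the balancing in action, you can hit the concert endpoint a few times; the route placeholder below is whatever ‘cf app mgruc-pcf-concert-app’ printed:

```
for i in 1 2 3; do
  curl "http://<concert-app-route>/concerts?place=springfield"
done
```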



This tutorial continues in Part 2!


Michael Gruczel
As an Advisory Consultant at Dell EMC, Michael Gruczel advises companies on application modernization, DevOps, and Big Data. This consultancy work covers the full stack, including discovery, software development, testing, infrastructure, release, deployment, and operation of software.
