Introduction
1. Introducing Spring Cloud Data Flow for OpenShift
This project provides support for orchestrating long-running (streaming) and short-lived (task/batch) data microservices on OpenShift 3.
2. Spring Cloud Data Flow
Spring Cloud Data Flow is a cloud-native orchestration service for composable data microservices on modern runtimes. With Spring Cloud Data Flow, developers can create and orchestrate data pipelines for common use cases such as data ingest, real-time analytics, and data import/export.
The Spring Cloud Data Flow architecture consists of a server that deploys Streams and Tasks. Streams are defined using a DSL or visually through the browser-based designer UI. Streams are based on the Spring Cloud Stream programming model, while Tasks are based on the Spring Cloud Task programming model. The sections below provide more information about creating your own custom Streams and Tasks.
For more details about the core architecture components and the supported features, please review Spring Cloud Data Flow’s core reference guide. There are several samples available for reference.
3. Spring Cloud Stream
Spring Cloud Stream is a framework for building message-driven microservice applications. Spring Cloud Stream builds upon Spring Boot to create standalone, production-grade Spring applications, and uses Spring Integration to provide connectivity to message brokers. It provides opinionated configuration of middleware from several vendors, introducing the concepts of persistent publish-subscribe semantics, consumer groups, and partitions.
For more details about the core framework components and the supported features, please review Spring Cloud Stream’s reference guide.
There’s a rich ecosystem of Spring Cloud Stream Application-Starters that can be used either as standalone data microservice applications or in Spring Cloud Data Flow. For convenience, we have generated RabbitMQ and Apache Kafka variants of these application-starters that are available for use from Maven Repo and Docker Hub as maven artifacts and docker images, respectively.
Do you have a requirement to develop custom applications? No problem. Refer to this guide to create custom stream applications. There are several samples available for reference.
4. Spring Cloud Task
Spring Cloud Task makes it easy to create short-lived microservices. We provide capabilities that allow short-lived JVM processes to be executed on demand in a production environment.
For more details about the core framework components and the supported features, please review Spring Cloud Task’s reference guide.
There’s a rich ecosystem of Spring Cloud Task Application-Starters that can be used either as standalone data microservice applications or in Spring Cloud Data Flow. For convenience, the generated application-starters are available for use from Maven Repo. There are several samples available for reference.
Features
The Data Flow Server for OpenShift includes the following features over and above those of the Kubernetes Server.
5. Support for Maven Resource
Possibly the most prominent feature of the OpenShift Server, besides the ability to deploy to OpenShift, is its support for Maven resources. The OpenShift Server supports Docker resources (docker://) just like the Kubernetes Server, but can additionally handle Maven resources (maven://), enabled by the OpenShift Build mechanism.
For example, both the below app registrations (via the Data Flow Shell) are valid and supported:
dataflow:>app register --name http-mvn --type source --uri maven://org.springframework.cloud.stream.app:http-source-rabbit:1.2.0.RELEASE
dataflow:>app register --name http-docker --type source --uri docker:springcloudstream/http-source-rabbit:1.2.0.RELEASE
See the Getting Started section for examples of deploying both Docker and Maven resource types.
6. Build Hashing for Maven Resource Apps
When deploying Maven resource (maven://
) based apps, an OpenShift Build
will be triggered to build the Docker image that will in turn be deployed. It is not efficient to trigger a new build
for an app that was already deployed and to which there are no changes detected in the Maven Jar artifact. The resulting image
would essentially be identical every time.
To help with this, the OpenShift Server will create a hash of the Maven artifact located in the local cache. On subsequent deploys of the same app (same Maven artifact) this hash will first be checked against existing buils and if found, a new build will not be triggered but instead the existing image will be used.
This feature can be disabled by specifying spring.cloud.deployer.openshift.forceBuild=true as either a deployer property (affects all deployed apps) or a deployment property (on a per-app basis).
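For example, a minimal sketch of forcing a new build for a single app at deployment time (assuming a stream named test containing an app named http-mvn that was registered as a Maven resource):
dataflow:>stream deploy test --properties "app.http-mvn.spring.cloud.deployer.openshift.forceBuild=true"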
7. Volumes and Volume Mounts
Volumes and volume mounts provide the ability for a Spring Cloud Stream application to access persistent storage made available on the OpenShift cluster. The supported volume and volume mount types are determined by the underlying kubernetes-model library. All of the volume types that have a generated mode are supported.
Volumes and volume mounts can be specified as server deployer properties as well as app deployment properties specified at deployment time. Both ways of defining the volumes and volume mounts are identical, where they are specified as a JSON representation of the kubernetes-client model.
Volumes and volume mounts defined at the deployer level will be added to all deployed apps. This is handy for common shared folders that should be available to all apps.
Below is an example of volumes and volume mounts defined as server deployer properties in the ConfigMap:
spring.cloud.deployer.openshift:
  volumes:
    - name: testhostpath
      hostPath:
        path: /test/hostPath
    - name: testpvc
      persistentVolumeClaim:
        claimName: testClaim
        readOnly: true
    - name: testnfs
      nfs:
        server: 10.0.0.1:111
        path: /test/nfs
  volumeMounts:
    - name: testhostpath
      mountPath: /test/hostPath
    - name: testpvc
      mountPath: /test/pvc
    - name: testnfs
      mountPath: /test/nfs
      readOnly: true
The default value for readOnly is false, i.e. the container requests read/write access.
Below is an example of defining volumes and volume mounts as deployment properties (via the Data Flow Shell):
dataflow:>stream create --name test --definition "time | file"
Created new stream 'test'
dataflow:>stream deploy test --properties "app.file.spring.cloud.deployer.openshift.deployment.volumes=[{name: testhostpath, hostPath: { path: '/test/override/hostPath' }}],app.file.spring.cloud.deployer.openshift.deployment.volumeMounts=[{name: 'testhostpath', mountPath: '/test/hostPath'}]"
Getting Started
The Data Flow Server for OpenShift extends the Kubernetes Server implementation and therefore many of the configuration options and concepts are similar and can in fact be used with the OpenShift server.
Refer to the Spring Cloud Data Flow Server for Kubernetes reference guide.
8. Deploying Streams on OpenShift
The following guide assumes that you have an OpenShift 3 cluster available. This includes both the OpenShift Origin and OpenShift Container Platform offerings.
If you do not have an OpenShift cluster available, see the next section, which describes running a local OpenShift Origin cluster for development/testing; otherwise continue to Installing the Data Flow Server using OpenShift templates.
8.1. A local OpenShift cluster with minishift
There are a few ways to stand up a local OpenShift Origin cluster on your machine for testing, including minishift among others. For the purpose of this guide, the minishift tool will be used.
8.1.1. Installation and Getting Started
Install minishift as per the instructions here.
Once you have installed minishift successfully, you can start up an OpenShift instance with minishift start.
$ minishift start --memory 4096 --cpus 4 --deploy-router
Starting local OpenShift cluster...
oc is now configured to use the cluster.
Run this command to use the cluster:
oc login --username=admin --password=admin
$
The --deploy-router option deploys the default HAProxy Router, which is required to expose and access the Spring Cloud Data Flow UI and other tools.
OpenShift Console
The OpenShift Console is a valuable interface into your cluster; it is recommended that you open the console with:
$ minishift console
Opening OpenShift console in default browser...
$
A browser window will open with the console login page. Log in with the admin/admin credentials.
Make sure you wait for the docker-registry and router deployments to deploy successfully before continuing. These resources are deployed to the default project.
oc CLI tool
You can also manage the local cluster with the oc CLI tool. If you do not have the oc tool installed, follow the instructions here.
Login and use the local instance with:
$ oc login --username=admin --password=admin
Login successful.
You have access to the following projects and can switch between them with 'oc project <projectname>':
* default
kube-system
openshift
openshift-infra
Using project "default".
$
8.2. Creating a new Project
To group the resources created as part of this guide, create a new Project.
You can do this using the Console or the oc tool. Below is an example using the oc tool:
$ oc new-project scdf --description="Spring Cloud Data Flow"
Now using project "scdf" on server "https://192.168.64.13:8443".
...
$
The IP address (192.168.64.13) assigned will vary each time you use minishift start, so adjust accordingly.
The active project should be scdf (check with oc project) and should be the project used for the rest of this guide.
8.3. Installing the Data Flow Server using OpenShift templates
To install a Data Flow Server and supporting infrastructure components to OpenShift, we will use OpenShift templates. Templates allow you to deploy a predefined set of resources with sane default configurations which can be optionally configured via parameters for specific environments.
The templates for the Data Flow Server for OpenShift are available in the src/etc/openshift directory in this project’s GitHub repository.
There are several templates available:
- Data Flow Server only - This template only deploys the Spring Cloud Data Flow Server for OpenShift and no other resources. It provides the capability to configure the Spring Cloud Stream binder implementation, RDBMS and Redis resources, with sane defaults. This template is suited for environments where these resources are already deployed.
- Data Flow Server with ephemeral Datasources - Deploys the Data Flow Server for OpenShift as well as MySQL and Redis containers without persistent volumes, i.e. the data persisted by these containers will be lost on restart.
- Data Flow Server with ephemeral Datasources and Kafka binder - Same as above but with an additional Kafka instance for use as the Spring Cloud Stream binder implementation.
- Data Flow Server with ephemeral Datasources and RabbitMQ binder - Same as above but with a RabbitMQ instance for use as the binder implementation.
8.3.1. Installing the OpenShift templates
You can install the above templates using the OpenShift Console or the oc tool.
You would have to clone or download the Data Flow Server for OpenShift project and import the templates in the src/etc/openshift directory one by one using the Console or oc create -f …
However, a more convenient and the recommended way of installing all the templates is to run the following:
$ curl https://raw.githubusercontent.com/donovanmuller/spring-cloud-dataflow-server-openshift/v1.2.1.RELEASE/src/etc/openshift/install-templates.sh | bash
Installing OpenShift templates into project 'scdf'...
Archive: /tmp/scdf-openshift-templates.zip
inflating: /tmp/scdf-openshift-templates/scdf-ephemeral-datasources-kafka-template.yaml
inflating: /tmp/scdf-openshift-templates/scdf-ephemeral-datasources-rabbitmq-template.yaml
inflating: /tmp/scdf-openshift-templates/scdf-ephemeral-datasources-template.yaml
inflating: /tmp/scdf-openshift-templates/scdf-sa.yaml
inflating: /tmp/scdf-openshift-templates/scdf-template.yaml
Installing template '/tmp/scdf-openshift-templates/scdf-ephemeral-datasources-kafka-template.yaml'
template "spring-cloud-dataflow-server-openshift-ephemeral-kafka" replaced
Installing template '/tmp/scdf-openshift-templates/scdf-ephemeral-datasources-rabbitmq-template.yaml'
template "spring-cloud-dataflow-server-openshift-ephemeral-rabbitmq" replaced
Installing template '/tmp/scdf-openshift-templates/scdf-ephemeral-datasources-template.yaml'
template "spring-cloud-dataflow-server-openshift-ephemeral-datasources" replaced
Installing template '/tmp/scdf-openshift-templates/scdf-sa.yaml'
serviceaccount "scdf" replaced
Installing template '/tmp/scdf-openshift-templates/scdf-template.yaml'
template "spring-cloud-dataflow-server-openshift" replaced
Adding 'edit' role to 'scdf' Service Account...
Adding 'scdf' Service Account to the 'anyuid' SCC...
Templates installed.
$
This will download all the templates and install them into the scdf project by default. It will also create and configure the required Service Account mentioned below.
The project can be specified by using -s scdf after the bash command above.
8.3.2. Creating and configuring Service Accounts
The Data Flow Server requires a Service Account (named scdf
),
which grants it access to perform actions such as reading ConfigMaps and Secrets, creating Builds, etc.
To create the scdf
Service Account, use the oc
tool from the src/etc/openshift
directory:
$ oc create -f scdf-sa.yaml
...
If you used the install-templates.sh script above to install the templates, the scdf Service Account would have already been created for you.
The scdf Service Account must have the edit role added to it in order to have the correct permissions to function properly.
Add the edit role with the following:
$ oc policy add-role-to-user edit system:serviceaccount:scdf:scdf
...
If you used the install-templates.sh script above to install the templates, the scdf Service Account would already have the edit role added to it.
The scdf Service Account also needs to be added to the anyuid Security Context Constraint to allow the MySQL Pod to run using the root user. By default OpenShift starts a Pod using a random user Id.
Add the Service Account to the anyuid SCC group with:
$ oc adm policy add-scc-to-user anyuid system:serviceaccount:scdf:scdf
If you used the install-templates.sh script above to install the templates, the scdf Service Account is already added to the anyuid SCC.
8.3.3. Installing the Data Flow Server
For this guide we’ll use the Data Flow Server with ephemeral Datasources and Kafka binder template to start a Data Flow Server in the scdf project.
First, using the OpenShift Console, click the Add to Project button. You should see the list of templates mentioned above.
Choose the spring-cloud-dataflow-server-openshift-ephemeral-kafka template.
Default configuration values are provided but can be updated to meet your needs if necessary.
To avoid deployments failing due to long image pull times, you can manually pull the required images. Note that you should first change your local Docker client to use the Docker engine in the minishift VM.
The above step is optional, as OpenShift will also pull the required images. However, depending on your network speed, deployments may fail due to a timeout. If this happens, simply start another deployment of the component by clicking the Deploy button when viewing the deployment.
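As a minimal sketch of pre-pulling images from inside the minishift Docker engine (the image names shown are assumptions; use the images referenced by the template parameters you chose):
$ eval $(minishift docker-env)   # point the local Docker client at the minishift VM
$ docker pull mysql:5.6          # hypothetical datasource image
$ docker pull redis:3-alpine     # hypothetical Redis image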
After updating the configuration values or leaving the default values, click the Create button to deploy this template.
Pulling the various Docker images may take some time, so please be patient. Once all the images have been pulled, the various pods will start and should all appear as dark blue circles.
The Data Flow Server will by default deploy apps only in the project in which it itself is deployed, i.e. a Data Flow Server deployed in the default project will not be able to deploy applications to the scdf project. The recommended configuration is a Data Flow Server per project.
Verify that the Data Flow Server has started successfully by clicking on the exposed Route URL.
The UI is mapped to /dashboard.
If you’d like to reset or perhaps try another template, you can remove the Data Flow Server and the other resources created by the template.
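For example, assuming the Kafka template was used (these are the same commands used later in the Configuring the Data Flow Server for OpenShift section):
$ oc delete all --selector=template=scdf-kafka
$ oc delete cm --selector=template=scdf-kafka
$ oc delete secret --selector=template=scdf-kafka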
8.4. Download and run the Spring Cloud Data Flow Shell
Download and run the Shell, targeting the Data Flow Server exposed via a Route.
$ wget http://repo.spring.io/release/org/springframework/cloud/spring-cloud-dataflow-shell/1.2.3.RELEASE/spring-cloud-dataflow-shell-1.2.3.RELEASE.jar
$ java -jar spring-cloud-dataflow-shell-1.2.3.RELEASE.jar --dataflow.uri=http://scdf-kafka-scdf.192.168.64.15.xip.io/
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
1.2.3.RELEASE
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
8.5. Registering Stream applications with Docker resource
Now register, in bulk, all of the out-of-the-box stream applications built with the Kafka binder, using the Docker resource type, with the following command.
For more details, review how to register applications.
dataflow:>app import --uri http://bit.ly/stream-applications-kafka-docker
Successfully registered applications: [source.tcp, sink.jdbc, source.http, sink.rabbit, source.rabbit, source.ftp, sink.gpfdist, processor.transform, source.loggregator, source.sftp, processor.filter, sink.cassandra, processor.groovy-filter, sink.router, source.trigger, sink.hdfs-dataset, processor.splitter, source.load-generator, processor.tcp-client, source.time, source.gemfire, source.twitterstream, sink.tcp, source.jdbc, sink.field-value-counter, sink.redis-pubsub, sink.hdfs, processor.bridge, processor.pmml, processor.httpclient, source.s3, sink.ftp, sink.log, sink.gemfire, sink.aggregate-counter, sink.throughput, source.triggertask, sink.s3, source.gemfire-cq, source.jms, source.tcp-client, processor.scriptable-transform, sink.counter, sink.websocket, source.mongodb, source.mail, processor.groovy-transform, source.syslog]
8.6. Deploy a simple stream in the shell
Create a simple ticktock stream definition and deploy it immediately using the following command:
dataflow:>stream create --name ticktock --definition "time | log" --deploy
Created new stream 'ticktock'
Deployment request has been sent
Watch the OpenShift Console as the two application resources are created and the Pods are started. Once the Docker images are pulled and the Pods are started up, you should see the Pods with dark blue circles:
You can also verify the deployed apps using the oc tool:
$ oc get pods
NAME READY STATUS RESTARTS AGE
...
ticktock-log-0-2-it3ja 1/1 Running 0 7m
ticktock-time-2-sxqnp 1/1 Running 0 6m
To verify that the stream is working as expected, tail the logs of the ticktock-log app, either using the OpenShift Console or the oc tool:
$ oc logs -f ticktock-log
...
... INFO 1 --- [afka-listener-1] log-sink : 11/29/16 14:49:59
... INFO 1 --- [afka-listener-1] log-sink : 11/29/16 14:50:01
... INFO 1 --- [afka-listener-1] log-sink : 11/29/16 14:50:02
... INFO 1 --- [afka-listener-1] log-sink : 11/29/16 14:50:03
... INFO 1 --- [afka-listener-1] log-sink : 11/29/16 14:50:04
... INFO 1 --- [afka-listener-1] log-sink : 11/29/16 14:50:05
... INFO 1 --- [afka-listener-1] log-sink : 11/29/16 14:50:06
...
8.7. Registering Stream applications with Maven resource
The distinguishing feature of the Data Flow Server for OpenShift is that it has the capability to deploy applications registered with the Maven resource type
in addition to the Docker resource type. Using the ticktock
stream example above, we will create a similar stream definition
but using the Maven resource versions of the apps.
For this example we will register the apps individually using the following command:
dataflow:>app register --type source --name time-mvn --uri maven://org.springframework.cloud.stream.app:time-source-kafka:1.2.0.RELEASE
Successfully registered application 'source:time-mvn'
dataflow:>app register --type sink --name log-mvn --uri maven://org.springframework.cloud.stream.app:log-sink-kafka:1.2.0.RELEASE
Successfully registered application 'sink:log-mvn'
We couldn’t bulk import the Maven versions of the apps as we did for the Docker versions, because the app names would conflict: the names defined in the bulk import files are the same across resource types. Hence we register the Maven apps with a -mvn suffix.
8.8. Deploy a simple stream in the shell
Create a simple ticktock-mvn stream definition and deploy it immediately using the following command:
dataflow:>stream create --name ticktock-mvn --definition "time-mvn | log-mvn" --deploy
Created new stream 'ticktock-mvn'
Deployment request has been sent
There could be a slight delay once the above command is issued. This is due to the Maven artifacts being resolved and cached locally. Depending on the size of the artifacts, this could take some time.
Watch the OpenShift Console as the two application resources are created. Notice that this time, instead of the Pods being started, a Build is started instead.
The Build will execute and create a Docker image, using the default Dockerfile, containing the app. The resultant Docker image will be pushed to the internal OpenShift registry, where the deployment resource will be triggered once the image has been successfully pushed. The deployment will then scale the app Pod up, starting the application.
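If you’d like to follow a build as it runs, a sketch using the oc tool (the build name shown matches the example pod listing below but will differ in your environment):
$ oc get builds
$ oc logs -f build/ticktock-mvn-time-mvn-1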
To verify that the stream is working as expected, tail the logs of the ticktock-mvn-log-mvn app using the oc tool:
$ oc get pods
NAME READY STATUS RESTARTS AGE
...
ticktock-mvn-log-mvn-0-1-agpl6 1/1 Running 0 4m
ticktock-mvn-log-mvn-1-build 0/1 Completed 0 1h
ticktock-mvn-time-mvn-1-12ikj 1/1 Running 0 1m
ticktock-mvn-time-mvn-1-build 0/1 Completed 0 1h
$ oc logs -f ticktock-mvn-log-mvn-0-1-agpl6
... INFO 1 --- [afka-listener-1] log-sink : 11/29/16 18:34:23
... INFO 1 --- [afka-listener-1] log-sink : 11/29/16 18:34:25
... INFO 1 --- [afka-listener-1] log-sink : 11/29/16 18:34:26
... INFO 1 --- [afka-listener-1] log-sink : 11/29/16 18:34:27
9. Deploying Tasks on OpenShift
Deploying Task applications using the Data Flow Server for OpenShift is a similar affair to deploying Stream apps. Therefore, for brevity, only the Maven resource version of the task will be shown as an example.
9.1. Registering Task application with Maven resource
This time we will bulk import the Task application, as we do not have any Docker resource versions imported which would cause conflicts in naming. Import all Maven task applications with the following command:
dataflow:>app import --uri http://bit.ly/1-0-1-GA-task-applications-maven
9.2. Launch a simple task in the shell
Let’s create a simple task definition and launch it.
dataflow:>task create task1 --definition "timestamp"
dataflow:>task launch task1
Note that when the task is launched, an OpenShift Build is started to build the relevant Docker image containing the task app. Once the Build has completed successfully, pushing the built image to the internal registry, a bare Pod is started, executing the task.
Verify that the task executed successfully by executing these commands:
dataflow:>task list
╔═════════╤═══════════════╤═══════════╗
║Task Name│Task Definition│Task Status║
╠═════════╪═══════════════╪═══════════╣
║task1 │timestamp │complete ║
╚═════════╧═══════════════╧═══════════╝
dataflow:>task execution list
╔═════════╤══╤═════════════════════════════╤═════════════════════════════╤═════════╗
║Task Name│ID│ Start Time │ End Time │Exit Code║
╠═════════╪══╪═════════════════════════════╪═════════════════════════════╪═════════╣
║task1 │1 │Wed Nov 30 13:13:02 SAST 2016│Wed Nov 30 13:13:02 SAST 2016│0 ║
╚═════════╧══╧═════════════════════════════╧═════════════════════════════╧═════════╝
You can also view the task execution status by using the Data Flow Server UI.
9.2.1. Cleanup completed tasks
If you want to delete the Build and Pod created by this task execution, execute the following:
dataflow:>task destroy --name task1
Configuration
As the OpenShift Server is based on the Kubernetes Server, the configuration is mostly identical. Please see the Kubernetes Server reference guide for configuration options.
10. Maven Configuration
The Maven configuration is important for resolving Maven app artifacts. The following example taken from the Data Flow Server only template, configures a remote Maven repository:
maven:
  resolvePom: true
  remote-repositories.spring:
    url: http://repo.spring.io/libs-snapshot
    auth:
      username:
      password:
The resolvePom property is important for determining the build strategy used.
See the OpenShift templates for reference.
More configuration options can be seen in the Configure Maven Properties section of the Data Flow reference documentation.
11. Dockerfile Resolution Strategies
The Data Flow Server for OpenShift uses the Docker build strategy. The default strategy for resolving a Dockerfile is to use the built-in Dockerfile included in the OpenShift deployer. However, there are three other strategies available:
- If a remote Git URI is specified when creating the stream/task definition using the spring.cloud.deployer.openshift.build.git.uri property, this repository will be used and takes highest precedence.
- If src/main/docker/Dockerfile is detected in the Maven artifact Jar, then it is assumed that the Dockerfile will exist in that location in a remote Git repository. In that case, the Git repository source is used in conjunction with the Docker build strategy. The remote Git URI and ref are extracted from the <scm><connection></connection></scm> and <scm><tag></tag></scm> tags in the pom.xml (if available) of the Maven Jar artifact. For example, if the <scm><connection> value was scm:git:git@github.com:spring-cloud/spring-cloud-dataflow.git, then the remote Git URI would be parsed as ssh://git@github.com:spring-cloud/spring-cloud-dataflow.git. In short, the Dockerfile from the remote Git repository for the app being deployed will be used (OpenShift actually clones the Git repo) as the source for the image build. Of course, you can include and customise whatever and however you like in this Dockerfile.
- The other strategy uses the contents of a Dockerfile located in one of three locations as the Dockerfile source:
  - The file system location of a Dockerfile indicated by the spring.cloud.deployer.openshift.deployment.dockerfile deployment property. E.g. --properties "spring.cloud.deployer.openshift.deployment.dockerfile=/tmp/deployer/Dockerfile". The contents of this file will be used as the source input for the build.
  - The inline Dockerfile content as provided in the spring.cloud.deployer.openshift.deployment.dockerfile deployment property. E.g. --properties "spring.cloud.deployer.openshift.deployment.dockerfile=FROM java:8\n RUN wget …"
  - The default Dockerfile provided by the OpenShift deployer.
Server Implementation
12. Server Properties
The Spring Data Flow Server for OpenShift is a specialisation of the Spring Cloud Data Flow Server for Kubernetes. Therefore, all properties supported by the Kubernetes Server are supported by the OpenShift server.
The spring.cloud.deployer.kubernetes prefix should be replaced with spring.cloud.deployer.openshift.
See Data Flow Server for Kubernetes reference documentation for supported properties.
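For example, the livenessProbeDelay deployment property documented for the Kubernetes Server would, as a sketch, be specified for the OpenShift Server as follows (the stream and app names are assumptions; verify the property name against the Kubernetes Server documentation):
dataflow:>stream deploy test --properties "app.log.spring.cloud.deployer.openshift.livenessProbeDelay=120"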
12.1. OpenShift Specific Properties
The following properties are specific to the Data Flow OpenShift Server.
Name | Usage Example | Description
---|---|---
Force Build | spring.cloud.deployer.openshift.forceBuild=true | Ignore the build hashing feature when deploying streams and always trigger a new build for Maven based apps
Default Routing Subdomain | | Provide the routing subdomain used when building Route URLs
Default Image Tag | | The default Docker image tag to be used when creating Build and DeploymentConfig resources
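A minimal sketch of setting one of these as a server deployer property in the ConfigMap, following the same style as the volumes example earlier (the value shown is an assumption):
spring.cloud.deployer.openshift:
  forceBuild: true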
Application Properties
The following application properties are supported by the Data Flow Server for OpenShift. These properties are passed as application properties when defining streams or tasks. Below is an example of defining a stream:
dataflow:>stream create --name test --definition "time | custom --spring.cloud.deployer.openshift.build.git.uri=https://github.com/donovanmuller/timely-application-group.git | log"
Created new stream 'test'
Note the application property spring.cloud.deployer.openshift.build.git.uri=https://github.com/donovanmuller/timely-application-group.git.
13. Supported Application Properties
Name | Usage Example | Description
---|---|---
Build Git URI | spring.cloud.deployer.openshift.build.git.uri=https://github.com/donovanmuller/timely-application-group.git | The remote Git repository URI that will contain a Dockerfile (by default in src/main/docker)
Git Branch Reference | spring.cloud.deployer.openshift.build.git.ref=master | The Git branch/reference for the repository specified by the Build Git URI
Dockerfile Path | spring.cloud.deployer.openshift.build.git.dockerfile=src/main/docker | The location, relative to the project root of the Git repository, where the Dockerfile is located
Git Repository Secret | spring.cloud.deployer.openshift.build.git.secret=github-secret | If the remote Git repository requires authentication, use this secret
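For example, a hypothetical stream definition that specifies both the remote Git URI and the branch reference for a custom processor app (assumed to be registered as a Maven resource) might look like this:
dataflow:>stream create --name git-ref-test --definition "time | custom --spring.cloud.deployer.openshift.build.git.uri=https://github.com/donovanmuller/timely-application-group.git --spring.cloud.deployer.openshift.build.git.ref=master | log"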
Deployment Properties
The following deployment properties are supported by the Data Flow Server for OpenShift. These properties are passed as deployment properties when deploying streams or tasks. Below is an example of deploying a stream definition:
dataflow:>stream create --name test --definition "time | custom | log"
Created new stream 'test'
dataflow:>stream deploy test --properties "app.custom.spring.cloud.deployer.openshift.defaultDockerfile=Dockerfile.nexus"
Deployment request has been sent for stream 'test'
Note the deployment property app.custom.spring.cloud.deployer.openshift.defaultDockerfile=Dockerfile.nexus.
14. Supported Deployment Properties
Name | Usage Example | Description
---|---|---
Force Build | spring.cloud.deployer.openshift.forceBuild=true | A flag (true/false) indicating whether to ignore the build hashing feature when deploying streams and always trigger a new build for Maven based apps
Service Account | spring.cloud.deployer.openshift.deployment.service.account=scdf | OpenShift ServiceAccount that containers should run under
Docker Image Tag | spring.cloud.deployer.openshift.image.tag=latest | The Docker image tag for the Image Stream used when creating the Deployment
Inline Dockerfile | spring.cloud.deployer.openshift.deployment.dockerfile='FROM java:8\nRUN echo "Custom Dockerfile…"' | An inline Dockerfile that will be used to build the Docker image. Only applicable to Maven resource apps
Node Selector | spring.cloud.deployer.openshift.deployment.nodeSelector=region: primary,role: streams | A comma separated list of node selectors (in the form name: value) which will determine where the app’s Pods get assigned
Default Provided Dockerfile | spring.cloud.deployer.openshift.defaultDockerfile=Dockerfile.nexus | Specify which default Dockerfile to use when building Docker images. There are currently two supported default Dockerfiles
Create Route | spring.cloud.deployer.openshift.createRoute=true | A flag (true/false) indicating whether a Route should be created for the app. Analogous to spring.cloud.deployer.kubernetes.createLoadBalancer
Route Host Name | spring.cloud.deployer.openshift.deployment.route.host=myapp.mycompany.com | Provide a Route Host value that the created Route will expose as the URL to the app
Volume Mounts | spring.cloud.deployer.openshift.deployment.volumeMounts=[{name: 'testhostpath', mountPath: '/test/hostPath'}, {name: 'testpvc', mountPath: '/test/pvc'}, {name: 'testnfs', mountPath: '/test/nfs'}] | A list of kubernetes-model supported volume mounts. Specified as a JSON representation
Volumes | spring.cloud.deployer.openshift.deployment.volumes=[{name: testhostpath, hostPath: { path: '/test/override/hostPath' }}, {name: 'testpvc', persistentVolumeClaim: { claimName: 'testClaim', readOnly: 'true' }}, {name: 'testnfs', nfs: { server: '10.0.0.1:111', path: '/test/nfs' }}] | A list of kubernetes-model supported volumes. Specified as a JSON representation. Volumes must have corresponding volume mounts, otherwise they will be ignored
Labels | spring.cloud.deployer.openshift.deployment.labels=project=test,team=a-team | A comma separated list of labels (in the form name=value) that will be added to the app
Create Node Port | spring.cloud.deployer.openshift.createNodePort=true | Create a NodePort instead of a Route. Either "true" or a number at deployment time. The value "true" will choose a random port. If a number is given it must be in the range that is configured for the cluster (service-node-port-range, default is 30000-32767)
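For example, a sketch of deploying a stream while creating a Route for one of the apps (the stream name, app name and routing subdomain are assumptions based on the earlier examples):
dataflow:>stream deploy test --properties "app.custom.spring.cloud.deployer.openshift.createRoute=true,app.custom.spring.cloud.deployer.openshift.deployment.route.host=custom-test.192.168.64.15.xip.io"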
‘How-to’ guides
This section provides answers to some common ‘how do I do that…’ type of questions that often arise when using Spring Cloud Data Flow.
15. Deploying Custom Stream App as a Maven Resource
This section walks you through deploying a simple Spring Cloud Stream based application, packaged as a Maven artifact, to OpenShift. The source code for this app is available in the following GitHub repository.
This guide assumes that you have gone through the Getting Started section and are using a local minishift instance of OpenShift. Adjust the steps accordingly if you are using an existing OpenShift cluster.
15.1. Deploy a Nexus Repository
For OpenShift to build the Docker image that will be deployed, it must be able to resolve and download the custom app’s Jar artifact. This means that the custom app must be deployed to an accessible Maven repository.
Assuming the local minishift OpenShift environment discussed in the Getting Started section, we will deploy a Nexus container to which we can deploy our custom application. Deploying the Nexus image is trivial thanks to an OpenShift template available here.
Make sure you have configured the scdf Service Account mentioned in the Getting Started section, as this account is used by the Nexus deployment.
Using the oc tool, upload the Nexus template with:
$ oc create -f https://raw.githubusercontent.com/donovanmuller/spring-cloud-dataflow-server-openshift/v1.2.1.RELEASE/src/etc/openshift/nexus-template.yaml
...
Once uploaded, open the OpenShift Console (you can use minishift console), authenticate and navigate to the scdf project. Then click Add to Project and select the nexus template.
The default configurations should be sane. However, you must provide the OpenShift Route host value. This value will depend on your minishift and OpenShift Router environment but should be similar to nexus-scdf.192.168.64.15.xip.io, where scdf is the project name and 192.168.64.15.xip.io is the default routing subdomain.
Click Create when you’re happy with the configuration. Wait for the Nexus image to be pulled and the deployment to be successful. Once the Pod has been scaled successfully you should be able to access the Nexus UI by clicking on the Route URL (nexus-scdf.192.168.64.15.xip.io in this example).
The default credentials for Nexus are admin/admin123 or deployment/deployment123.
15.2. Configuring the Data Flow Server for OpenShift
We need to configure the Data Flow Server to use this new Nexus instance as a remote Maven repository. If you have an existing deployment from the Getting Started section, you will have to change its configuration.
There are a few ways to do that but the way described here is to remove the existing deployment and use the existing Data Flow Server with ephemeral Datasources and Kafka binder template to deploy the updated configuration.
Remove the current environment using the oc tool (assuming you used the Kafka template):
$ oc delete all --selector=template=scdf-kafka
$ oc delete cm --selector=template=scdf-kafka
$ oc delete secret --selector=template=scdf-kafka
Next, click Add to Project and select the spring-cloud-dataflow-server-openshift-ephemeral-kafka template.
Use the following values for the Maven configuration items:
Configuration Parameter | Value
---|---
Remote Maven repository name | nexus
Remote Maven repository URL | nexus-scdf.192.168.64.15.xip.io/content/groups/public (use your Route URL for Nexus here)
Remote Maven repository username | deployment
Remote Maven repository password | deployment123
Click Create and wait for the deployment to complete successfully.
15.3. Cloning and Deploying the App
The next step is to deploy our custom app into the Nexus instance. First, though, we need to clone the custom app source.
$ git clone https://github.com/donovanmuller/timezone-processor-kafka.git
$ cd timezone-processor-kafka
Next, we deploy the application into our Nexus repository with:
$ ./mvnw -s .settings.xml deploy -Dnexus.url=http://nexus-scdf.192.168.64.15.xip.io/content/repositories/snapshots
...
Uploading: http://nexus-scdf.192.168.64.15.xip.io/content/repositories/snapshots/io/switchbit/timezone-processor-kafka/maven-metadata.xml
Uploaded: http://nexus-scdf.192.168.64.15.xip.io/content/repositories/snapshots/io/switchbit/timezone-processor-kafka/maven-metadata.xml (294 B at 6.0 KB/sec)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 13.156 s
[INFO] Finished at: 2016-11-30T20:42:54+02:00
[INFO] Final Memory: 35M/302M
[INFO] ------------------------------------------------------------------------
Substitute the value for -Dnexus.url with the URL matching your Nexus instance.
15.4. Deploying the Stream
Now that our custom app is ready, let’s register it with the Data Flow Server.
Using the Data Flow Shell, targeted at our OpenShift deployed instance, register the timezone app with:
dataflow:>app register --name timezone --type processor --uri maven://io.switchbit:timezone-processor-kafka:1.0-SNAPSHOT
Successfully registered application 'processor:timezone'
The assumption is that the out-of-the-box apps have been imported previously as part of the Getting Started section. If the apps are not imported, import them now with:
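For example, the Docker variants can be imported in bulk as shown in the Registering Stream applications with Docker resource section earlier:
dataflow:>app import --uri http://bit.ly/stream-applications-kafka-docker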
It does not really matter whether the Docker or Maven out-of-the-box apps are registered.
Now we can define a stream using our timezone processor with:
dataflow:>stream create --name timezoney --definition "time | timezone | log"
Created new stream 'timezoney'
and deploy it with:
dataflow:>stream deploy timezoney --properties "app.timezone.timezone=Africa/Johannesburg,app.timezone.spring.cloud.deployer.openshift.defaultDockerfile=Dockerfile.nexus"
Deployment request has been sent for stream 'timezoney'
We provide two deployment properties to the timezone app. The first is the required timezone to convert the input times. The second informs the Data Flow Server that it should use the provided default Dockerfile (Dockerfile.nexus) that supports Nexus repositories, instead of the default Dockerfile.artifactory.
You should see a build being triggered for the timezone app, which will download the timezone-processor-kafka Maven artifact from the Nexus repository and build the Docker image. Once the build is successful, the app will be deployed alongside the other apps.
View both the timezoney-timezone-0 and timezoney-log-0 apps for the expected log outputs.
Once you’re done, destroy the stream with:
dataflow:>stream destroy timezoney
Destroyed stream 'timezoney'