Welcome to Spring Batch Example
Spring Batch is a Spring Framework module for executing batch jobs. We can use Spring Batch to process a series of jobs.
Spring Batch Example Overview
Before going through the example program, let's get familiar with Spring Batch terminology.
Understanding Jobs and Steps
A job can consist of any number of steps. Each step either contains a Read-Process-Write task or a single operation, which is called a tasklet.
Read-Process-Write basically means reading from a source such as a database or CSV file, processing the data, and writing it to a destination such as a database, CSV, or XML file.
A tasklet performs a single task or operation, such as cleaning up connections or freeing resources after processing is done.
Read-Process-Write tasks and tasklets can be chained together to run a job.
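To make the tasklet idea concrete, here is a minimal sketch of a tasklet implementation (the class name and cleanup logic are hypothetical and not part of this example project): it implements Spring Batch's Tasklet interface, performs its single operation, and returns FINISHED to signal that the step is complete.
package com.journaldev.spring;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
// Hypothetical tasklet performing a single cleanup operation after processing is done
public class ResourceCleanupTasklet implements Tasklet {
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        // the single operation of this step, e.g. removing temporary files or releasing resources
        System.out.println("Cleaning up resources...");
        // FINISHED tells Spring Batch that this tasklet's work is done
        return RepeatStatus.FINISHED;
    }
}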
Implementing Spring Batch: A Working Example
Let's consider a working example of implementing Spring Batch. We will use the following scenario: a CSV file containing data needs to be converted to XML, with the tags named after the column names.
Below are the important tools and libraries used for this Spring Batch example.
- Apache Maven 3.5.0 – for project build and dependencies management.
- Eclipse Oxygen Release 4.7.0 – IDE for creating spring batch maven application.
- Java 1.8
- Spring Core 4.3.12.RELEASE
- Spring OXM 4.3.12.RELEASE
- Spring JDBC 4.3.12.RELEASE
- Spring Batch 3.0.8.RELEASE
- MySQL Java Driver 5.1.25 – use the version matching your MySQL installation. This is required for the Spring Batch metadata tables.
Spring Batch Maven Dependencies
Below is the content of pom.xml file with all the required dependencies for our spring batch example project.
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.journaldev.spring</groupId>
<artifactId>SpringBatchExample</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>SpringBatchDemo</name>
<url>https://maven.apache.org</url>
<properties>
<jdk.version>1.8</jdk.version>
<spring.version>4.3.12.RELEASE</spring.version>
<spring.batch.version>3.0.8.RELEASE</spring.batch.version>
<mysql.driver.version>5.1.25</mysql.driver.version>
<junit.version>4.11</junit.version>
</properties>
<dependencies>
<!-- Spring Core -->
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-core</artifactId>
<version>${spring.version}</version>
</dependency>
<!-- Spring jdbc, for database -->
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-jdbc</artifactId>
<version>${spring.version}</version>
</dependency>
<!-- Spring OXM, for XML/Object mapping -->
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-oxm</artifactId>
<version>${spring.version}</version>
</dependency>
<!-- MySQL database driver -->
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>${mysql.driver.version}</version>
</dependency>
<!-- Spring Batch dependencies -->
<dependency>
<groupId>org.springframework.batch</groupId>
<artifactId>spring-batch-core</artifactId>
<version>${spring.batch.version}</version>
</dependency>
<dependency>
<groupId>org.springframework.batch</groupId>
<artifactId>spring-batch-infrastructure</artifactId>
<version>${spring.batch.version}</version>
</dependency>
<!-- Spring Batch unit test -->
<dependency>
<groupId>org.springframework.batch</groupId>
<artifactId>spring-batch-test</artifactId>
<version>${spring.batch.version}</version>
</dependency>
<!-- Junit -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>${junit.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.thoughtworks.xstream</groupId>
<artifactId>xstream</artifactId>
<version>1.4.10</version>
</dependency>
</dependencies>
<build>
<finalName>spring-batch</finalName>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-eclipse-plugin</artifactId>
<version>2.9</version>
<configuration>
<downloadSources>true</downloadSources>
<downloadJavadocs>false</downloadJavadocs>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.3.2</version>
<configuration>
<source>${jdk.version}</source>
<target>${jdk.version}</target>
</configuration>
</plugin>
</plugins>
</build>
</project>
Spring Batch Processing CSV Input File
Here is the content of our sample CSV file for processing.
1001,Tom,Moody, 29/7/2013
1002,John,Parker, 30/7/2013
1003,Henry,Williams, 31/7/2013
Spring Batch Job Configuration
We have to define the Spring beans and the Spring Batch job in a configuration file. Below is the content of the job-batch-demo.xml file; it's the most important part of our project.
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/batch
http://www.springframework.org/schema/batch/spring-batch-3.0.xsd
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-4.3.xsd">
<import resource="../config/context.xml" />
<import resource="../config/database.xml" />
<bean id="report" class="com.journaldev.spring.model.Report"
scope="prototype" />
<bean id="itemProcessor" class="com.journaldev.spring.CustomItemProcessor" />
<batch:job id="DemoJobXMLWriter">
<batch:step id="step1">
<batch:tasklet>
<batch:chunk reader="csvFileItemReader" writer="xmlItemWriter"
processor="itemProcessor" commit-interval="10">
</batch:chunk>
</batch:tasklet>
</batch:step>
</batch:job>
<bean id="csvFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="classpath:csv/input/report.csv" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean
class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="names" value="id,firstname,lastname,dob" />
</bean>
</property>
<property name="fieldSetMapper">
<bean class="com.journaldev.spring.ReportFieldSetMapper" />
<!-- if no data type conversion, use BeanWrapperFieldSetMapper to map
by name <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="prototypeBeanName" value="report" /> </bean> -->
</property>
</bean>
</property>
</bean>
<bean id="xmlItemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
<property name="resource" value="file:xml/outputs/report.xml" />
<property name="marshaller" ref="reportMarshaller" />
<property name="rootTagName" value="report" />
</bean>
<bean id="reportMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>com.journaldev.spring.model.Report</value>
</list>
</property>
</bean>
</beans>
- We are using FlatFileItemReader to read the CSV file, CustomItemProcessor to process the data, and StaxEventItemWriter to write the XML file.
- batch:job – This tag defines the job that we want to create. The id attribute specifies the ID of the job. We can define multiple jobs in a single XML file.
- batch:step – This tag is used to define the different steps of a Spring Batch job.
- The Spring Batch framework offers two different processing styles: “TaskletStep Oriented” and “Chunk Oriented”. The chunk-oriented style is used in this example; it refers to reading the data one item at a time and creating ‘chunks’ that are written out within a transaction boundary.
- reader: the Spring bean used for reading the data. We have used the csvFileItemReader bean in this example, which is an instance of FlatFileItemReader.
- processor: the bean used for processing the data. We have used CustomItemProcessor in this example.
- writer: the bean used to write data to the XML file.
- commit-interval: This property defines the size of the chunk that is committed once processing is done. Basically, ItemReader reads the data one item at a time and ItemProcessor processes it the same way, but ItemWriter writes the data only when the number of processed items equals the commit-interval (see the sketch after this list).
- Three important interfaces used in this project are ItemReader, ItemProcessor and ItemWriter from the org.springframework.batch.item package.
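To make the chunk behaviour concrete, here is a minimal, hypothetical ItemWriter sketch (our project actually writes XML through StaxEventItemWriter, so this class is for illustration only). Spring Batch hands the writer a whole chunk at once, so with commit-interval="10" the write method receives a list of up to 10 processed items per transaction.
package com.journaldev.spring;
import java.util.List;
import org.springframework.batch.item.ItemWriter;
import com.journaldev.spring.model.Report;
// Hypothetical writer used only to illustrate chunking; not part of the example project
public class ConsoleChunkWriter implements ItemWriter<Report> {
    public void write(List<? extends Report> items) throws Exception {
        // with commit-interval="10" this list contains up to 10 processed Report objects
        System.out.println("Writing chunk of " + items.size() + " items");
        for (Report report : items) {
            System.out.println(report);
        }
    }
}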
Spring Batch Model Class
First of all, we read the CSV file into a Java object and then use JAXB to write it to the XML file. Below is our model class with the required JAXB annotations.
package com.journaldev.spring.model;
import java.util.Date;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
@XmlRootElement(name = "record")
public class Report {
private int id;
private String firstName;
private String lastName;
private Date dob;
@XmlAttribute(name = "id")
public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}
@XmlElement(name = "firstname")
public String getFirstName() {
return firstName;
}
public void setFirstName(String firstName) {
this.firstName = firstName;
}
@XmlElement(name = "lastname")
public String getLastName() {
return lastName;
}
public void setLastName(String lastName) {
this.lastName = lastName;
}
@XmlElement(name = "dob")
public Date getDob() {
return dob;
}
public void setDob(Date dob) {
this.dob = dob;
}
@Override
public String toString() {
return "Report [id=" + id + ", firstname=" + firstName + ", lastName=" + lastName + ", DateOfBirth=" + dob
+ "]";
}
}
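If you want to verify the JAXB mapping outside the batch job, a quick standalone check is possible with the same Jaxb2Marshaller used by the reportMarshaller bean. The class below is a hypothetical helper, not part of the example project; it simply marshals one Report to the console so you can see the record element the annotations produce.
package com.journaldev.spring;
import java.util.Date;
import javax.xml.transform.stream.StreamResult;
import org.springframework.oxm.jaxb.Jaxb2Marshaller;
import com.journaldev.spring.model.Report;
// Hypothetical helper to preview the XML produced by the JAXB annotations on Report
public class ReportMarshallingCheck {
    public static void main(String[] args) throws Exception {
        // same setup as the reportMarshaller bean in job-batch-demo.xml
        Jaxb2Marshaller marshaller = new Jaxb2Marshaller();
        marshaller.setClassesToBeBound(Report.class);
        marshaller.afterPropertiesSet();
        Report report = new Report();
        report.setId(1001);
        report.setFirstName("Tom");
        report.setLastName("Moody");
        report.setDob(new Date());
        // prints a single <record id="1001">...</record> element to the console
        marshaller.marshal(report, new StreamResult(System.out));
    }
}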
Spring Batch FieldSetMapper
A custom FieldSetMapper is needed to convert the date field. If no data type conversion is required, BeanWrapperFieldSetMapper can be used instead to map the values by name automatically. The Java class that implements FieldSetMapper is ReportFieldSetMapper.
package com.journaldev.spring;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;
import com.journaldev.spring.model.Report;
public class ReportFieldSetMapper implements FieldSetMapper<Report> {
private SimpleDateFormat dateFormat = new SimpleDateFormat("dd/MM/yyyy");
public Report mapFieldSet(FieldSet fieldSet) throws BindException {
Report report = new Report();
report.setId(fieldSet.readInt(0));
report.setFirstName(fieldSet.readString(1));
report.setLastName(fieldSet.readString(2));
// fieldSet.readDate(3) could be used directly if the date were in the default yyyy-MM-dd format
String date = fieldSet.readString(3);
try {
report.setDob(dateFormat.parse(date));
} catch (ParseException e) {
e.printStackTrace();
}
return report;
}
}
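As a side note, FieldSet can parse dates itself when given a pattern, so the SimpleDateFormat handling above could be replaced. The following alternative mapper is only a sketch (the class name is hypothetical); note that parse failures then surface as an IllegalArgumentException rather than a caught ParseException.
package com.journaldev.spring;
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;
import com.journaldev.spring.model.Report;
// Hypothetical alternative to ReportFieldSetMapper using FieldSet's pattern-aware date parsing
public class ReportFieldSetMapperAlt implements FieldSetMapper<Report> {
    public Report mapFieldSet(FieldSet fieldSet) throws BindException {
        Report report = new Report();
        report.setId(fieldSet.readInt(0));
        report.setFirstName(fieldSet.readString(1));
        report.setLastName(fieldSet.readString(2));
        // FieldSet parses the date using the supplied pattern
        report.setDob(fieldSet.readDate(3, "dd/MM/yyyy"));
        return report;
    }
}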
Spring Batch Item Processor
As defined in the job configuration, the itemProcessor is invoked before the itemWriter. We have created the CustomItemProcessor class for this purpose.
package com.journaldev.spring;
import org.springframework.batch.item.ItemProcessor;
import com.journaldev.spring.model.Report;
public class CustomItemProcessor implements ItemProcessor<Report, Report> {
public Report process(Report item) throws Exception {
System.out.println("Processing..." + item);
String fname = item.getFirstName();
String lname = item.getLastName();
item.setFirstName(fname.toUpperCase());
item.setLastName(lname.toUpperCase());
return item;
}
}
We can manipulate data in the ItemProcessor implementation; as you can see, I am converting the first name and last name values to upper case.
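An ItemProcessor can also filter records: returning null from process tells Spring Batch to drop the item so it never reaches the writer. Here is a minimal sketch of such a filtering processor (the class name and filter rule are hypothetical, not part of this project).
package com.journaldev.spring;
import org.springframework.batch.item.ItemProcessor;
import com.journaldev.spring.model.Report;
// Hypothetical filtering processor: items returned as null never reach the ItemWriter
public class ReportFilterProcessor implements ItemProcessor<Report, Report> {
    public Report process(Report item) throws Exception {
        // drop records without a last name; everything else passes through unchanged
        if (item.getLastName() == null || item.getLastName().trim().isEmpty()) {
            return null;
        }
        return item;
    }
}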
Spring Configuration Files
In our Spring Batch configuration file, we have imported two additional configuration files – context.xml and database.xml. Below is the content of context.xml.
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-4.3.xsd">
<!-- stored job-meta in memory -->
<!--
<bean id="jobRepository"
class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
<property name="transactionManager" ref="transactionManager" />
</bean>
-->
<!-- stored job-meta in database -->
<bean id="jobRepository"
class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
<property name="dataSource" ref="dataSource" />
<property name="transactionManager" ref="transactionManager" />
<property name="databaseType" value="mysql" />
</bean>
<bean id="transactionManager"
class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />
<bean id="jobLauncher"
class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository" />
</bean>
</beans>
- jobRepository: The JobRepository is responsible for persisting each Spring Batch domain object into its corresponding metadata table.
- transactionManager: This is responsible for committing the transaction once the number of processed items equals the commit-interval size.
- jobLauncher: This is the heart of Spring Batch. The JobLauncher interface exposes the run method, which is used to trigger the job.
Spring Batch Database Configuration
Below is the database configuration (database.xml) used by Spring Batch.
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:jdbc="http://www.springframework.org/schema/jdbc" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-4.3.xsd
http://www.springframework.org/schema/jdbc
http://www.springframework.org/schema/jdbc/spring-jdbc-4.3.xsd">
<!-- connect to database -->
<bean id="dataSource"
class="org.springframework.jdbc.datasource.DriverManagerDataSource">
<property name="driverClassName" value="com.mysql.jdbc.Driver" />
<property name="url" value="jdbc:mysql://localhost:3306/Test" />
<property name="username" value="test" />
<property name="password" value="test123" />
</bean>
<bean id="transactionManager"
class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />
<!-- create job-meta tables automatically -->
<!-- <jdbc:initialize-database data-source="dataSource"> <jdbc:script location="org/springframework/batch/core/schema-drop-mysql.sql"
/> <jdbc:script location="org/springframework/batch/core/schema-mysql.sql"
/> </jdbc:initialize-database> -->
</beans>
Spring Batch uses a set of metadata tables to store batch job information. We can have them created automatically from the Spring Batch configuration, but it's advisable to do it manually by executing the SQL scripts, as shown in the commented-out code above. From a security point of view, it's better not to give DDL execution access to the Spring Batch database user.
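If you prefer to create the metadata tables from a one-off setup program instead of pasting SQL into a client, the schema scripts shipped inside the spring-batch-core jar can be executed with Spring's ResourceDatabasePopulator. The class below is a hypothetical sketch that assumes the same connection settings as database.xml and should be run with an account that does have DDL rights.
package com.journaldev.spring;
import org.springframework.core.io.ClassPathResource;
import org.springframework.jdbc.datasource.DriverManagerDataSource;
import org.springframework.jdbc.datasource.init.ResourceDatabasePopulator;
// Hypothetical one-off setup class; run it once with a DDL-privileged account
public class CreateBatchMetadataTables {
    public static void main(String[] args) {
        // connection details assumed to match database.xml
        DriverManagerDataSource dataSource = new DriverManagerDataSource("jdbc:mysql://localhost:3306/Test", "test", "test123");
        dataSource.setDriverClassName("com.mysql.jdbc.Driver");
        ResourceDatabasePopulator populator = new ResourceDatabasePopulator();
        // MySQL schema script bundled inside the spring-batch-core jar
        populator.addScript(new ClassPathResource("org/springframework/batch/core/schema-mysql.sql"));
        populator.execute(dataSource);
        System.out.println("Spring Batch metadata tables created");
    }
}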
Spring Batch Tables
Spring Batch tables very closely match the Domain objects that represent them in Java. For example – JobInstance, JobExecution, JobParameters and StepExecution map to BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION, BATCH_JOB_EXECUTION_PARAMS and BATCH_STEP_EXECUTION respectively. ExecutionContext maps to both BATCH_JOB_EXECUTION_CONTEXT and BATCH_STEP_EXECUTION_CONTEXT. The JobRepository is responsible for saving and storing each java object into its correct table.
Below are the details of each meta-data table.
- Batch_job_instance: The BATCH_JOB_INSTANCE table holds all information relevant to a JobInstance.
- Batch_job_execution_params: The BATCH_JOB_EXECUTION_PARAMS table holds all information relevant to the JobParameters object.
- Batch_job_execution: The BATCH_JOB_EXECUTION table holds data relevant to the JobExecution object. A new row gets added every time a Job is run.
- Batch_step_execution: The BATCH_STEP_EXECUTION table holds all information relevant to the StepExecution object.
- Batch_job_execution_context: The BATCH_JOB_EXECUTION_CONTEXT table holds data relevant to a Job's ExecutionContext. There is exactly one job ExecutionContext for every JobExecution, and it contains all of the job-level data needed for that particular job execution. This data typically represents the state that must be retrieved after a failure so that a JobInstance can restart from where it failed.
- Batch_step_execution_context: The BATCH_STEP_EXECUTION_CONTEXT table holds data relevant to a Step's ExecutionContext. There is exactly one ExecutionContext for every StepExecution, and it contains all of the data that needs to be persisted for a particular step execution. This data typically represents the state that must be retrieved after a failure so that a JobInstance can restart from where it failed.
- Batch_job_execution_seq: This table holds the sequence used to generate job execution IDs.
- Batch_step_execution_seq: This table holds the sequence used to generate step execution IDs.
- Batch_job_seq: This table holds the sequence used to generate job instance IDs; with multiple job runs we will get new IDs from it.
Spring Batch Test Program
Our Spring Batch example project is ready; the final step is to write a test class to execute it as a Java program.
package com.journaldev.spring;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.support.ClassPathXmlApplicationContext;
public class App {
public static void main(String[] args) {
String[] springConfig = { "spring/batch/jobs/job-batch-demo.xml" };
ClassPathXmlApplicationContext context = new ClassPathXmlApplicationContext(springConfig);
JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
Job job = (Job) context.getBean("DemoJobXMLWriter");
JobParameters jobParameters = new JobParametersBuilder().addLong("time", System.currentTimeMillis())
.toJobParameters();
try {
JobExecution execution = jobLauncher.run(job, jobParameters);
System.out.println("Exit Status : " + execution.getStatus());
} catch (Exception e) {
e.printStackTrace();
}
System.out.println("Done");
context.close();
}
}
Just run the above program and you will get an output XML like the one below. Note that the timestamp job parameter makes each run unique, so every execution creates a new JobInstance.
<?xml version="1.0" encoding="UTF-8"?><report><record id="1001"><dob>2013-07-29T00:00:00+05:30</dob><firstname>TOM</firstname><lastname>MOODY</lastname></record><record id="1002"><dob>2013-07-30T00:00:00+05:30</dob><firstname>JOHN</firstname><lastname>PARKER</lastname></record><record id="1003"><dob>2013-07-31T00:00:00+05:30</dob><firstname>HENRY</firstname><lastname>WILLIAMS</lastname></record></report>
That's all for the Spring Batch example guide.