Home
Easy Batch is a framework that aims to simplify batch processing with Java. It was specifically designed for simple ETL jobs. Writing batch applications requires a lot of boilerplate code: reading, writing, filtering, parsing and validating data, logging and reporting, to name a few. The idea is to free you from these tedious tasks and let you focus on your batch application's logic.
Task | You | Easy Batch |
---|---|---|
Implement business logic | x | |
Handle resources I/O | | x |
Data filtering / validation | | x |
Type conversion | | x |
Object marshalling / unmarshalling | | x |
Transaction management | | x |
Logging / Reporting | | x |
Job Monitoring | | x |
Easy Batch jobs are simple processing pipelines. Records are read in sequence from a data source, processed in a pipeline, and written in batches to a data sink.
Easy Batch provides the `Record` and `Batch` APIs to abstract data format and process records in a consistent way regardless of the data source/sink types.
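To make the read, process, write-in-batches flow concrete, here is a minimal, dependency-free sketch of such a pipeline. The names `PipelineSketch`, `Rec` and `run` are hypothetical stand-ins invented for illustration; they are not Easy Batch's actual `Record` and `Batch` types, and the framework's real implementation may differ.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical illustration only: NOT the Easy Batch API.
class PipelineSketch {

    /** A record wraps a payload together with its position in the source. */
    static final class Rec<P> {
        final long number;
        final P payload;

        Rec(long number, P payload) {
            this.number = number;
            this.payload = payload;
        }
    }

    /**
     * Reads payloads one at a time, applies the processor to each record,
     * and hands records to the writer in batches of at most batchSize.
     * A processor that returns null filters the record out.
     * Returns the number of records written.
     */
    static <I, O> int run(Iterator<I> source,
                          Function<Rec<I>, Rec<O>> processor,
                          Consumer<List<Rec<O>>> writer,
                          int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<Rec<O>> batch = new ArrayList<>(batchSize);
        long number = 0;
        int written = 0;
        while (source.hasNext()) {
            Rec<O> processed = processor.apply(new Rec<>(++number, source.next()));
            if (processed == null) {
                continue; // record was filtered out
            }
            batch.add(processed);
            if (batch.size() == batchSize) {
                writer.accept(new ArrayList<>(batch)); // the writer gets its own copy
                written += batch.size();
                batch.clear();
            }
        }
        if (!batch.isEmpty()) { // flush the last, possibly partial, batch
            writer.accept(new ArrayList<>(batch));
            written += batch.size();
        }
        return written;
    }
}
```

Because the processor and writer only ever see `Rec` instances, the same loop works whether the payloads come from a file, a database or a queue, which is the kind of source/sink independence the paragraph above describes.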
Let's suppose you have some tweets represented by a `Tweet` class and you want to transform them from CSV to XML. Here is how to do it with Easy Batch:
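```java
// Input CSV file and output XML file
Path inputFile = Paths.get("tweets.csv");
Path outputFile = Paths.get("tweets.xml");

// Declare the pipeline: read lines, filter out the header, map each line to a Tweet,
// marshal it to XML and write the results in batches of 10
Job job = new JobBuilder<String, String>()
        .reader(new FlatFileRecordReader(inputFile))
        .filter(new HeaderRecordFilter<>())
        .mapper(new DelimitedRecordMapper<>(Tweet.class, "id", "user", "message"))
        .marshaller(new XmlRecordMarshaller<>(Tweet.class))
        .writer(new FileRecordWriter(outputFile))
        .batchSize(10)
        .build();

// Execute the job, collect its report, then shut the executor down
JobExecutor jobExecutor = new JobExecutor();
JobReport report = jobExecutor.execute(job);
jobExecutor.shutdown();
```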
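The `Tweet` class itself is not shown on this page. A plausible shape, sketched under assumptions, is a plain bean whose property names match the field names passed to the mapper (`id`, `user`, `message`). The field types and the `fromDelimited` helper are hypothetical; the helper is only a hand-rolled illustration of what mapping a delimited line to an object means, not how the framework does it. If the XML marshaller relies on JAXB, the class would likely also need an `@XmlRootElement` annotation, which is omitted here to keep the sketch dependency-free.

```java
// Hypothetical Tweet bean: property names come from the mapper call, types are assumed.
class Tweet {

    private int id;
    private String user;
    private String message;

    public Tweet() {
        // no-arg constructor so that a mapper can instantiate the bean reflectively
    }

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }

    public String getUser() { return user; }
    public void setUser(String user) { this.user = user; }

    public String getMessage() { return message; }
    public void setMessage(String message) { this.message = message; }

    /**
     * Hand-rolled illustration of delimited mapping (hypothetical helper):
     * a naive comma split into exactly three fields, with no quoting support.
     */
    static Tweet fromDelimited(String line) {
        String[] fields = line.split(",", 3);
        if (fields.length != 3) {
            throw new IllegalArgumentException("expected 3 fields: " + line);
        }
        Tweet tweet = new Tweet();
        tweet.setId(Integer.parseInt(fields[0].trim()));
        tweet.setUser(fields[1].trim());
        tweet.setMessage(fields[2].trim());
        return tweet;
    }
}
```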
Easy Batch makes your code declarative, intuitive, easy to read, understand, test and maintain.
Easy Batch was created by Mahmoud Ben Hassine with the help of some awesome contributors.