How can I flush data from a DataFileWriter based on size or number of records? Does similar functionality exist today? If not, does it make sense to write it?

This comes from a batch job that closes its files before the JVM exits, which can leave a lot of small files on the destination file system. If that file system is something like HDFS, this leads to the small-files problem. I could of course merge these files at intervals, but I want to see if there is a better solution.

My plan is to change the batch job into something like a listener that watches an input directory and keeps the DataFileWriter running, flushing at a preconfigured interval based on time, size, or number of records.
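To make the plan concrete, here is a minimal sketch of the threshold check, assuming the writer can be treated as a `java.io.Flushable`. The class `ThresholdFlusher`, its method names, and the idea of passing in the record size and current time are all hypothetical; this is not an existing writer API.

```java
import java.io.Flushable;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper: flushes a long-running writer once a threshold is reached. */
final class ThresholdFlusher {
    private final Flushable target;   // assumed: the data file writer, seen as a Flushable
    private final long maxRecords;    // flush after this many appended records
    private final long maxBytes;      // flush after this many appended bytes
    private final long maxAgeMillis;  // flush when this much time has passed since the last flush

    private long records;
    private long bytes;
    private long lastFlushMillis;
    private int flushCount;

    ThresholdFlusher(Flushable target, long maxRecords, long maxBytes,
                     long maxAgeMillis, long nowMillis) {
        this.target = target;
        this.maxRecords = maxRecords;
        this.maxBytes = maxBytes;
        this.maxAgeMillis = maxAgeMillis;
        this.lastFlushMillis = nowMillis;
    }

    /**
     * Call after each append. The caller supplies the record size and the
     * current time. Returns true if this call triggered a flush.
     */
    boolean afterAppend(long recordBytes, long nowMillis) {
        records++;
        bytes += recordBytes;
        boolean due = records >= maxRecords
                || bytes >= maxBytes
                || nowMillis - lastFlushMillis >= maxAgeMillis;
        if (due) {
            try {
                target.flush();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            // Start a fresh counting window after every flush.
            records = 0;
            bytes = 0;
            lastFlushMillis = nowMillis;
            flushCount++;
        }
        return due;
    }

    /** Number of flushes triggered so far. */
    int flushCount() {
        return flushCount;
    }
}
```

The listener would call `afterAppend(...)` after every record it writes, for example with `System.currentTimeMillis()` as the time argument. The time threshold is only evaluated when a record arrives, so an idle writer would need a separate timer. The same check could also decide when to close the current file and start a new one, if the goal is to control file size and not only flush frequency.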