I have written a custom split method to process an incoming message body and split the payload into fixed-size byte arrays. In an attempt to process large incoming files, I enabled the streaming option of the Splitter EIP, which did not work with my custom splitter.
Yes, I am returning a List<Message>. I was able to get this to work: the problem was that I was relying on the Splitter EIP to set the "CamelSplitSize" property, which is not set in streaming mode. Once I recalculated CamelSplitSize in my split method, everything seemed to work. The "streaming mode" appears to be transparent.
According to "Camel In Action" (very nice book), streaming mode tells Camel not to load the entire payload into memory, but instead to iterate over the payload in a streaming fashion. In the case of a GenericFileMessage whose body is being split, does Camel read the entire file contents into memory, or only a portion at a time? How does it determine how much to read from the stream?
In this example, when is the file payload actually read: on the <from>, or is it delegated to <split>? I need to confirm that large incoming files are not read entirely into memory, but are streamed and read a part at a time.
Section 8.1.3, "Splitting big messages", briefly discusses using streaming with the splitter in the context of the tokenizer, which only processes strings. It states that the tokenizer uses java.util.Scanner to read chunks of data into memory. So the Scanner reads a "stream" of data, based on a token. The Scanner object is then passed back to Camel as an Expression, which Camel then iterates over and delivers as parts. Is this correct?
I need a similar implementation, but for binary streams. Instead of getting the message body, i.e. splitBody(@Body byte body), I need to get the Exchange, i.e. splitBody(Exchange exchange). From the Exchange I could get the file and perform custom streaming to then create my Iterator. Does this sound feasible? How do I get the Exchange passed to the custom split method?
Do you know of any implementations such as this? Or would you recommend another approach to reading large binary files? Does the Stream component (e.g. <from uri=stream:file:incoming>) provide any value here, so that I can stream the file and then read it as streaming input?
If you use a custom Iterator to read the file piece by piece then that works well with the streaming mode of the Camel Splitter EIP.
This is similar to how the Scanner works. If you just want to split your file by a special token etc., then the Scanner is a great choice, and the Camel DSL has syntactic sugar for this using the tokenizer.
If you need something more advanced/custom, then just use a custom Iterator. For example you can read in X number of bytes at each iteration step.
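As a rough illustration of the "read X bytes at each iteration step" idea, here is a minimal sketch using only the JDK. The class name ChunkIterator and the chunkSize parameter are made up for this example and are not part of Camel; the last chunk may be shorter than chunkSize.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical helper (not a Camel class): walks an InputStream in fixed-size chunks. */
class ChunkIterator implements Iterator<byte[]>, AutoCloseable {
    private final InputStream in;
    private final int chunkSize;
    private byte[] lookAhead;   // next chunk to hand out, or null if none has been read yet
    private boolean exhausted;  // true once the end of the stream has been reached

    ChunkIterator(InputStream in, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.in = in;
        this.chunkSize = chunkSize;
    }

    @Override
    public boolean hasNext() {
        if (lookAhead == null && !exhausted) {
            lookAhead = readChunk();
            if (lookAhead == null) {
                exhausted = true;
                close(); // release the stream as soon as it is drained
            }
        }
        return lookAhead != null;
    }

    @Override
    public byte[] next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        byte[] chunk = lookAhead;
        lookAhead = null;
        return chunk;
    }

    /** Reads up to chunkSize bytes; returns null when no bytes are left. */
    private byte[] readChunk() {
        try {
            byte[] buf = new byte[chunkSize];
            int filled = 0;
            // read() may return fewer bytes than requested, so loop until full or EOF
            while (filled < chunkSize) {
                int n = in.read(buf, filled, chunkSize - filled);
                if (n < 0) {
                    break;
                }
                filled += n;
            }
            if (filled == 0) {
                return null;
            }
            return filled == chunkSize ? buf : Arrays.copyOf(buf, filled);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            in.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only one chunk is held in memory at a time. A split method could return an iterator like this, wrapped around a FileInputStream for the incoming file, for use with the streaming mode of the splitter.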
Thanks for your quick responses. I am still unclear on how the split method can get the file object from the exchange, so that it can then be converted into an Iterator. How do I get the File reference from within the split method?
Another informative chapter in the book! I did implement an Iterator that accesses a file via a FileInputStream and reads the file a chunk at a time. I used bean parameter binding to access the exchange and its members. Thanks again for your assistance.