Tuesday, February 08, 2005

De-Batching and Re-Batching the Files (Scatter-Gather)

In my current project we had a requirement to split incoming XML files into individual transactions. Each transaction then goes through rule validation in an orchestration. Depending on the validation rules set up in external assemblies, a transaction is either rejected or accepted. The business need for splitting the file comes from the fact that each transaction has to be accepted or rejected as a whole.

The splitting part was easy: we used envelope and document schemas for that. The difficult part was batching the split transactions back together per incoming file they originally belonged to. Below are possible solutions for "re-batching", along with the limitations of each.
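To make the splitting step concrete, here is a minimal sketch in Python rather than BizTalk itself. It only imitates the effect of envelope de-batching: every direct child of the root becomes its own document. The function name debatch is hypothetical, and the element names a and b are borrowed from the sample structure used in this post.

```python
import xml.etree.ElementTree as ET


def debatch(envelope_xml: str) -> list[str]:
    """Return each direct child of the envelope root as its own XML string.

    Illustration only: this is not how BizTalk's disassembler is
    implemented, it just shows the input/output shape of de-batching.
    """
    root = ET.fromstring(envelope_xml)
    # Iterating an Element yields its direct children in document order.
    return [ET.tostring(child, encoding="unicode") for child in root]


# Hypothetical sample: root <a> is the envelope, each <b> is one transaction.
sample = "<a><b>first</b><b>second</b></a>"
parts = debatch(sample)
```

Note that once the children are separated, nothing in them says which file they came from, which is exactly why re-batching is the hard part.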

1) Use a loop within the orchestration and add each transaction on the basis of its node. A correlation set has to be set up. This is important because the correlation ID lets us add only the transactions that belong to a particular file. This approach is useful if we have something like an "Orders" file that consists of various "Items". The Order ID is the correlation ID in this case, and it ties together all the "Items" for a particular Order ID.

This approach fails in my situation as I have no correlation ID. My sample file consists of a root element immediately followed by child nodes. These child nodes contain all the information. The sample file looks something like:

(a)(b)contains all the data(/b)(/a)

The ideal structure for the looping approach would be:

(a)correlation ID here(b)contains all the data(/b)(/a)

2) Loop and add transactions on the basis of a count.

The problem with this approach shows up when, say, 10 input files are dropped at the same time, each containing 10 transactions. The count does not ensure that the transactions we are adding actually come from a single file (as we need in our case), because it is not guaranteed that the transactions in each file will be processed in sequence.

3) The third approach works for me. I set up a unique ID at the transaction level; this ID is the same for all transactions in one particular file. I can then use this ID as the "File Name" in the Send Port and use "Append". BizTalk will append all the transactions with the same unique file name to the same file. This is what I need!
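The idea behind the third approach can be sketched outside BizTalk as follows. This is only an illustration in Python under stated assumptions: each transaction is modelled as a (source file name, payload) pair, the function name rebatch and the file names are hypothetical, and an in-memory dictionary stands in for the Send Port appending to a file on disk.

```python
from collections import defaultdict


def rebatch(arrivals):
    """Group payloads by the file-level ID they carry, in arrival order.

    `arrivals` is an iterable of (source_file, payload) pairs. Appending to
    the list keyed by source_file plays the role of the "Append" file mode.
    """
    batches = defaultdict(list)
    for source_file, payload in arrivals:
        batches[source_file].append(payload)
    return dict(batches)


# Hypothetical interleaved arrivals from two files dropped at the same time.
arrivals = [
    ("orders1.xml", "<b>1-a</b>"),
    ("orders2.xml", "<b>2-a</b>"),
    ("orders1.xml", "<b>1-b</b>"),
    ("orders2.xml", "<b>2-b</b>"),
]
batches = rebatch(arrivals)
```

Because the key travels with each transaction, interleaving across files does not matter here, whereas a plain counter (approach 2) would have mixed the first two arrivals into one batch.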


Varun said...

Hi Shashikant,

Can you briefly explain how you created the unique ID at each transaction level?


Shashikant Raina said...

You can use the file name as the unique ID. This will ensure that only transactions belonging to one particular file are grouped together.