Reason to avoid appendToDocumentList

Whenever I see appendToDocumentList used in a flow service to build an output document list, I wonder whether the same result could be achieved some other way, e.g. the Output Array option of a LOOP step, a Java service loop, or mapping to an incrementing index variable inside the loop. The question also arises: what are the benefits of those other methods over appendToDocumentList? There is a familiar set of reactions whenever appendToDocumentList comes up, e.g.

  • Performance takes a big hit with appendToDocumentList.
  • Stay away from appendToDocumentList and use the PSUtilities addToList service instead.

To find out why one should avoid a service that SAG itself provides, I had a little brainstorming session with the geniuses out there.

First, let me list the alternatives to appendToDocumentList.

  • Explicit Loop: if the output document list is the same size as the input document list, use the Output Array option of the LOOP step.
  • Conditional mapping: sometimes the mappings are conditional and the output document list can differ in size from the input one. In that case, try PSUtilities.list:addToList. Create a copy of this service in your own package and use that copy.
  • Implicit Loop: for simple lists of the same size, you may want to link them directly in a MAP step and let the IS handle the looping implicitly.
  • Java Loop: write a Java service for the looping logic (a minimal sketch follows this list).
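
As referenced in the Java Loop item above, here is a minimal sketch of what such a service body might look like. It assumes the standard IData pipeline API; the class, service, and pipeline variable names (ListUtils, copyDocList, inDocs, outDocs) are purely illustrative and not part of any product package.

    import com.wm.app.b2b.server.ServiceException;
    import com.wm.data.IData;
    import com.wm.data.IDataCursor;
    import com.wm.data.IDataUtil;

    public final class ListUtils {

        // Copies an input document list ("inDocs") to a pre-sized output
        // document list ("outDocs") in a single pass.
        public static final void copyDocList(IData pipeline) throws ServiceException {
            IDataCursor pc = pipeline.getCursor();
            try {
                IData[] inDocs = IDataUtil.getIDataArray(pc, "inDocs");
                if (inDocs == null) {
                    return; // nothing to map
                }
                // Allocate the output list once, with the same size as the input.
                IData[] outDocs = new IData[inDocs.length];
                for (int i = 0; i < inDocs.length; i++) {
                    // Apply your per-item mapping here; this sketch simply
                    // carries the same document reference across.
                    outDocs[i] = inDocs[i];
                }
                IDataUtil.put(pc, "outDocs", outDocs);
            } finally {
                pc.destroy();
            }
        }
    }

In Designer, only the method body would normally go into the Java service source; the surrounding class is shown here just to keep the sketch self-contained.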

The question arises: why are we trying to avoid appendToDocumentList at all? It is an SAG-provided utility, so there should be a solid reason behind this. I got a very informative answer from Percio (castropb) on wmusers.

Reason to avoid appendToDocumentList
Percio said: For large lists, appendToDocumentList performs poorly because of the way it is implemented. Every time you call appendToDocumentList, it essentially creates a brand new list whose size is the original size plus one (assuming you are appending one item). It then copies all the items from the original list into the new list and puts the appended item at the end. This frequent memory reallocation and copying of data is what gives you the performance hit.

When you use an output array, the output array is allocated once with the same size as the input array, so you do not run into this problem.
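
To make that cost concrete, here is a small plain-Java sketch (not a webMethods service; the class and variable names are made up for illustration) contrasting the append-by-copy pattern with a pre-sized output array. Building a list of n items by copying on every append costs roughly 0 + 1 + ... + (n - 1) = n(n - 1)/2 element copies, about five billion for n = 100,000, whereas a pre-sized array needs only n copies.

    import java.util.Arrays;

    public final class AppendCostDemo {

        // Pattern analogous to appending one item at a time:
        // a brand new array is allocated and fully copied on every call.
        static Object[] appendByCopy(Object[] list, Object item) {
            Object[] bigger = Arrays.copyOf(list, list.length + 1); // reallocate + copy
            bigger[bigger.length - 1] = item;
            return bigger;
        }

        public static void main(String[] args) {
            int n = 50_000;
            Object[] source = new Object[n];
            Arrays.fill(source, "doc");

            // Slow pattern: repeated reallocation and copying.
            long start = System.nanoTime();
            Object[] appended = new Object[0];
            for (Object doc : source) {
                appended = appendByCopy(appended, doc);
            }
            long slowNanos = System.nanoTime() - start;

            // Fast pattern: allocate the output once with the input's size.
            start = System.nanoTime();
            Object[] preSized = new Object[source.length];
            for (int i = 0; i < source.length; i++) {
                preSized[i] = source[i];
            }
            long fastNanos = System.nanoTime() - start;

            System.out.printf("append-by-copy: %d ms, pre-sized: %d ms%n",
                    slowNanos / 1_000_000, fastNanos / 1_000_000);
        }
    }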

I ran a test for a customer a few years ago to compare different methods of mapping a source list to a target list involving large lists (up to 100,000 items), and at the time, they ranked as follows from fastest to slowest:

1. Java Loop: looping done through a Java service.

2. Implicit Loop: for simple lists of the same size, you may want to link them directly in a MAP step and let the IS handle looping implicitly. Works when the variable names inside a doc list are identical.

3. Explicit Loop: using a LOOP step and its Output Array.

4. Append to Array List: similar to Append to Document List, except that an array list is used behind the scenes, so the data is not reallocated and fully copied on every append. It is important to set an appropriate initial size to maximize performance (a brief sketch follows the note below).

5. Append to Document List: using the WmPublic service.

6. Dynamic Index: using a LOOP step without specifying Output Array and mapping to a specific item in the output list using a variable index.

Note: the time taken to copy the lists grew linearly with list size for methods 1 to 4, and far faster than linearly for methods 5 and 6.
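
On the initial-size point in method 4: the idea is to accumulate items in a list created with an expected capacity and convert it to an array once at the end, so no full copy happens on every append. The sketch below is plain Java for illustration only; it does not show PSUtilities internals, and the names are made up.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public final class ArrayListAppendDemo {

        public static void main(String[] args) {
            String[] source = new String[10_000];
            Arrays.fill(source, "doc");

            // Setting the initial capacity to the expected size avoids most
            // internal growth of the backing array.
            List<String> target = new ArrayList<>(source.length);
            for (String doc : source) {
                if (doc != null) {       // stand-in for a conditional mapping
                    target.add(doc);     // amortized O(1); no full copy per append
                }
            }

            // Convert to a fixed-size array only once, at the end.
            String[] result = target.toArray(new String[0]);
            System.out.println("copied " + result.length + " items");
        }
    }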

A new question arises after reading that appendToDocumentList performs poorly for large lists: what counts as a large list? It depends on the physical resources available to the IS, the complexity of your mappings, and so on.

From the analysis given by Percio, it is quite clear why one would want to avoid appendToDocumentList for large lists.

Another Perspective
I also got an interesting answer from reamon, who mentioned:

“Key for all of the options listed in Percio’s excellent summary: test which approach works for your situation. Do *not* assume one approach will be faster than another.

In recent tests of my own with appendToDocumentList (though not with the number of elements Percio used) I found that performance had improved dramatically from the same tests I had done a few years ago. The JVM version and settings being used undoubtedly will have a big impact on this performance.

As Percio noted, there is no one answer. So the key is to try out different approaches until you get the performance your integration needs.”

Thank you, Percio and Rob Eamon (reamon), for your valuable inputs.

Conclusion
I personally avoid appendToDocumentList because of its performance characteristics and to keep the flow code shorter.

As already said in this post, before deciding whether or not to use the append service, please make sure you have tested all your scenarios, and then use the approach that best suits your integration.

I hope this post helps. Comments and discussion are most welcome.

I would appreciate feedback in the comment section or via the reaction checkbox. Thanks!


Comments

9 responses to “Reason to avoid appendToDocumentList”

  1. Hi, thanks for sharing. May I know which version of webMethods you are referring to? Could you explain how to create an implicit loop?

  2. @robizenus It’s very general, not version specific. For implicit looping, you have to do a straight map at the document list level. It works only if the variable names in both the input and output document lists are the same. E.g. if we have an input doc list A and an output doc list B, just map A -> B in the MAP step; the IS will take care of creating the same number of indices for B.

  3. Hi Mangat, thanks for sharing. For “Append to Array List”, how do I do that? I cannot find an Array List type; there are only String List and Document List.

  4. Hello,
    Thanks for sharing this post.
    I am more concerned about the memory consumed than the time taken.
    Is it possible to calculate the memory consumed by the process while performing the testing?

  5. MANGAT RAI

    You have to use some kind of profiler to record that.
    This is quite an old post. If you can share your requirement, then I can probably provide some suggestions.

  6. I have developed code to check the health of multiple Integration Servers from a single IS via remote invoke.
    All the servers to be monitored are defined as ‘Remote Servers’ on the monitoring IS.

    Monitoring covers the following on the monitoring IS:

    Monitors all Integration Servers status
    Monitors all JDBC, SAP & MQ adapter connection status
    Monitors all JDBC, SAP & MQ listener status
    Monitors the queue length of all clients
    Monitors the connectivity to Broker from IS
    Monitors all broker servers
    Monitors all file polling ports
    Monitors both Memory & Thread usage on all Integration Servers
    Schedule tasks on all Integration Servers
    Monitors Broker/Local Triggers on all Integration Servers
    Monitors the trigger throttle settings on all Integration Servers

    All these monitoring activities run fine individually.
    The problem arises when they are scheduled to run at specific intervals.

    The scheduled tasks run fine for a couple of hours and then end up in the error below:
    java.lang.OutOfMemoryError: unable to create new native thread

    The memory initially assigned to the monitoring IS was 256 MB.
    When I encountered this error, I increased the memory to 1024 MB by changing server.bat as below, but the error remained the same. After some time the IS stops responding and shuts down.

    set JAVA_MIN_MEM=1024M
    set JAVA_MAX_MEM=1024M
    set JAVA_MAX_PERM_SIZE=128M

    Thread usage was well within the limit during this process.

    Most of the above monitoring services use appendToDocumentList.
    So I am a little concerned about the memory consumed by each of these monitoring services.

    1. MANGAT RAI

      To monitor memory, you can use JVisualVM or jmap:
      http://docs.oracle.com/javase/7/docs/technotes/tools/share/jmap.html

  7. Additional details:

    IS version:
    7.1.2.0

    Extended Settings:
    watt.server.cronMaxThreads=5
    watt.server.cronMinThreads=2
    watt.server.scheduler.maxWait=60
    watt.server.scheduler.threadThrottle=75
    watt.server.threadPool=75
    watt.server.threadPoolMin=10

    System:
    Microsoft Windows Server 2003 R2

  8. MANGAT RAI

    I think we also communicated through Tech Community. I suggested that you increase the thread pool from 75 to something higher, say 500, under Resources.

    If the error is about not being able to create a new thread, then I don’t think increasing memory will help.

    Another thing you can do is keep taking thread dumps at regular intervals while the service is running; that will give you an idea of which thread is taking a long time.

    I really have to take a look at your code to suggest more. You can contact me at webMethods@quest4apps.com
