Lessons from API integration

A general transition has been happening with the nature of work that I do in the integration space … we have been doing less of SOAP/XML/RPC webservices and more RESTful APIs for “digital enablement” of enterprise services .  This brought a paradigm shift and valuable lessons were learnt (rightly or wrongly) … and of course the process of learning and comparing never stops!

Below are my observations …

It is not about SOAP vs REST … it is about products that we integrate, the processes we use and the stories we tell as we use one architectural style vs the other

  1. Goals: Both architectural styles work to achieve the same goal – integration. The key differences lie in where they are used. SOAP/XML has seen some b2b but adoption by web/mobile clients is low
  2. Scale: SOAP/XML is good but will always remain enterprise scale … REST/JSON is webscale. What do I mean by that? REST over HTTP with JSON feels easier to communicate, understand and implement.
  3. Protocol: The HTTP protocol is used by REST better than SOAP, so the former wins in making it easy to adopt your service
  4. Products: The big monolith integration product vendors are now selling “API Gateways” (similar to a “SOA product suite”) … The gateway should be a lightweight policy layer, IMHO, and traditional vendors like to sell “app server” licenses which will blow up an API gateway product (buyer beware!)
  5. Contract: RAML/YAML is a lot easier to read than a WSDL ….which is why contract first works better when doing REST/JSON
  6. Process: The biggest paradigm change we have experienced is doing Contract Driven Development  ny writing API definition in RAML / YAML … Compare this to generating wsdl or RAML from services. Contract driven , as I am learning, is much more collaborative!
  7. Schemas: XML schemas were great until they started describing restrictions … REST with JSON schema is good but may be repeating similar problems at Web scale
  8. Security: I have used OAuth for APIs and watched them enable token authentication and authorization easily with clients outside the enterprise – we are replacing traditional web session ids with oauth tokens to enable faster integration through 3rs party clients … all this through simple configuration in an API gateway compared to some of the struggle with SAML setup within an organisation with the heavy monolithic SOA and security products!

It would be easy to then conclude that we ought to rip out all our existing SOAP/XML integrations and replace them with REST APIs no? Well not quite…as always “horses for the courses”.

Enterprise grade integration may require features currently missing in REST/JSON (WS-* , RPC) and legacy systems may not be equipped to do non XML integration

My goal was to show you my experience in integrating systems using API with Contract driven development & gateways with policies vs traditional web service development using SOA products …hope to hear about your experience, how do you like APIs?

Java Application Memory Usage and Analysis

The Java Virtual Machine (JVM) runs standalone applications and many key enterprise applications like monolithic application servers, API Gateways and microservices. Understanding an tuning an application begins with understanding the technology running it. Here is a quick overview of the JVM Memory management

JVM Memory:

  • Stack and Heap form the memory used by a Java Application
  • The execution thread uses the Stack – it starts with the ‘main’ function and the functions it calls along with primitives they create and the references to objects in these functions
  • All the objects live in the Heap – the heap is bigger
  • Stack memory management (cleaning old unused stuff) is done using a Last In First Out (LIFO) strategy
  • Heap memory management is more complex since this is where objects are created and cleaning them up requires care
  • You use command line (CLI) arguments to control sizes and algorithms for managing the Java Memory

Java Memory Management:

  • Memory management is the process of cleaning up unused items in Stack and Heap
  • Not cleaning up will eventually halt the application as the fixed memory is used up and an out-of-memory or a stack-overflow exception occurs
  • The process of cleaning JVM memory is called “Garbage Collection” a.k.a “GC”
  • The Stack is managed using a simple LIFO strategy
  • The Heap is managed using one or more algorithms which can be specified by Command Line arguments
  • The Heap Collection Algorithms include Serial, New Par, Parallel Scavenge, CMS, Serial Old (MSC), Parallel Old and G1GC

I like to use the “Fridge” analogy – Think about how leftovers go into the fridge and how last weeks leftovers, fresh veggies and that weird stuff from long long ago gets cleaned out … what is your strategy? JVM Garbage Collection Algorithms follow a similar concept while working with a few more constraints (how do you clean the fridge while your roommate/partner/husband/wife is/are taking food out?)

 

Java Heap Memory:

  • The GC Algorithms divide the heap into age based partitions
  • The process of cleanup or collection uses age as a factor
  • The two main partitions are “Young or New Generation” and “Old or Tenured Generation” space
  • The “Young or New Generation” space is further divided into 2 Survivor partitions
  • Each survivor partition is further divided into multiple sections based on command line arguments
  • We can control the GC Algorithm performance based on our use-case and performance requirements through command line arguments that specify the size of the Young, Old and Survivor spaces.

For example, consider applications that do not have long lived objects – Stateless web applications, API Gateway products e.t.c … these need larger Young or New Generation space with a strategy to age objects slowly (long tenuring). These would use either the NewPar GC or CMS algorithm to do Garbage Collection – if there are more than 2 CPU cores available (i.e. extra threads for the collector) then the application can benefit more with the CMS algorithm

The Picture Below give a view of how the Heap section of the Java Application’s memory is divided. There are partitions based on the age of “Objects” and stuff that is very old and unused eventually gets cleaned up

JVM Heap.png

The Heap and Stack Memory can be viewed at runtime using various tools like Oracle’s JRockit Mission Control (now part of the JDK), we can also do very long term analysis of the memory and the garbage collection process using Garbage Collection (GC) logs and free tools to parse GC logs

One of the key resources to analyse in the JVM is the Memory usage and Garbage Collection. The process of cleaning un-used objects from the JVM memory is called “Garbage Collection (GC)”, details of how this works is provided here: Java Garbage Collectors
Tools:

Oracle JRockit Mission Control is now free with the JDK!

  • OSX : “/Library/Java/JavaVirtualMachines/{JDK}/Contents/Home/bin/”
  • Windows: “JDK_HOME/bin/”

GC Viewer tool can be downloaded from here

Memory Analysis Tools:

  • Runtime memory analysis with Oracle JRockit Mission Control (JMRC)
  • Garbage Collection (GC) logs and “GC Viewer Tool” analyser tool

Oracle JRockit Mission Control (JMRC)

  • Available for SPARC, Linux and Windows only
  • Download here: http://www.oracle.com/technetwork/java/javase/downloads/java-archive-downloads-jrockit-2192437.html
  • Usage Examples:
  • High Level CPU  and Heap usage, along with details on memory Fragmentation
  • Detailed view of Heap – Tenured, Young Eden, Young Survivor etc
  • The “Flight Recorder” tool helps start a recording session over a duration and is used for deep analysis at a later time
  • Thread Level Socket Read/Write details
  • GC Pause – Young and Tenured map
  • Detailed Thread analysis

 

Garbage Collection log analysis

  1. Problem: Often it is not possible to run a profiler at runtime because
    1. Running on the same host uses up resources
    2. Running on a remote resource produces JMX connectivity issues due to firewall etc
  2. Solution:
    1. Enable GC Logging using the following JVM command line arguments
      -XX:+PrintGCDetails
      -XX:+PrintGCDateStamps
      -XX:+PrintTenuringDistribution
      -Xloggc:%GC_LOG_HOME%/logs/gc.log
    2. Ensure the GC_LOG_HOME is not on a Remote host (i.e. no overhead in writing GC logs
    3. Analyse logs using a tool
      1. Example: GC Viewer https://github.com/chewiebug/GCViewer
  1. Using the GC Viewer Tool to analyse GC Logs
    1. Download the tool
      1. https://github.com/chewiebug/GCViewer
    2. Import a GC log file
    3. View Summary
      1. Gives a summary of the Garbage Collection
      2. Total number of concurrent collections, minor and major collections
      3. Usage Scenario:
        ” Your application runs fine but occasionally you are seeing issues with slowed performance … it could be possible that there is a Serial Garbage Collector running which stops all processing (threads) and cleans memory.  Use the summary view GC Viewer to look for Long pauses in the Full GC Pause”
    4. View Detailed Heap Info
      1. Gives a view of the growing heap sizes in Young and Old generation over time (horizontal axis)
      2. Moving the slide bar moves time along and the “zig-zag” patters rise and fall with rising memory usage and clearing
      3. Vertical lines show instances when “collections” of the “garbage” take place
        1. Too many close lines indicate frequent collection (and interruption to the application if collections are not parallel) – bad, this means frequent minor collections
        2. Tall and Thick lines indicate longer collection (usually Full GC) – bad, this means longer full GC pauses