Sunday, March 24, 2019

Enterprise Integration - Hub & Spoke design


Large enterprises run many applications, tools and utilities (ERP, CRM, Sales, HR, Finance etc.) across their business domains, which poses a significant challenge for IT: supporting effective collaboration across domains, data analysis and critical data synchronization. With heterogeneous applications being added over time, integrating them point to point would increase complexity exponentially. To reduce integration complexity, the Hub and Spoke design paradigm is regarded as one of the best solutions. The hub needs to be highly concurrent, distributed, scalable, microservices oriented, cloud hosted and container orchestrated, offering all the elements of data integration (real-time data streaming, transformation, synchronization, quality and management) to ensure that information is timely, accurate and consistent across heterogeneous applications.

Event driven design:
Spoke applications trigger messages to the hub on their create or update events. Upon receiving such a message, and based on the routing rules, the hub decides which spokes the message should be forwarded to. Routing rules determine the message flow across spoke systems and rely on trusted data sources for fetching data. Validation, synchronization, transformation, enrichment and filtering of every message are carried out by integrating the hub with these trusted data sources.
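As a minimal sketch of the routing idea (the message types and spoke names here are purely illustrative; real rules would be loaded from a trusted data source, not hard-coded):

```java
import java.util.List;
import java.util.Map;

public class RoutingDemo {
    // Illustrative routing table: message type -> destination spokes.
    static final Map<String, List<String>> ROUTES = Map.of(
        "opportunity.created", List.of("CRM", "Finance"),
        "opportunity.updated", List.of("CRM", "Sales", "Finance"));

    // Returns the spokes a message of this type should be forwarded to;
    // unknown types are routed nowhere.
    static List<String> route(String messageType) {
        return ROUTES.getOrDefault(messageType, List.of());
    }

    public static void main(String[] args) {
        System.out.println(route("opportunity.created")); // [CRM, Finance]
    }
}
```

In practice the table would be refreshed from the trusted data sources so that routing changes do not require a redeploy.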

Tools & technologies for building the hub on IBM public cloud:
·       The real-time data pipeline is built on a Kafka cluster, which lets the hub handle large volumes of data and provides message reliability.

·       The Akka toolkit simplifies our concurrent code and provides infrastructure that scales without modifying the application. Akka seamlessly handles message distribution and communication at scale, and by simplifying concurrency logic it improves coding efficiency.
·       Play provides a lightweight, stateless web server framework; we chose it to expose APIs with minimal resource consumption.
·       Redis is used for caching and error handling.
·       Grafana and DataDog monitor the infrastructure and services, keeping administrators informed about system health.
·       Zipkin traces message flow across systems and provides real-time updates on message transitions.




Core components that the hub offers:
·       API service: Exposes REST endpoints through which spokes integrate with the hub. An interface contract standardizes the communication from spoke to hub. Provides HTTPS endpoints and OAuth 2.0 to secure the communication.
·       Processor service: Validates, standardizes, enriches, filters and transforms the incoming data, and applies message routing rules to determine the destination spoke for each incoming message.
·       Persistence service: Stores every transaction that flows through the hub. DashDB is used to store the consolidated data.
·       Adapter service: Consumes the spoke interfaces. This generalized service is derived by each spoke's adapter to comply with that spoke's own interface.
·       Dashboard service: Provides consolidated reports on the opportunity messages and helps support and administration staff with message transaction details.

Real-time integration exception handling and assured delivery:
It is critical to handle extraneous conditions arising from unexpected events, invalid data and process errors at runtime. The ecosystem must adopt a multi-pronged approach to handle the various types of errors, with assured delivery, error routing, error mapping, logging, notification and dashboard monitors for error details. A robust retry framework automatically resolves recoverable errors, and the hub forks the workflow between recoverable and non-recoverable exceptions to manage each appropriately.
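A retry framework of this kind can be sketched as below; the attempt counts and backoff values are illustrative assumptions, not the hub's actual settings:

```java
import java.util.concurrent.Callable;

public class Retry {
    // Minimal retry-with-exponential-backoff sketch for recoverable errors.
    static <T> T withRetry(Callable<T> task, int maxAttempts, long initialBackoffMs)
            throws Exception {
        long backoff = initialBackoffMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts) throw e;  // give up: route to error handling
                Thread.sleep(backoff);                // wait before retrying
                backoff *= 2;                         // exponential backoff
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated spoke call that fails twice before succeeding.
        String result = withRetry(() -> {
            if (++calls[0] < 3) throw new RuntimeException("transient failure");
            return "delivered";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Exceptions that survive all attempts would then flow into the non-recoverable branch of the workflow (error routing, notification, dashboard).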
A circuit breaker mechanism is adopted to handle remote calls efficiently. This yields considerable savings in resource usage and prevents failures from cascading across systems.
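A minimal sketch of the circuit breaker state machine (the threshold and timeout values are illustrative, and the half-open state is simplified to a direct retry):

```java
public class CircuitBreakerDemo {
    enum State { CLOSED, OPEN }

    private final int failureThreshold;
    private final long resetTimeoutMs;
    private int failures = 0;
    private long openedAt = 0;
    private State state = State.CLOSED;

    CircuitBreakerDemo(int failureThreshold, long resetTimeoutMs) {
        this.failureThreshold = failureThreshold;
        this.resetTimeoutMs = resetTimeoutMs;
    }

    // Returns false while the circuit is open, so callers fail fast
    // instead of tying up resources on an unhealthy remote spoke.
    boolean allowRequest() {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - openedAt < resetTimeoutMs) {
                return false;
            }
            state = State.CLOSED;  // timeout elapsed: allow a trial request
            failures = 0;
        }
        return true;
    }

    void recordFailure() {
        if (++failures >= failureThreshold) {
            state = State.OPEN;
            openedAt = System.currentTimeMillis();
        }
    }

    void recordSuccess() {
        failures = 0;
    }

    public static void main(String[] args) {
        CircuitBreakerDemo cb = new CircuitBreakerDemo(3, 1000);
        for (int i = 0; i < 3; i++) {
            cb.allowRequest();
            cb.recordFailure();  // three consecutive failures trip the breaker
        }
        System.out.println("Request allowed? " + cb.allowRequest()); // false
    }
}
```

Failing fast like this is where the resource savings come from: threads are not left blocked on a spoke that is already known to be down.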





Wednesday, October 14, 2015

Unzipping archives with special-character file names in JDK 1.6 and earlier versions



When a zip file with file names containing French accented characters had to be unzipped using the standard java.util.zip API of JDK 1.6, I started getting the error below, to my surprise. The same program unzipped all the files successfully on Linux; the issue appeared only on Windows.

[unzip] java.io.IOException: Stream closed
[unzip] at java.io.BufferedInputStream.getInIfOpen(BufferedInputStream.java:134)
[unzip] at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
[unzip] at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
[unzip] at java.security.DigestInputStream.read(DigestInputStream.java:144)

Investigating further opened up a whole arena of encoding mysteries involving archives, the zip specification, a JDK bug and archiving tools.

Several archive tools, including the latest versions of WinZip (19.x), 7z, WinRAR and TrueZIP, encode file names in UTF-8. However, JDK 1.6 fails to convert the Unicode names to the platform encoding while unzipping them. This put us in serious trouble, as we could not upgrade to the next version of Java quickly and had no tool that produced archives compatible with our Java version.

Finally, the JAR utility came to the rescue. Jar files are portable across platforms and locale environments, and they appear to encode the entry names within the zip itself. So the jar command below was used to zip the required files, and the resulting archive opens in JDK 1.6 with no hassle:

jar -cvf filename.zip folder1 folder2

This lets Java understand the file-name encoding at the ZipEntry level. The approach somewhat defeats the purpose of compression, but it worked for my use case, where compression ratio and size were not constraints. This solution can be used in Windows environments where JDK 1.6 fails to convert the encoding to the native format.
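For completeness: JDK 7 later added charset-aware constructors to java.util.zip, which solves this properly. A small sketch for anyone not stuck on 1.6 (the accented entry name is just an example):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class CharsetZipDemo {
    // Writes a zip containing one accented entry name, using an explicit
    // charset for the entry names (JDK 7+ constructor).
    static File writeZip(Charset cs) throws IOException {
        File zip = File.createTempFile("demo", ".zip");
        try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(zip), cs)) {
            out.putNextEntry(new ZipEntry("r\u00e9sum\u00e9.txt"));
            out.write("hello".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return zip;
    }

    // Reads the archive back with the same explicit charset, so the
    // accented name survives regardless of the platform encoding.
    static String firstEntryName(File zip, Charset cs) throws IOException {
        try (ZipFile zf = new ZipFile(zip, cs)) {
            return zf.entries().nextElement().getName();
        }
    }

    public static void main(String[] args) throws IOException {
        File zip = writeZip(StandardCharsets.UTF_8);
        System.out.println(firstEntryName(zip, StandardCharsets.UTF_8));
        zip.delete();
    }
}
```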


Cool quick references:

https://marcosc.com/2008/12/zip-files-and-encoding-i-hate-you/

https://bugs.openjdk.java.net/browse/JDK-4244499

http://www.siao2.com/2008/05/13/8498184.aspx

Monday, October 5, 2015

Solution to JBoss Wildfly 8.0.0 Hanging



Ever faced the Wildfly 8.0.0 server hanging while load testing it? I faced this frustration and thought of sharing the solution I found after a long struggle.

When ten users with a ramp-up time of ten seconds were simulated using the JMeter tool, Wildfly 8.0.0 soon became unresponsive. However, the JBoss JMX console application was still available, and the log files (error.log, server.log, jboss.log and the application logs) reported no errors. The JBoss server had to be restarted every time this issue occurred. The issue was only reproducible under a load-testing tool, not with multiple users accessing the system manually.

As usual, the first suspect was the web server side of JBoss. That is when I realized how entirely the Wildfly architecture differs from older versions of JBoss AS, and I was soon digging into its "Undertow" component. Undertow claims to be a flexible, fast web server written in Java, based on the Java New Input/Output (NIO) API. It is crucial to understand the blocking and non-blocking APIs that form the base of all this, and a fair understanding of how Java NIO works is also required, as Undertow uses XNIO, a simplified low-level I/O layer.

Now let us look at how to resolve the hanging issue from the JBoss console. The console remains available even when the JBoss server hangs, and it is usually accessible at http://<host>:<port>/console.

The configuration required in this scenario would be from Profile tab of the console.

Core Configuration:

Undertow handlers create Undertow XNIO threads, which drive the web requests. An insufficient number of IO threads results in a bottleneck at the web server end, and requests fail to reach the deployed applications. Beyond that, the "Task-core-threads" and "Task-max-threads" settings determine whether a request is served or discarded.

Undertow threads are configured through the IO subsystem. With WildFly 8, a Worker element needs to be created in the IO subsystem, which is a kind of Thread Executor and helps to tune the Web server Thread pool.

Expand the Core option in the left menu tree, select IO, and select the Default handler from the Worker tab. In my configuration the IO thread count was a mere 3, and I had to increase it to 50. This is an important tuning parameter that needs to be increased for web applications experiencing high traffic. I also increased the "Task keepalive" time to 60 seconds, so the worker waits for the next request from the same client on the same connection. This enables the browser to eliminate a full round trip for every request after the first, usually reducing the full page load time. "Task-max-threads" was increased to 60 and "Task-core-threads" to 10, which helps in handling concurrent requests.




The Undertow IO buffer configuration also needs to be verified to optimize the usage of the buffer pool.
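The console changes end up in the io subsystem of standalone.xml; a fragment along these lines reflects the values discussed above (attribute names and the buffer-pool values are from my recollection of WildFly 8's io schema and may differ across versions, so verify against your own standalone.xml):

```xml
<subsystem xmlns="urn:jboss:domain:io:1.1">
    <!-- io-threads raised from 3 to 50; task pool sized for concurrency -->
    <worker name="default"
            io-threads="50"
            task-keepalive="60000"
            task-max-threads="60"/>
    <!-- buffer pool backing the worker; sizes here are illustrative -->
    <buffer-pool name="default" buffer-size="16384" buffers-per-slice="128"/>
</subsystem>
```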





Web Configuration:

Once the core IO is configured, the same worker needs to be used for HTTP requests. From the left tree menu, expand the Web subsystem and select the HTTP option. Click the "Default" element in the table and check that the Worker element is associated with your IO worker. The worker also needs to be enabled in order to be usable.




After the above changes, JBoss server needs to be restarted. Note that these changes would also reflect in standalone.xml.

Now, with all these settings, the server is ready to take a high HTTP load; I could load the system for a long time without it hanging.

Sunday, April 19, 2015

Monitoring JBoss running as service from JVisualVM

Monitoring JBoss 4.x running as a Windows service was not straightforward when I had to do it. Some JVM settings were required in order to monitor the server successfully.

JVisualVM, a free tool, provides plenty of information on heap, threads, CPU time and GC details. I found it an easy tool for performance monitoring in development and test setups. It ships with the HotSpot JVM package (jvisualvm.exe inside the bin folder) and can be used to monitor application performance. No settings are required at the JVM end or in the tool itself.

The settings required are at JBoss end and here is the step-by-step instruction on the same.

When JBoss runs as a service, the changes need to be made in the Windows Registry. Otherwise, the JVM parameters below can be set in run.bat itself using the JAVA_OPTS environment variable.

1. Open the registry by entering "regedit" at the command prompt. In my case the parameters were set at the location below; find out the JBoss settings location in your case first.

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MyJbossd\Parameters

2. Open the "Parameters" section and add the parameters below to the existing settings:

JVM Option Number 11=-Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl
JVM Option Number 12=-Djboss.platform.mbeanserver
JVM Option Number 13=-Dcom.sun.management.jmxremote
JVM Option Number 14=-Dcom.sun.management.jmxremote.port=8077
JVM Option Number 15=-Dcom.sun.management.jmxremote.ssl=false
JVM Option Number 16=-Dcom.sun.management.jmxremote.authenticate=false 

//For remote debugging 
JVM Option Number 17=-Dcom.sun.management.jmxremote.local.only=false 
JVM Option Number 18=-Dcom.sun.management.jmxremote=true 
JVM Option Number 19=-Djava.rmi.server.hostname=0.0.0.0


3. Restart the JBoss from services.

4. Open VisualVM console. 

5. Right click on "localhost" and select "Add JMX"

6. Enter host/IP in case of remote monitoring and the port in this case is 8077. It should be something like below:

<IP>:8077

7. Successful connection would start the JBoss server monitoring.
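To sanity-check what VisualVM will show, the same MBeans can be read locally through the platform MBean server; this is essentially the data the remote JMX connection exposes:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

public class JmxPeek {
    public static void main(String[] args) {
        // Platform MXBeans: the same heap and thread data VisualVM graphs.
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("Heap used (bytes): " + heap.getUsed());
        System.out.println("Live threads: " + threads.getThreadCount());
    }
}
```

A programmatic remote client would instead connect with JMXConnectorFactory to service:jmx:rmi:///jndi/rmi://<host>:8077/jmxrmi, matching the jmxremote.port set in step 2.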


Hope these simple steps help. Happy monitoring!









Wednesday, November 6, 2013

Typical issue of thread context class loader in OSGi


Recently I got an opportunity to look at a problem a teammate was facing with the thread context class loader in an embedded Felix container. Class-loader logic that worked in one bundle was failing in another. This happened during startup of a web application that embeds the Felix container; if the failing bundle was restarted alone after startup, it worked fine. Though restarting the bundle by itself was a workaround, automating it would have required more work, so I spent a couple of hours fixing the issue properly.

To explain in more detail of this problem, here is the illustration:

There are two bundles in the OSGi container:

Bundle A
Bundle B depends on A

Bundle A has the one line of code below, and it was not restoring the original class loader after changing it:

Thread.currentThread().setContextClassLoader(getClass().getClassLoader());


The same thread that invokes A was calling B during OSGi container startup, and as B needs the same class-loader logic as A, it was failing to get the right class loader object.

This was preventing B from achieving its functionality. Debugging showed that the context class loader held just a reference to bundle A's loader, whose toString() returned nothing but the bundle ID. So bundle A had to be fixed to resolve the issue in B.


Solution
The simple solution in this context is the code below, which makes sure the original context class loader is set back in bundle A:

ClassLoader originalCL = Thread.currentThread().getContextClassLoader();
try {
    Thread.currentThread().setContextClassLoader(getClass().getClassLoader());
    // some logic
} finally {
    logger.info("Back to original class loader");
    Thread.currentThread().setContextClassLoader(originalCL);
}

Output before the fix:
CL.toString() : 91.0 -> which is nothing but the ID of bundle A.

Output after the fix:
CL.toString()  :
com.ibm.ws.classloader.CompoundClassLoader@790079[war:ear_name/war_name] Local ClassPath: ....


The lesson from this: never forget to set the original context class loader back when you are done with your logic. If the same thread later calls other code that relies on the context class loader, and that code does not set the class loader itself, it will fail in confusing ways, and debugging such errors is of course time consuming.

Sunday, April 28, 2013

Capacity Planning for J2EE applications

As per Wikipedia, capacity planning is the process of determining the production/serving capacity needed by an organization to meet changing requirements for its products and solutions. The process strategically assesses the requirements for a new solution, additional network capacity and the underlying IT architecture. The information provided by capacity planning helps to:
  • Characterize the solution workloads more accurately
  • Analyze the performance of various modules
  • Model contention for application servers, and ensure scalability
  • Model and plan for communications infrastructure
  • Forecast and cope with peak demand
  • Project the impact of agent technologies and non-PC devices.
Capacity planning is more cost-effective and efficient when done prior to deployment; performance problems resulting from a lack of capacity are more complex and costly to resolve afterwards. However, in post-deployment scenarios, it is still possible to identify the impact on:
  • Code changes required
  • Existing Java Heap footprint
  • Native Heap footprint
  • CPU utilization and other negative side effects
Benefits of this exercise:
  • Avoid losing customers due to site crashes
  • Performance modeling and capacity planning for infrastructure
  • Build and analyze customer behavior models
  • Plan to avoid frequent upgrades and migrations.
  • Identify potential bottlenecks in the architecture
There are various methodologies and proven theories available to conduct this exercise. Some of them are:
  • Discrete-event simulation
  • Mean value analysis of product-form networks
  • Analytical identification of bottleneck resources in multiclass environments
  • Workload characterization with fuzzy clustering
These methodologies are complex in detail and out of scope to discuss here. Instead, let us look at an easier way to understand the whole process.

The first and foremost step here is to understand the IT requirements, as explained below.
 
Data gathering:
To properly determine the resource requirements for an application, we need architectural information along with a functional description of the anticipated usage. The completeness and accuracy of the sizing depend on the quality of the information received; when portions of the information are unknown or missing, the risk of incorrect sizing increases. The following types of information are required to drive the capacity planning analysis:
  • Percentage of new function supplied by solution
  • Percentage of new data elements created
  • Business-use scenarios
  • Data transferred to and fro
  • Peak load users size
  • Solution architecture definition:
    • Business function, scenarios and supporting models
    • Data architecture, solution architecture — business and deployment architecture diagrams
    • Technical architecture and schematic (client, server, network, Web)
Once the information is collected in standard templates, it is time to apply the right calculations against each criterion. Where standard benchmark data is available, the analysis is performed and the results are documented.
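As a toy example of such a calculation, Little's Law (concurrency = arrival rate x response time) gives a first-cut estimate of how many requests a system must handle simultaneously; the workload numbers below are invented purely for illustration:

```java
public class SizingDemo {
    // Little's Law: L = lambda * W.
    // lambda = arrival rate (requests/sec), W = average response time (sec).
    static double concurrentRequests(double arrivalRatePerSec, double avgResponseTimeSec) {
        return arrivalRatePerSec * avgResponseTimeSec;
    }

    public static void main(String[] args) {
        double peakRate = 200.0;      // illustrative peak requests per second
        double responseTime = 0.25;   // illustrative average response time in seconds
        System.out.println("Concurrent requests to plan for: "
                + concurrentRequests(peakRate, responseTime)); // 50.0
    }
}
```

An estimate like this feeds directly into sizing thread pools, connection pools and the number of application server instances.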

The components that need to be considered or validated are given below. This may not be the complete list; depending on the IT system being designed, it could differ.
  1. Operating systems
  2. Application servers
  3. Network protocols
  4. Data access services: DB systems
  5. Programming languages
  6. UI/client frameworks: AJAX, JavaScript etc.
  7. Distribution services: NFS, DFS, Kerberos
  8. Systems management: SNMP, AntiVirus, ADSM, TME
  9. Application interface with legacy data/systems
  10. Peak load: Data throughput.
After the initial assessment of architecture sizing has been completed, it is time for coding and application development. Once development is over and the solution is in a deployable stage, the test-based sizing process can start. This process provides the validation of the sizing analysis, and the results are documented at this stage.

Hope this helps to get started; I plan to post some case studies and sample reports in my next article.