My Learning

Friday, February 14, 2020

Quick Commands to explore GitHub

Trends of Technology

This article was drafted around mid-2014.

M/Analytics - Data & Process Off-load, Data Lake enablement
Cloud - Application Migration to public Cloud (AWS), Docker & PaaS
Automation - DevOps, Machine Learning
Architecture - Microservices, API Management
Simplification - Application Right Platforming, Application Portfolio Rationalization (Run to Kill), Legacy/SOR Modernization
Mobility
IoT
Solutions, POCs, Accelerators, Frameworks, White Papers, Technology Updates, Technology Research, Case Examples and/or Success Stories.

Thursday, February 20, 2014

Ecosystem of NoSQL

I would like to share a nice write-up on NoSQL compiled by one of my peers, Madhu Sudhana Valeti. It covers the theorem behind NoSQL (Not only SQL) systems. Basically, we need to understand what NoSQL brings to the table that SQL can’t.

What is NoSQL ?

NoSQL refers to non-relational database management systems that differ from traditional relational database management systems in some significant ways. They are designed for distributed data stores with very large-scale storage needs (for example Google or Facebook, which collect terabytes of data every day for their users). This kind of data store may not require a fixed schema, avoids join operations, and typically scales horizontally. Google, for example, uses BigTable (a NoSQL column-oriented/tabular system).

Why NoSQL ?

Today, data is becoming easier to access and capture through third parties such as Facebook, Google+ and others. Personal user information, social graphs, geolocation data, user-generated content and machine logging data are just a few examples where data volumes have been increasing exponentially. Serving these use cases properly requires processing huge amounts of data, which SQL databases were never designed for. NoSQL databases evolved to handle these huge data volumes properly.

CAP Theorem (Brewer’s Theorem)

There are so many NoSQL systems these days that it’s hard to get a quick overview of the major trade-offs involved when evaluating relational and non-relational systems in non-single-server environments.

There are three primary concerns you must balance when choosing a data management system: consistency, availability, and partition tolerance.

• Consistency means that each client always has the same view of the data.
• Availability means that all clients can always read and write.
• Partition tolerance means that the system works well across physical network partitions.

According to the CAP Theorem, you can only pick two. So how does this all relate to NoSQL systems? One of the primary goals of NoSQL systems is to bolster horizontal scalability. To scale horizontally, you need strong network partition tolerance, which requires giving up either consistency or availability. NoSQL systems typically accomplish this by relaxing relational abilities and/or loosening transactional semantics.

In addition to CAP configurations, another significant way data management systems vary is by the data model they use: relational, key-value, column-oriented, or document-oriented (there are others, but these are the main ones).

• Relational systems are the databases we’ve been using for a while now. RDBMSs and systems that support ACIDity and joins are considered relational.
• Key-value systems basically support get, put, and delete operations based on a primary key.
• Column-oriented systems still use tables but have no joins (joins must be handled within your application). Obviously, they store data by column as opposed to traditional row-oriented databases. This makes aggregations much easier.
• Document-oriented systems store structured “documents” such as JSON or XML but have no joins (joins must be handled within your application). It’s very easy to map data from object-oriented software to these systems.
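To make the key-value and document-oriented descriptions above concrete, here is a minimal in-memory sketch. It is not the API of any real NoSQL product: the class `KeyValueStore`, the helper `doc`, and the sample keys and fields are all hypothetical, and a plain `HashMap` stands in for the distributed store. It shows the three primary-key operations and a join done in application code instead of in the database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class KeyValueSketch {

    /** Hypothetical minimal store: only get, put and delete by primary key. */
    static final class KeyValueStore {
        private final Map<String, Map<String, String>> data = new HashMap<>();

        Map<String, String> get(String key) { return data.get(key); }
        void put(String key, Map<String, String> document) { data.put(key, document); }
        void delete(String key) { data.remove(key); }
    }

    /** Builds a small "document" from alternating field names and values. */
    static Map<String, String> doc(String... fieldsAndValues) {
        Map<String, String> document = new HashMap<>();
        for (int i = 0; i + 1 < fieldsAndValues.length; i += 2) {
            document.put(fieldsAndValues[i], fieldsAndValues[i + 1]);
        }
        return document;
    }

    /** Application-side join: one extra lookup per order replaces a SQL JOIN. */
    static List<String> describeOrders(KeyValueStore orders, KeyValueStore customers, List<String> orderIds) {
        List<String> lines = new ArrayList<>();
        for (String orderId : orderIds) {
            Map<String, String> order = orders.get(orderId);
            if (order == null) {
                continue; // unknown or deleted order
            }
            Map<String, String> customer = customers.get(order.get("customerId"));
            String name = (customer == null) ? "unknown" : customer.get("name");
            lines.add(orderId + ": " + name + " bought " + order.get("item"));
        }
        return lines;
    }

    public static void main(String[] args) {
        KeyValueStore customers = new KeyValueStore();
        KeyValueStore orders = new KeyValueStore();
        customers.put("c1", doc("name", "Asha"));
        orders.put("o1", doc("customerId", "c1", "item", "book"));
        orders.put("o2", doc("customerId", "c9", "item", "pen")); // no such customer
        System.out.println(describeOrders(orders, customers, Arrays.asList("o1", "o2")));
    }
}
```

The join costs one lookup per referenced record, which is the price paid for dropping joins from the store itself.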

Now for the particulars of each CAP configuration and the systems that use each configuration:

Consistent, Available (CA) Systems have trouble with partitions and typically deal with it with replication. Examples of CA systems include:

• Traditional RDBMSs like Postgres, MySQL, etc (relational)
• Vertica (column-oriented)
• Aster Data (relational)
• Greenplum (relational)

Consistent, Partition-Tolerant (CP) Systems have trouble with availability while keeping data consistent across partitioned nodes. Examples of CP systems include:

• BigTable (column-oriented/tabular)
• Hypertable (column-oriented/tabular)
• HBase (column-oriented/tabular)
• MongoDB (document-oriented)
• Terrastore (document-oriented)
• Redis (key-value)
• Scalaris (key-value)
• MemcacheDB (key-value)
• Berkeley DB (key-value)

Available, Partition-Tolerant (AP) Systems achieve “eventual consistency” through replication and verification. Examples of AP systems include:

• Dynamo (key-value)
• Voldemort (key-value)
• Tokyo Cabinet (key-value)
• KAI (key-value)
• Cassandra (column-oriented/tabular)
• CouchDB (document-oriented)
• SimpleDB (document-oriented)
• Riak (document-oriented)

Thursday, May 17, 2012

RowIdTableDecorator for Display Tag

/**
* RowIdDisplayTagDecorator.java

* Creation Date: May 16, 2012
*/
package com.kalai;

import java.io.InputStream;
import java.util.Properties;

import org.apache.commons.lang.StringUtils;
import org.apache.log4j.Logger;
import org.displaytag.decorator.TableDecorator;

/**
* RowIdDisplayTagDecorator is an extension of the displaytag library's TableDecorator. When a DisplayTag table in a JSP uses this decorator, it generates each row id
* by appending the values of the columns configured in displaytaguniquecolumns.properties under the property key "columns", with column names separated by ~ . <br>
* Examples <br>
* [displaytaguniquecolumns.properties - columns=column1~column2]<br>
* [Let's say the column values for column1 and column2 are col1value and col2value respectively; then the row generated for the table looks like
* {@code <tr id="rowcol1valuecol2value">} ] <br>
* [If the configured columns cannot be evaluated, it generates the row id from the row index: {@code <tr id="row0">}, {@code <tr id="row1">}]
*
* @author (Kalai)
*
*/
public class RowIdDisplayTagDecorator extends TableDecorator
{
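// Column names whose values make a row unique; stays null when nothing is configured
private static String[] iUniqueColumns = null;
private static Logger sLogger = Logger.getLogger(RowIdDisplayTagDecorator.class);

/**
   * Loads the properties file listing the columns that uniquely identify a row, so that rows of the html table can be located by id during inspection
   */
static
{
    Properties displayUniqueColumnProperties = new Properties();
    InputStream propertiesFileStream = null;
    try
    {
      propertiesFileStream = RowIdDisplayTagDecorator.class.getClassLoader().getResourceAsStream("displaytaguniquecolumns.properties");
      // getResourceAsStream returns null when the file is not on the classpath
      if (propertiesFileStream != null)
      {
        displayUniqueColumnProperties.load(propertiesFileStream);
        String columnvalues = displayUniqueColumnProperties.getProperty("columns");
        if (StringUtils.isNotBlank(columnvalues))
        {
          iUniqueColumns = columnvalues.split("~");
        }
      }
    }
    catch (Exception exception)
    {
      sLogger.error(exception.toString());
    }
    finally
    {
      // Close the stream quietly; a failure here must not break class loading
      if (propertiesFileStream != null)
      {
        try { propertiesFileStream.close(); } catch (Exception ignored) { }
      }
    }

}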

/**
   * Adds the row id to the current row using the columns configured to identify a row uniquely, to enable html table inspection based on row id.
   * If no columns are configured, or a column cannot be evaluated, the id is the row index, from 0 to the number of table rows - 1
   */
@Override
public String addRowId()
{
    // No columns configured: fall back to the row index
    if (iUniqueColumns == null)
    {
      return "row" + getListIndex();
    }
    StringBuilder builder = new StringBuilder("row");
    try
    {
      for (String column : iUniqueColumns)
      {
        builder.append(evaluate(column).toString());
      }

    }
    catch (Exception exception)
    {
      // Discard any partially built id so the fallback is exactly "row<index>"
      sLogger.error(exception.toString());
      return "row" + getListIndex();
    }
    return builder.toString();
}

}
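
The id rule described in the Javadoc can be tried without a servlet container. The sketch below is a standalone approximation and not part of the displaytag library: `RowIdSketch`, `parseColumns` and `buildRowId` are hypothetical names, and a plain `Map` stands in for the row object that the decorator evaluates.

```java
import java.util.HashMap;
import java.util.Map;

final class RowIdSketch {

    /** Splits the value of the "columns" property, e.g. "column1~column2". */
    static String[] parseColumns(String columnsProperty) {
        if (columnsProperty == null || columnsProperty.trim().isEmpty()) {
            return new String[0];
        }
        return columnsProperty.split("~");
    }

    /** Builds "row" + the configured column values, or "row" + index as a fallback. */
    static String buildRowId(String[] columns, Map<String, Object> row, int listIndex) {
        if (columns.length == 0) {
            return "row" + listIndex; // nothing configured
        }
        StringBuilder builder = new StringBuilder("row");
        for (String column : columns) {
            Object value = row.get(column);
            if (value == null) {
                return "row" + listIndex; // column could not be evaluated
            }
            builder.append(value);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("column1", "col1value");
        row.put("column2", "col2value");
        System.out.println(buildRowId(parseColumns("column1~column2"), row, 0));
        System.out.println(buildRowId(parseColumns(""), row, 1));
    }
}
```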

Wednesday, February 22, 2012

Checking Java Heap and Thread Dumps in Unix / Solaris

Heap Dump

$JAVA_HOME/bin/jmap -heap:format=b ******

where ****** is the process id
Attaching to process ID *****, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 1.5.0_12-b04
Free chunk in heap, size=131072
Free chunk in heap, size=792741552
Free chunk in heap, size=148364752
Unknown oop at 0x00002aaafab6b6a8
Oop's klass is 0x00002aab06801de8
Finding object size using Printezis bits and skipping over...
heap written to heap.bin
It generates the file “heap.bin”.

Thread Dump
kill -QUIT ******

The thread dump is appended to the process's existing stdout log, for example the stdout log of a Tomcat server.
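
When sending a signal to the process is not convenient, similar information can be collected from inside the JVM. The following is a minimal sketch using the standard `Thread.getAllStackTraces()` call (available since Java 5). The class and method names are hypothetical, and the output format is only loosely modelled on what `kill -QUIT` prints.

```java
import java.util.Map;

final class ThreadDumpSketch {

    /** Returns the name, state and stack frames of every live thread as text. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(" daemon=").append(thread.isDaemon())
               .append(" state=").append(thread.getState())
               .append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Unlike `kill -QUIT`, this only works while the application is still responsive enough to run the code, so the signal remains the better choice for a hung process.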

Heap Dump : Another way

/usr/java/jdk1.5.0_12/bin/jmap -dump:format=b,file=heap.bin <pid>

According to the usage text below, the -heap option prints a heap summary, so I am not sure we want it in combination with format=b. When I ran the -heap:format=b command locally against my Eclipse instance, it took about 5 minutes to generate the heap dump. With the -dump command above it took only about 5 seconds. Both heap dumps seem valid, but according to the usage guide -heap takes no suboptions, so I am not sure what that command is actually doing, and the long time it takes may be why you saw it fail.

Anyone else have some feedback on the command?

Usage:
    jmap [option] <pid>
        (to connect to running process)
    jmap [option] <executable <core>
        (to connect to a core file)
    jmap [option] [server_id@]<remote server IP or hostname>
        (to connect to remote debug server)

where <option> is one of:
    <none>               to print same info as Solaris pmap
    -heap                to print java heap summary
    -histo[:live]        to print histogram of java object heap; if the "live"
                         suboption is specified, only count live objects
    -permstat            to print permanent generation statistics
    -finalizerinfo       to print information on objects awaiting finalization
    -dump:<dump-options> to dump java heap in hprof binary format
                         dump-options:
                           live         dump only live objects; if not specified,
                                        all objects in the heap are dumped.
                           format=b     binary format
                           file=<file> dump heap to <file>
                         Example: jmap -dump:live,format=b,file=heap.bin <pid>
    -F                   force. Use with -dump:<dump-options> <pid> or -histo
                         to force a heap dump or histogram when <pid> does not
                         respond. The "live" suboption is not supported
                         in this mode.
    -h | -help           to print this help message
    -J<flag>             to pass <flag> directly to the runtime system
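
A binary heap dump can also be requested from inside a running JVM instead of through jmap. The sketch below assumes a HotSpot JVM on Java 7 or later (not the 1.5 JVM shown in the output above) and uses `com.sun.management.HotSpotDiagnosticMXBean`. The class name, the method name and the temp-file location are hypothetical. Recent JDKs require the target file name to end in `.hprof`, and the call fails if the file already exists.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;
import java.nio.file.Paths;

final class HeapDumpSketch {

    /** Writes an hprof heap dump of the current JVM; liveOnly mirrors jmap's "live" suboption. */
    static void dumpHeap(Path target, boolean liveOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(target.toString(), liveOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Unique name in the temp directory, because dumpHeap refuses to overwrite a file
        Path target = Paths.get(System.getProperty("java.io.tmpdir"),
                "heap-" + System.nanoTime() + ".hprof");
        dumpHeap(target, true);
        System.out.println("heap written to " + target);
    }
}
```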

Wednesday, February 1, 2012

J2EE Connection Pooling

Why is it so important?

Well-managed, connection-pooled systems are more scalable and reliable.

At what level of an enterprise application can this be applied?

Generally, enterprise applications are n-tiered (UI layer, middleware layer, persistence layer). The middleware layer can consist of EJBs, a JMS/queuing layer, etc. Managing connection pooling carefully at all of these layers makes the application scale better and run more reliably.

How can it be achieved?
Vendor implementations (IBM, Oracle...) provide connection pooling factory mechanisms for the products that are integrated into an enterprise solution.

Here is a very good article.
Dive into connection pooling for a J2EE Enterprise application
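
To illustrate the idea behind pooling, here is a minimal generic sketch, not any vendor's implementation: a fixed number of resources is created up front, borrowed, and handed back instead of being opened and closed per request. `SimplePool` and its methods are hypothetical names, and a `StringBuilder` stands in for a real connection. In a J2EE application you would normally rely on the container-managed pool rather than writing one.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class SimplePool<T> {
    private final BlockingQueue<T> idle;

    /** Creates all pooled resources eagerly; size must be at least 1. */
    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Waits up to timeoutMillis for a free resource; returns null if none becomes free. */
    T borrow(long timeoutMillis) {
        try {
            return idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    /** Hands a resource back; silently ignored if the pool is already full. */
    void release(T resource) {
        idle.offer(resource);
    }

    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(2, StringBuilder::new);
        StringBuilder first = pool.borrow(100);
        StringBuilder second = pool.borrow(100);
        System.out.println("third borrow while exhausted: " + pool.borrow(100));
        pool.release(first);
        pool.release(second);
        System.out.println("idle after release: " + pool.idleCount());
    }
}
```

The bounded queue is what gives the scalability benefit: callers wait briefly for a free resource instead of opening an unbounded number of connections against the backend.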

Sunday, January 29, 2012

Spring Framework

Looking for a good source of information on Spring Framework. Here's the list