Monday, December 7, 2015

NoSQL with RDBMS fallback:

NoSQL adoption is becoming prominent across critical applications as teams look to reap the benefits of performance, fault tolerance and high availability for large-volume database needs. While migrating to NoSQL, one risk that architects worry about is what happens if the application runs into unforeseen issues that take a long time to fix, since NoSQL adoption is not yet battle tested across every domain and sector, and how to design a fallback strategy for that case.

A few concerns that come up while migrating to NoSQL:

  • What will happen if we run into unexpected errors in production and they take a long time to fix?
  • What if the product vendors themselves haven't faced such scenarios?
  • What if I have reporting or other dependent systems that are well integrated with the RDBMS and are difficult to migrate away from it in the current phase?

Architects would therefore like to design a fallback option where the application can switch to the RDBMS on unrecoverable NoSQL issues. This raises a few design questions:

  • How do I sync data to both NoSQL and the RDBMS at high data volume without losing the order of updates?
  • How do I sync without adding much overhead to the application? Synchronous updates to both NoSQL and the RDBMS would be too much overhead.
  • How do I reliably move the data between the two systems without any loss?
  • What if the RDBMS goes down? How do I design the sync to stay reliable even through failures?

The design depicted below addresses these concerns.

The main components involved in the design are Apache Kafka, which receives the updates, and Apache Storm, which processes the data and applies it to the RDBMS. Both systems are designed for big data needs in a reliable, distributed form.

Apache Kafka is a high-performance message queuing system. The application posts the change messages (insert / update / delete) to a Kafka topic. To improve performance through parallel processing, the topic can be partitioned by table / region / logical data design as per the NoSQL model; a producer sketch follows.
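A minimal sketch of the producing side, using the Kafka Java producer client; the topic name, key scheme and JSON payload here are assumptions for illustration. Keying each message by table name sends all changes for a table to the same partition, which preserves the order of updates per table.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class ChangeEventProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("acks", "all"); // wait for all replicas so no change event is lost
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            KafkaProducer<String, String> producer = new KafkaProducer<>(props);
            // Same key -> same partition -> per-table ordering is preserved.
            String table = "orders"; // hypothetical table used as the partition key
            String changeEvent = "{\"op\":\"UPDATE\",\"table\":\"orders\",\"id\":42,\"status\":\"SHIPPED\"}";
            producer.send(new ProducerRecord<>("rdbms-sync", table, changeEvent));
            producer.close();
        }
    }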

Apache Storm is a real-time processing engine that consumes messages through a spout component, does processing through bolts and updates the data in the RDBMS. Storm offers guaranteed message processing and transactional topologies, which makes it suitable for handling partial failures during commits; a minimal topology sketch follows.
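A minimal sketch of such a topology, assuming a pre-1.0 Storm (backtype.storm packages) with the storm-kafka connector; the topic, Zookeeper address and the JDBC logic inside the bolt are placeholders.

    import backtype.storm.Config;
    import backtype.storm.LocalCluster;
    import backtype.storm.spout.SchemeAsMultiScheme;
    import backtype.storm.topology.BasicOutputCollector;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.TopologyBuilder;
    import backtype.storm.topology.base.BaseBasicBolt;
    import backtype.storm.tuple.Tuple;
    import storm.kafka.KafkaSpout;
    import storm.kafka.SpoutConfig;
    import storm.kafka.StringScheme;
    import storm.kafka.ZkHosts;

    public class RdbmsSyncTopology {

        // Applies each change event to the RDBMS; throwing a FailedException
        // here would make Storm replay the tuple, giving at-least-once delivery.
        public static class RdbmsWriterBolt extends BaseBasicBolt {
            @Override
            public void execute(Tuple tuple, BasicOutputCollector collector) {
                String changeEvent = tuple.getString(0);
                // parse the JSON change event and run the matching
                // INSERT / UPDATE / DELETE against the RDBMS here
            }
            @Override
            public void declareOutputFields(OutputFieldsDeclarer declarer) { }
        }

        public static void main(String[] args) {
            SpoutConfig spoutConfig = new SpoutConfig(
                    new ZkHosts("localhost:2181"), "rdbms-sync", "/rdbms-sync", "sync-spout");
            spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 4);
            // shuffleGrouping is the simplest wiring; a production topology would
            // group by the table key to preserve per-table ordering in the bolt.
            builder.setBolt("rdbms-writer", new RdbmsWriterBolt(), 4)
                   .shuffleGrouping("kafka-spout");

            new LocalCluster().submitTopology("rdbms-sync", new Config(), builder.createTopology());
        }
    }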

    Benefits:

  • Information is updated in the RDBMS asynchronously and reliably.
  • Apache Kafka provides a highly available, reliable, partitioned queuing system that fits huge data volumes well.
  • Storm does real-time processing of the Kafka messages on the partitioned queue and provides a reliable way to update the RDBMS.
  • On RDBMS failures Kafka persists the messages, and Storm continues syncing them once the RDBMS comes back.
Wednesday, December 2, 2015

    Next Generation Enterprise Application Architecture

New-generation applications are architected not only with the goal of being functionally correct and stable, but also with a focus on several aspects that are becoming critical:

    Scalability – Elastic scalability for all the layers of the application including data tier

    Fault Tolerance - Ability to handle failure smartly and avoid cascading failures from and to dependent systems

    High Availability – Ability to have application highly available on all layers including database even on data center failures

    Efficient utilization of Infrastructure - Ability to scale up and down on demand

Data access – Faster access to underlying data under high load and data volumes

Data variety – Ability to handle different data formats efficiently

A few drivers behind this evolution are the need for and benefits of cloud adoption (private or public) and the need to handle huge data volumes with faster responses at the data tier. The table below maps each solution area to its benefits and representative products.

Solution | Benefits | Products
Physical -> IaaS -> PaaS | Elastic scalability; high availability; efficient infrastructure utilization; zero-downtime deployment | VMware, OpenStack (private cloud IaaS); AWS, Azure (public cloud IaaS/PaaS); Cloud Foundry (PaaS on private and public cloud)
Circuit Breaker | Fault tolerance; better failure handling; avoids avalanche failures | Netflix Hystrix; Polly
Service Registry | Registry for dynamic instance scaling and self-discovery | Netflix Eureka; Apache Zookeeper
Intelligent Load Balancing | Load balancing that exploits elastic scaling and self-discovery | Netflix Ribbon; F5; Nginx
Search | Quick search over huge data sets; full-text search; pattern matching | Elasticsearch; Solr
Data Grid | Faster reads and writes; reduced read/write overhead on the database; high availability of data | Coherence; GemFire; Membase
Queue | Reliable data transfer across different data layers | Kafka; RabbitMQ; JMS
NoSQL | Big data database needs: heavy reads/writes at high data volumes, faster responses, high availability, fault tolerance, distributed and scalable databases | Couchbase; MongoDB; HBase; Cassandra; graph DBs (Titan, OrientDB)
Hadoop | Distributed file storage and processing ecosystem; high-speed batch (MapReduce) and real-time (Storm, Spark) processing | Hadoop distributions such as Hortonworks, Cloudera, MapR
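To make one row of the table concrete, a circuit breaker with Netflix Hystrix wraps a remote call so that failures trip the circuit and a fallback is served instead of cascading the failure upstream. A minimal sketch; the wrapped service call and the fallback value are placeholders.

    import com.netflix.hystrix.HystrixCommand;
    import com.netflix.hystrix.HystrixCommandGroupKey;

    public class ProductLookupCommand extends HystrixCommand<String> {

        private final String productId;

        public ProductLookupCommand(String productId) {
            super(HystrixCommandGroupKey.Factory.asKey("ProductService"));
            this.productId = productId;
        }

        @Override
        protected String run() throws Exception {
            // placeholder for the real remote call (HTTP, RPC, ...)
            throw new Exception("service unavailable"); // simulate a failing dependency
        }

        @Override
        protected String getFallback() {
            // served when the call fails, times out, or the circuit is open
            return "{\"id\":\"" + productId + "\",\"source\":\"cache\"}";
        }

        public static void main(String[] args) {
            System.out.println(new ProductLookupCommand("42").execute()); // prints the fallback
        }
    }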

    Sunday, August 30, 2015

    Kafka messaging system

Apache Kafka:

Kafka is an open source message queuing solution under the Apache project. Kafka is newer than existing queue solutions like RabbitMQ, ActiveMQ and AWS SQS in terms of product maturity, but it is quickly gaining momentum due to its features. In this post we will analyze some features of Kafka to see why it is gaining attention in the market.

The demand for processing huge data sets grows every day across enterprise systems. Data is processed in batch or in real time, and queuing systems play an important role in connecting the data from source systems / producers to destinations / consumers. With huge data sets in transit, enterprises are looking for a messaging solution that provides high throughput per second, scales horizontally, offers high availability and integrates well with other solutions.

    Scalability:

This is one feature where Kafka has an edge over other solutions: the ability to scale horizontally, which Kafka achieves by means of partitioning. We set the number of partitions while defining a topic (queue), and these partitions are distributed across the broker nodes in the cluster. When we want to scale the system, we add more broker nodes and the partitions get rebalanced across them.

    Fault Tolerance and High Availability:

Kafka achieves high availability by means of replication: the partitions are replicated across different broker nodes, and Kafka uses Zookeeper for coordination. When a broker node goes down, Zookeeper coordinates so that the data continues to be served from a replica partition on another broker node, and hence high availability of the data is achieved.

    Unit of Order:

Kafka guarantees ordered delivery at each partition level; messages posted across different partitions are not guaranteed to be in order.

    Reliability & Guaranteed delivery:

Kafka provides reliable message delivery and has options for synchronous and asynchronous acknowledgements; the sketch below shows both modes.
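A minimal sketch with the Java producer client; topic, keys and payloads are placeholders. The acks setting controls durability, calling get() on the returned future makes a send synchronous, and passing a callback keeps it asynchronous. Reusing the same key for related messages also gives the per-partition ordering described above.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    public class AckModes {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("acks", "all"); // "0" = fire and forget, "1" = leader only, "all" = all replicas
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            KafkaProducer<String, String> producer = new KafkaProducer<>(props);

            // Synchronous: block until the broker acknowledges the write.
            RecordMetadata meta = producer.send(
                    new ProducerRecord<>("events", "order-42", "created")).get();
            System.out.println("written to partition " + meta.partition());

            // Asynchronous: continue immediately and handle the ack in a callback.
            producer.send(new ProducerRecord<>("events", "order-42", "paid"),
                    (metadata, exception) -> {
                        if (exception != null) exception.printStackTrace();
                    });
            producer.close();
        }
    }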

    Integration with Big Data solutions:

Kafka comes as part of the Hadoop distributions and integrates with Hadoop MapReduce for parallel bulk consumption; for real-time stream processing needs, Kafka integrates well with systems like Apache Storm and Spark.

Reference: Kafka

    Saturday, August 15, 2015

Build your own monitoring solution for Couchbase

Recently I was trying to build a monitoring solution for Couchbase. I followed a simple approach that worked out well and thought I would share it in this post.

Requirements for the solution:
  • A simple solution that collects metrics from the HTTP stats endpoint of Couchbase
  • A script-based solution that can be customized by the operations team
  • Visualization dashboards
  • No additional software installation on the Couchbase servers

Solution needs:
  • A lightweight app server that collects metrics at regular intervals
  • A persistence layer that stores the data
  • A visualization tool that binds well with the persisted data

Solution Architecture

    Solution Highlights

  • The solution collects the JSON stats data, which contains thousands of metrics as a whole, and stores it in Elasticsearch
  • NodeJS is a lightweight server based on JavaScript
  • The Couchbase stats endpoint exposes JSON-based metrics, and Elasticsearch works well for storing JSON data
  • Kibana provides nice visualizations for Elasticsearch through different charts
  • NodeJS has client libraries for Elasticsearch
  • Hosting NodeJS, Elasticsearch and Kibana is very simple; you can set up all of these components in a few minutes through Docker
  • Elasticsearch is highly scalable
  • The approach can be applied to any monitoring where metrics are exposed in JSON format; a polling sketch follows this list
  • Please find the reference implementation of the solution on GitHub
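The implementation itself uses NodeJS as described above; purely to illustrate the polling loop, here is an equivalent sketch in Java. The host names, bucket, Elasticsearch index and the one-minute interval are assumptions, and authentication is omitted.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class CouchbaseStatsCollector {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            while (true) {
                // Pull the JSON stats for one bucket from the Couchbase REST API.
                String stats = client.send(
                        HttpRequest.newBuilder(URI.create(
                                "http://couchbase-host:8091/pools/default/buckets/default/stats"))
                                .GET().build(),
                        HttpResponse.BodyHandlers.ofString()).body();

                // Push the raw JSON document into an Elasticsearch index;
                // Kibana can then chart any metric inside it.
                client.send(HttpRequest.newBuilder(
                                URI.create("http://localhost:9200/couchbase-stats/stats"))
                                .header("Content-Type", "application/json")
                                .POST(HttpRequest.BodyPublishers.ofString(stats)).build(),
                        HttpResponse.BodyHandlers.ofString());

                Thread.sleep(60_000); // collect once a minute
            }
        }
    }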

    Friday, January 23, 2015

    Two Phase Commit

Bottlenecks in the database layer

The database is the most common performance bottleneck across the different tiers of an application. A few possible reasons restricting RDBMS performance:

  • RDBMS is not able to scale horizontally
  • Locking at the row level / data page level / table level during database transactions

NoSQL to the rescue

I mentioned NoSQL data stores in my previous blog post; they achieve horizontal scalability through distribution. In this post I would like to cover how transaction-like behaviour is achieved with high performance in NoSQL.

    Transactions in RDBMS

Let us try to understand how transactions operate in RDBMS. Transactions with ACID (Atomicity, Consistency, Isolation and Durability) compliance execute all the actions involved in the transaction as a single step: if all the actions succeed the changes are committed, otherwise all the changes are revoked. To achieve this, locking happens across the tables, and hence performance becomes a bottleneck.

Let us take a simple order placement transaction and analyse it. A simple order management transaction involves two tables: order and billing.

  • Confirm the product for the order by decrementing a count in the product catalogue
  • Confirm billing for the payment

If the payment succeeds, the transaction as a whole has to be committed. If the payment fails for some reason, the change to the product catalogue has to be reverted to its original state so that the item is available for others to consume. RDBMS achieves this whole process as a single step by locking these tables until the transaction completes, gaining the ability to commit or revoke everything at the end; but this carries a performance overhead, since while the tables are locked any read / write on them is kept on hold unless stale reads are enabled. A minimal JDBC version of this flow is sketched below.
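A minimal JDBC sketch of that single-step transaction; the connection URL, table and column names are assumptions.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class OrderTransaction {
        public static void placeOrder(String productId, String orderId) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/shop", "app", "secret")) {
                conn.setAutoCommit(false); // start a single atomic transaction
                try (PreparedStatement decrement = conn.prepareStatement(
                             "UPDATE product_catalogue SET count = count - 1 WHERE id = ? AND count > 0");
                     PreparedStatement bill = conn.prepareStatement(
                             "INSERT INTO billing (order_id, status) VALUES (?, 'PAID')")) {
                    decrement.setString(1, productId);
                    if (decrement.executeUpdate() == 0) throw new SQLException("out of stock");
                    bill.setString(1, orderId);
                    bill.executeUpdate();  // payment step; throws on failure
                    conn.commit();         // both changes become visible together
                } catch (SQLException e) {
                    conn.rollback();       // catalogue change is revoked automatically
                    throw e;
                }
            }
        }
    }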

    Restrictions with NoSQL

Let us understand the restrictions in NoSQL towards achieving this type of transaction:

  • NoSQL provides locking at the row level, not across rows or tables
  • With the adoption of polyglot persistence and distributed transactions, we may need to perform a transaction across different data stores as well

Two Phase Commit

Two Phase Commit is an approach followed in NoSQL to achieve transaction-like behaviour. As the name suggests, the transaction happens in two phases, with the ability to commit or revoke the changes made in phase 1 during phase 2. The approach introduces an additional component, a transaction manager, which helps to commit or roll back the changes made in each phase of the transaction. A generic sketch of the two phases follows.
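A generic sketch of the two phases, loosely following the MongoDB two-phase-commit pattern linked at the end of this post; the DocStore interface and the document shapes are hypothetical, for illustration only.

    // Hypothetical document-store API, used only for illustration.
    interface DocStore {
        void insert(String collection, String id, String json);
        void update(String collection, String id, String json);
    }

    public class TwoPhaseCommit {
        private final DocStore store;

        public TwoPhaseCommit(DocStore store) { this.store = store; }

        public void placeOrder(String orderId, String productId) {
            String txnId = "txn-" + orderId;

            // Phase 1: record the pending transaction, then apply tentative
            // changes that each carry a reference to the transaction id.
            store.insert("transactions", txnId, "{\"state\":\"pending\"}");
            store.update("product_catalogue", productId,
                    "{\"count\":-1,\"pendingTxn\":\"" + txnId + "\"}");
            store.update("billing", orderId,
                    "{\"status\":\"charging\",\"pendingTxn\":\"" + txnId + "\"}");

            // Phase 2: every tentative change succeeded, so mark the transaction
            // committed and clear the pendingTxn markers. On failure we would flip
            // the state to "cancelling" and revoke each change instead; a recovery
            // job can retry any transaction stuck in "pending", which is the
            // retry ability listed under the advantages below.
            store.update("transactions", txnId, "{\"state\":\"committed\"}");
            store.update("product_catalogue", productId, "{\"pendingTxn\":null}");
            store.update("billing", orderId, "{\"pendingTxn\":null,\"status\":\"paid\"}");
        }
    }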

Advantages of the Two Phase Commit approach

  • Provides high performance with transactions
  • Ability to retry the failed portions of a transaction (interesting!)
  • Provides distributed-transaction-like capabilities across data stores

My personal experience with Two Phase Commit

Recently I came across a Two Phase Commit scenario handled by Amazon for an order I placed, which became the inspiration for this post.

I placed an order (a laptop desk) on Amazon; my order was received and I went to sleep. It looks like the payment failed for some reason. The next morning I got a notification to retry my payment. Instead of revoking the order, Amazon held it for additional time, say a day or two, and provided an option to retry the failed payment.

    My order confirmation

    Payment retry for my order

I believe Amazon has implemented some form of two phase commit to achieve this. I was personally happy with the way Amazon handled my payment failure: the order was not revoked, I was given a retry option to complete it later, and my laptop desk was still reserved for me.

This also opens the door for other modes of payment, like cash on delivery.

A few links on Two Phase Commit:

  • Starbucks approach to performance
  • MongoDB - how to perform 2PC

Saturday, January 3, 2015

    NoSQL - An Introduction

    NoSQL:

Not Only SQL, often referred to as NoSQL, provides a mechanism to store and retrieve data in formats other than the tabular format of relational databases.

There are different NoSQL solutions that are mature and widely adopted, e.g. Redis, Riak, HBase, Cassandra, Couchbase, MongoDB.

It is critical to understand the concepts of NoSQL and why and how NoSQL is used in a specific application architecture, because every NoSQL solution is unique in its own way and different from general RDBMS solutions.

    Need for NoSQL:

With the explosion of the web and social interactions, the volume and complexity of data has grown tremendously; it is the need of the hour for applications to scale seamlessly without any compromise in performance.

If we look at RDBMS, performance starts degrading at some point of data volume and complexity, and applications have to consider adopting various NoSQL solutions to match that growth.

    Polyglot persistence:

NoSQL solutions have become more mature, and enterprise data architects have started using NoSQL in their solutions, sending a strong message that RDBMS is not the only answer to data needs.

Problems in data persistence are unique, and each problem needs a specific solution to be handled well. The concept of polyglot persistence evolved to insist that an application should use the specific persistence solution that best handles each scenario.

The table below describes some scenarios in a retail web application and how different persistence solutions can satisfy those needs.

Scenario | Persistence solution
User sessions | Redis
Financial data | RDBMS
Shopping cart | Riak
Recommendations | Neo4j
Product catalog | MongoDB
Analytics | Cassandra
User activity logs | Cassandra

    [Source: http://martinfowler.com/bliki/PolyglotPersistence.html]

Coming out of the relational mindset:

One of the biggest problems with adopting a NoSQL solution is getting people out of the relational mindset. Data modeling thinking is deeply rooted in RDBMS and relational concepts.

It will be difficult initially to conceptualize data outside the relational world, but if we understand these concepts and look back at the data solutions we have built, many of them may not have needed normalized modeling. In the NoSQL world:

  • Data is not normalized.
  • Data will be duplicated.
  • Tables are schema-less and don't follow a predefined pattern.
  • Data can be stored in different formats like JSON, XML, audio, video etc.
  • The database may compromise on some of the ACID properties, for example consistency.

CAP theorem:

The CAP theorem defines a set of basic attributes for any distributed system. Understanding its dimensions helps to understand any NoSQL solution better. The diagram below describes the attributes satisfied by different distributed database systems in a multi-server deployment.

The important point to note here is that no distributed system can completely satisfy all three dimensions of the CAP theorem: Consistency, Availability and Partition tolerance.

Any distributed system can completely satisfy at most two dimensions of CAP; depending on the application requirements, one has to choose the specific distributed system that suits those needs.

It is critically important to understand the application requirements and where a specific NoSQL solution falls.

    ACID Compliance:

ACID stands for Atomicity, Consistency, Isolation and Durability; these are the properties that guarantee transactional behavior in RDBMS operations.

RDBMS concepts focus heavily on integrity, concurrency, consistency and data validity, but many data needs in software applications do not require all of this integrity and validity, or can handle it in upper layers.

Compromising on some of these in the database architecture can bring the high performance and scalability that RDBMS currently lacks.

A NoSQL database, for example, is not strictly ACID compliant: it can compromise on one of the ACID attributes to achieve extreme scalability and performance.

It is critically important to understand the application requirements, the specific NoSQL solution used and how the compromise is made.

    BASE versus ACID:

Instead of adhering to ACID compliance, NoSQL tends towards BASE compliance in order to achieve scalability and high performance. The following are the BASE attributes that NoSQL solutions try to adopt:

    • Basic Availability
    • Soft-state
    • Eventual consistency

    NoSQL Categorization based on data modeling

    • Key Value Stores Ex : Redis, Riak, Amazon Simple DB
    • Column Family Stores ( Big Tables ) Ex : Cassandra , HBase
    • Document databases Ex : CouchDB , Couchbase , MongoDB
    • Graph databases Ex : Neo4j, Titan

Each NoSQL category provides unique advantages for specific functionalities; selecting the right NoSQL category is critical to the design of the application.

At a high level, the specific NoSQL solution can be chosen based on the complexity of the data model and the querying associated with it.

    The below diagram provides a good comparison on the different NoSQL databases.

    [Source: https://highlyscalable.wordpress.com/2012/03/01/NoSQL-data-modeling-techniques/]

    NoSQL based on system architecture:

    Based on the system architecture, NoSQL can be categorized into the following.

    • P2P ( Ring Topology )
    • Master Slave

    Each architecture has some pros and cons and a decision has to be made based on the needs.

Aspect | P2P (Ring Topology) | Master-Slave
Role | All nodes carry an equal role | Master-slave architecture with specific responsibilities on specific nodes
Consistency | Eventual | Strong
Write/Read | Reads and writes happen through all nodes | Writes are mostly driven through restricted nodes
Availability | High availability | Availability is somewhat compromised when the master / write node fails
Data | Data is partitioned across all nodes with replication | Data is partitioned into multiple slave nodes with replication
Examples | Cassandra, Couchbase | HBase, MongoDB

Data reads / writes:

The need for NoSQL-type solutions arises when you operate on huge volumes of data with high performance requirements for reads and writes.

Below are typical use cases where NoSQL databases are used:

    • Scalable databases
    • High availability and fault tolerance
    • Ever growing set of data
    • Bulk read / write operations

Some NoSQL solutions are good for write-intensive workloads, some for read-intensive workloads and some for mixed workloads; specific analysis has to be done to choose a NoSQL solution based on the needs.

Other important concepts I would like to highlight that apply to any NoSQL solution:

Sharding:

Sharding is one of the important concepts in NoSQL solutions, by which data is partitioned horizontally across the different nodes in the cluster. The data is split based on some logic, say a hash code, and spread across the nodes, as sketched below.
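A tiny sketch of hash-based partitioning; real systems typically use consistent hashing so that adding a node moves less data, but the routing idea is the same.

    import java.util.Arrays;
    import java.util.List;

    public class HashSharding {
        private final List<String> nodes;

        public HashSharding(List<String> nodes) { this.nodes = nodes; }

        // Route a key to a node by hashing it modulo the cluster size.
        public String nodeFor(String key) {
            return nodes.get(Math.abs(key.hashCode() % nodes.size()));
        }

        public static void main(String[] args) {
            HashSharding cluster = new HashSharding(Arrays.asList("node-a", "node-b", "node-c"));
            System.out.println(cluster.nodeFor("user:1001")); // the same key always lands on the same node
        }
    }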

    Replication:

The data is not only partitioned across nodes but also replicated across different cluster nodes. The replication factor is a configuration setting in the solution. Replication provides high availability and automatic failover when a specific node goes down; a small extension of the sharding sketch above illustrates the idea below.
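Extending the sharding sketch, replication can be illustrated by also placing copies of each key on the next nodes in line; this is a simplification in the spirit of ring-topology stores like Cassandra.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class ReplicatedSharding {
        // Return the primary node plus (replicationFactor - 1) follower nodes.
        static List<String> nodesFor(String key, List<String> nodes, int replicationFactor) {
            int primary = Math.abs(key.hashCode() % nodes.size());
            List<String> replicas = new ArrayList<>();
            for (int i = 0; i < replicationFactor; i++) {
                replicas.add(nodes.get((primary + i) % nodes.size()));
            }
            return replicas;
        }

        public static void main(String[] args) {
            List<String> cluster = Arrays.asList("node-a", "node-b", "node-c", "node-d");
            // with three copies the key survives the failure of any single node
            System.out.println(nodesFor("user:1001", cluster, 3));
        }
    }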

    Reference:

    http://martinfowler.com/

    http://highscalability.com

    http://nosqlguide.com/