Riak Tutorials

Riak is an open source, distributed NoSQL key-value database.

What is Riak?

  • Open source
  • Highly scalable
  • Distributed
  • Highly available
  • Eventually or strongly consistent
  • Fault tolerant
  • Simple to operate
  • Schema free
  • Low latency

Where to use Riak?

  • Huge volumes of data
  • Low latency requirements
  • High-velocity reads and writes
  • Applications that need to be highly available and consistent
  • Explosive growth expected in the future

Where not to use Riak?

  • If you have a lot of transactional data
  • Riak's data model treats keys and values as atomic elements, which means the data must be denormalized. If your data cannot be managed effectively as keys and values, Riak is not a good fit
  • Riak recommends at least 5 servers in a cluster, so it is not a good fit for a small database

How does a Riak Cluster work?

In NoSQL terms the cluster is the database: a group of nodes that work together as a single data store.

Riak Node:

Every node in a Riak cluster is equivalent and contains a copy of the whole Riak package.

Riak automatically redistributes data when machines are added or removed, and the rebalancing happens without impacting the application. Nodes use a gossip protocol to share cluster membership and ring state, so any node can route a request to the nodes that hold the data.

Consistent Hashing:

Riak uses consistent hashing to distribute data evenly across the nodes in the cluster. Each key is hashed to a position on a ring that is divided into partitions, and every node is responsible for a share of those partitions.
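
As a rough illustration of the idea, the Python sketch below hashes a bucket/key pair onto a SHA-1 ring and maps it to a partition and a node. The ring size of 64 partitions, the node names and the round-robin ownership rule are assumptions made for this example. Real Riak hashes an internal encoding of the bucket and key, so the partition numbers here will not match a live cluster.

```python
import hashlib
from collections import Counter

RING_SIZE = 2 ** 160      # size of the SHA-1 output space
NUM_PARTITIONS = 64       # assumed number of partitions for this sketch
NODES = ["node1", "node2", "node3", "node4", "node5"]  # hypothetical node names


def partition_for(bucket, key):
    """Hash bucket/key onto the ring and return the partition index it falls in."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    return position * NUM_PARTITIONS // RING_SIZE


def owner_of(partition):
    """Assign partitions to nodes round-robin (a simplification)."""
    return NODES[partition % len(NODES)]


def key_distribution(num_keys):
    """Count how many of num_keys sample keys land on each node."""
    return Counter(
        owner_of(partition_for("users", f"user{i}")) for i in range(num_keys)
    )


if __name__ == "__main__":
    for node, count in sorted(key_distribution(10000).items()):
        print(node, count)
```

Running the script prints one line per node with the number of sample keys it owns, which gives a feel for how evenly the hash spreads the keys.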

Intelligent Replication:

Riak's replication mechanism lets you read, write and update data even when a node or its hardware fails or the network partitions. The replication factor is set with the n_val property, which defines how many copies of each object Riak stores.
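
To make n_val concrete, here is a minimal Python sketch of how replicas could be placed: the object goes to the partition its key hashes to and to the next partitions on the ring, until n_val copies exist. The function name preference_list, the 64-partition ring and n_val=3 are illustrative assumptions, and the sketch ignores details such as fallback nodes.

```python
def preference_list(first_partition, num_partitions=64, n_val=3):
    """Return the n_val consecutive partitions that hold copies of an object.

    first_partition is the partition the key hashes to. The ring wraps
    around, so the partition after the last one is partition 0.
    """
    return [(first_partition + i) % num_partitions for i in range(n_val)]


if __name__ == "__main__":
    print(preference_list(10))             # three consecutive partitions
    print(preference_list(62))             # wraps around the end of the ring
    print(preference_list(10, n_val=5))    # a higher replication factor
```

A larger n_val stores more copies, which improves fault tolerance at the cost of more storage and more work per write.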

Hinted Handoff:

Hinted handoff is how Riak handles node failures. If a node goes down, a neighboring node takes over its storage operations. When the failed node returns, the updates it missed are handed back to it. This happens automatically and minimizes the impact of the failure.
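
The flow can be sketched as a toy simulation in Python. The Node class and the write and handoff functions are hypothetical names invented for this example and are not part of any Riak API. The sketch only shows the idea of parking writes on a fallback node and returning them later.

```python
class Node:
    """A toy storage node."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}    # key -> value for objects this node owns
        self.hints = {}   # owner name -> {key: value} held on the owner's behalf


def write(owner, fallback, key, value):
    """Write to the owner if it is up, otherwise park the write on the fallback."""
    if owner.up:
        owner.data[key] = value
    else:
        fallback.hints.setdefault(owner.name, {})[key] = value


def handoff(fallback, owner):
    """Once the owner is back up, hand the parked writes over and drop the hints."""
    if owner.up:
        owner.data.update(fallback.hints.pop(owner.name, {}))


if __name__ == "__main__":
    a, b = Node("a"), Node("b")
    a.up = False                      # node a fails
    write(a, b, "cart:42", "3 items") # b accepts the write with a hint for a
    a.up = True                       # node a returns
    handoff(b, a)
    print(a.data, b.hints)
```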

Version Conflicts:

When a client makes a read request, Riak looks up the replicas and uses their vector clocks to return the most recently updated version. If the versions conflict, clients can also resolve the conflict manually.

Riak also provides convergent replicated data types (CRDTs) that resolve merge conflicts automatically.
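
As an illustration of how a convergent data type avoids conflicts, the Python sketch below implements a grow-only counter, one of the simplest CRDTs. It is a teaching example and not Riak's implementation. The function names and the node names n1 and n2 are assumptions. Each node counts only its own increments, and merging takes the per-node maximum, so replicas converge to the same value regardless of merge order.

```python
def increment(counter, actor, amount=1):
    """Return a copy of the counter with actor's own count increased."""
    updated = dict(counter)
    updated[actor] = updated.get(actor, 0) + amount
    return updated


def merge(a, b):
    """Merge two replicas by taking the highest count seen for every actor."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0)) for actor in a.keys() | b.keys()}


def value(counter):
    """The counter's value is the sum of all per-actor counts."""
    return sum(counter.values())


if __name__ == "__main__":
    replica_1 = increment({}, "n1", 2)   # two increments seen by node n1
    replica_2 = increment({}, "n2", 3)   # three increments seen by node n2
    merged = merge(replica_1, replica_2)
    print(value(merged))
```

Because merge is commutative and idempotent, replicas can exchange state in any order, and as often as needed, without losing or double counting updates.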

Vector Clock:

Vector clocks are metadata attached to each replica when it is created. They are extended each time the replica is updated, which lets Riak keep track of versions.
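
A minimal Python sketch of the idea follows, assuming a vector clock is a plain mapping from an actor name to an update count. The function names and the actors x, y and z are hypothetical. Riak's actual clock encoding differs. One clock is newer than another if it has seen every update the other has, and if neither has seen all of the other's updates, the versions conflict.

```python
def increment(clock, actor):
    """Return a copy of the clock with actor's entry advanced by one update."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every update recorded in clock b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a, b):
    """Classify two versions by their vector clocks."""
    if a == b:
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "conflict"


if __name__ == "__main__":
    v1 = increment({}, "x")    # first write, by actor x
    v2 = increment(v1, "y")    # y updates the version it read from x
    v3 = increment(v1, "z")    # z updates the same version concurrently
    print(compare(v2, v1))     # v2 builds on v1
    print(compare(v2, v3))     # neither has seen the other's update
```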

Read Repair:

When a read request finds an outdated replica, Riak automatically updates it with the latest version so that it is in sync with the other replicas. Read repair also fixes a replica that returns not_found, for example after a physical failure.
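
The behavior can be sketched in a few lines of Python. To keep the example short it makes two simplifying assumptions: each replica is a dictionary, and versions are plain integers that stand in for vector clocks. The function name read_with_repair is hypothetical.

```python
def read_with_repair(replicas, key):
    """Read key from every replica, return the newest value and repair the rest.

    replicas is a list of dicts mapping key -> (version, value). A replica
    that lacks the key plays the role of a not_found response.
    """
    found = [replica[key] for replica in replicas if key in replica]
    if not found:
        return None
    latest = max(found, key=lambda entry: entry[0])
    for replica in replicas:
        if replica.get(key) != latest:
            replica[key] = latest   # overwrite stale or missing copies
    return latest[1]


if __name__ == "__main__":
    replicas = [
        {"user:1": (2, "new address")},   # up to date
        {"user:1": (1, "old address")},   # outdated
        {},                               # not_found
    ]
    print(read_with_repair(replicas, "user:1"))
    print(replicas)
```

After the read, all three replicas hold the same version, so later reads no longer see the stale or missing copies.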

© 2015, www.techkatak.com. All rights reserved.