Big Data refers to extremely large data sets consisting of both structured and unstructured data. These are high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. The amount of data being generated has grown enormously in recent years, and this growth is what gave rise to Big Data.
The data in an organization these days is growing beyond MBs, GBs, or even TBs. The question is what happens when the data grows beyond what is present today. The future may see data growing to the scale of a Petabyte (1024 TB), Exabyte (1024 PB), Zettabyte (1024 EB) or Yottabyte (1024 ZB). Some data facts: the New York Stock Exchange generates about 1 Terabyte of data per day, Facebook hosts approximately 10 billion photos, making up about one Petabyte of data, and Twitter generates 8 TB of data.
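To make the unit ladder above concrete, here is a minimal Python sketch of the arithmetic, assuming each unit is 1024 times the previous one as stated. The names `UNITS`, `to_bytes`, `humanize` and `days_to_reach` are hypothetical helpers written for this illustration and do not come from any library.

```python
# Binary unit ladder: each step is 1024 times the previous one.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def to_bytes(value, unit):
    """Convert a value expressed in `unit` into bytes."""
    return value * 1024 ** UNITS.index(unit)


def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the number below 1024."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value fits, or at the last unit.
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1024


def days_to_reach(daily_bytes, target_bytes):
    """Whole days needed to accumulate `target_bytes` at a constant daily rate."""
    return -(-target_bytes // daily_bytes)  # ceiling division on integers


if __name__ == "__main__":
    print(humanize(to_bytes(1024, "TB")))
    print(days_to_reach(to_bytes(1, "TB"), to_bytes(1, "PB")))
```

At the quoted rate of about 1 TB per day, a source like the stock exchange would need 1024 days to accumulate a single Petabyte, which gives a feel for how large the higher units are.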
To most, Big Data is not just about the size of the data, but about the 4 Vs of data. While volume is one aspect of Big Data, it also includes the variety of the data, the velocity at which the data comes into the organization, and the value of the data to the organization. Value is very important for any organization. Hence Big Data is summarized as the 4 Vs: Volume, Velocity, Variety, Value.
Processing Big Data requires a platform that can organize the data and process it properly. Hadoop, which has its origins in Apache Nutch, is a platform that helps in organizing and processing Big Data.
At RailsCarma we have been successfully using Hadoop to organize large volumes of data for our clients, and we will be sharing our experiences and learnings in upcoming blog posts. So stay tuned.
Manasa Heggere
Senior Ruby on Rails Developer