Big data is a collection of data that is huge in volume and still growing exponentially over time. Its size and complexity are so great that traditional data management tools cannot store or process it efficiently.
Big data is a combination of structured, semi-structured and unstructured data collected by organizations that can be mined for information and used in machine learning projects, predictive modeling and other advanced analytics applications.
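To make the three categories concrete, here is a minimal Python sketch, assuming tiny made-up samples: a CSV snippet stands in for structured data, a JSON record for semi-structured data, and a free-text sentence for unstructured data. All sample values and variable names are hypothetical and chosen only for illustration; real big data systems handle far larger inputs with dedicated tooling.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (hypothetical sample).
structured_csv = "order_id,amount\n1001,25.50\n1002,13.00\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: self-describing keys, but fields may differ between records.
semi_structured = json.loads('{"user": "ana", "tags": ["new", "mobile"]}')

# Unstructured: free text with no predefined schema; here we only count words.
unstructured = "Great product, arrived late but support was helpful."
word_count = len(unstructured.split())

# The structured rows can be aggregated directly because the schema is known.
total_amount = sum(float(row["amount"]) for row in rows)

print(total_amount, semi_structured["tags"], word_count)
```

The structured rows can be summed right away because every row shares the same columns, while the free text needs extra processing (here, a simple word split) before anything can be measured.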
Big data is commonly defined as data that contains greater variety, arrives in increasing volumes and moves with more velocity. These characteristics are known as the three Vs:
- the large volume of data in many environments;
- the wide variety of data types frequently stored in big data systems; and
- the velocity at which much of the data is generated, collected and processed.
These characteristics were first identified in 2001 by Doug Laney, then an analyst at consulting firm Meta Group Inc.; Gartner further popularized them after it acquired Meta Group in 2005. More recently, several other Vs have been added to different descriptions of big data, including veracity, value and variability.