Storing data without analyzing it to gain meaningful insights would be a waste of resources. Before we look at testing big data, it is useful to understand how it is used in the real world. E-commerce: Amazon, Flipkart and other e-commerce sites have millions of visitors each day with … Continue reading Examples And Usage Of Big Data
Month: March 2019
Advantages and Disadvantages Of Using Big Data / Hadoop
Advantages Of Using Big Data / Hadoop 1. Scalable: Big data applications can handle large volumes of data, on the order of petabytes or more. Hadoop can easily scale from one node to thousands of nodes based on the processing requirements and data. 2. Reliable: Big data … Continue reading Advantages and Disadvantages Of Using Big Data / Hadoop
Hadoop: Data Replication
HDFS is designed to reliably store very large files across machines in a large cluster. It stores each file as a sequence of blocks; all blocks in a file except the last block are the same size. The blocks of a file are replicated for fault tolerance. The block size and replication factor are configurable … Continue reading Hadoop: Data Replication
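As a rough illustration of the block layout described above, the following Python sketch splits a file into fixed-size blocks and estimates the raw storage its replicas consume. It does not call any Hadoop API. The function names are hypothetical, and the 128 MiB block size and replication factor of 3 are assumed values chosen for the example, since both settings are configurable.

```python
MIB = 1024 * 1024


def split_into_blocks(file_size, block_size):
    """Return the sizes (in bytes) of the blocks a file would occupy.

    Every block except possibly the last has exactly block_size bytes.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)  # the last block may be smaller
    return sizes


def raw_storage(file_size, replication_factor):
    """Return the total bytes stored when each block is kept replication_factor times."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be >= 1")
    return file_size * replication_factor


# Assumed example values, not taken from the article.
BLOCK_SIZE = 128 * MIB
REPLICATION_FACTOR = 3
FILE_SIZE = 300 * MIB

blocks = split_into_blocks(FILE_SIZE, BLOCK_SIZE)
total = raw_storage(FILE_SIZE, REPLICATION_FACTOR)
print([size // MIB for size in blocks], total // MIB)
```

With these assumed values, a 300 MiB file occupies two full 128 MiB blocks plus a 44 MiB final block, and three replicas of each block take 900 MiB of raw disk space across the cluster.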
5 ways to understand Big Data
If you’re a Big Data enthusiast, by now you should understand that Big Data is not about “More Data”. Here are 5 ways to understand Big Data. 1. The Original Big Data: By original, we don’t mean “correct” or “authentic”. We mean the first definition, coined 12 years ago by Doug Laney. … Continue reading 5 ways to understand Big Data
What is HDFS? An Introduction to HDFS
Hadoop is a critical big data framework that has now been implemented in thousands of organisations. Hadoop makes big data analytics easier, which matters because a large number of organisations today use data analytics to generate insights into how they can function better. HDFS or Hadoop Distributed File System … Continue reading What is HDFS? An Introduction to HDFS
Hadoop High Availability – HDFS Feature
1. Overview: In this Hadoop tutorial, we will discuss the Hadoop High Availability feature. The tutorial covers an introduction to Hadoop High Availability, how high availability is achieved in Hadoop, the issues in legacy systems, and examples of High Availability in Hadoop. 2. Hadoop HDFS High Availability – Introduction: Hadoop High Availability HDFS … Continue reading Hadoop High Availability – HDFS Feature
Hadoop Clusters
A Hadoop cluster can be defined as a special type of computational cluster designed to store and analyse huge amounts of unstructured data in a distributed computing environment. Clusters like this run Hadoop’s open source distributed processing software on low-cost computers, commodity computers to be … Continue reading Hadoop Clusters