July 25, 2018

Analytics - Hadoop L1


Question.
Which of the following are true about Hadoop?
Open Source
Distributed Processing Framework
Distributed Storage Framework
All of these

Answer: All of these

Question.
Which of the following are false about Hadoop?
Hadoop works in Master-Slave fashion
Master & Slave both are worker nodes
The user submits work to the master, which distributes it to the slaves
Slaves are the actual worker nodes

Answer: Master & Slave both are worker nodes

Question.
What is metadata in Hadoop?
Data stored by user
Information about the data stored in datanodes
User information
None of these

Answer: Information about the data stored in datanodes

Question.
What is a Daemon?
Process or service that runs in background
Applications submitted by user
Web application running on web server
None of these

Answer: Process or service that runs in background

Question.
All of the following accurately describe Hadoop EXCEPT?
a. Batch processing
b. Open-source
c. Distributed computing
d. Real-time

Answer: Real-time

Question.
All of the following are core components of Hadoop EXCEPT?
a. Hive
b. HDFS
c. MapReduce
d. YARN

Answer: Hive

Question.
Hadoop is a framework that uses a variety of related tools. Common tools included in a typical implementation include:
a. MapReduce, HDFS, Spool
b. MapReduce, MySQL, Google Apps
c. Cloudera, HortonWorks, MapR
d. MapReduce, Hive, HBase

Answer: MapReduce, Hive, HBase

Question.
Which of the following can be used to create workflows when multiple MapReduce and Pig programs need to be executed?
a. Sqoop
b. Zookeeper
c. Oozie
d. HBase

Answer: Oozie

Question.
Which of the following can be used to transfer bulk data between Hadoop and structured databases?
a. Sqoop
b. Hive
c. Pig
d. Spark

Answer: Sqoop

Question.
How many single points of failure does a High Availability HDFS architecture have?
a. 0
b. 1
c. 2
d. 3

Answer: 0

Question.
If a 300MB file needs to be stored in HDFS (block size = 64MB, replication factor = 2), how many blocks, counting replicas, are created for this file in HDFS?
a. 10
b. 11
c. 12
d. 15

Answer: 10
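
The arithmetic behind this answer can be checked with a minimal sketch. The function name hdfs_block_counts is hypothetical and exists only for illustration; it assumes the last block may be partially filled and that every block is replicated the same number of times.

```python
import math

def hdfs_block_counts(file_size_mb, block_size_mb, replication):
    """Return (logical_blocks, total_block_replicas) for one file."""
    # A file is split into fixed-size blocks; the last one may be partial.
    logical_blocks = math.ceil(file_size_mb / block_size_mb)
    # Each logical block is stored `replication` times across data nodes.
    return logical_blocks, logical_blocks * replication

logical, total = hdfs_block_counts(300, 64, 2)
print(logical, total)  # 5 10
```

The file splits into 5 blocks (four of 64MB and one of 44MB), and replication factor 2 gives 10 stored blocks in total.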

Question.
What is not a default value for a data block size in the HDFS?
a. 64MB
b. 128MB
c. 512MB
d. 256MB

Answer: 512MB

Question.
Which of the following architectures best describes the HDFS architecture?
a. High Availability
b. Master-Slave
c. Connected
d. Peer

Answer: Master-Slave

Question.
Which of the following is a master process in the HDFS architecture?
a. Datanode
b. JobTracker
c. Namenode
d. Secondary Namenode

Answer: Namenode

Question.
Which of the following is true about Hadoop?

Before storing data we need to specify the schema
We will lose data if one data node crashes
We can add n nodes to the cluster on the fly (n ~ 15000)
Data is first processed on the master, then on the slaves

Answer: We can add n nodes to the cluster on the fly (n ~ 15000)

Question.
Choose the correct statement?

Master assigns work to all the slaves
We cannot edit data once written in Hadoop
The client needs to interact with the master first, as it is the single place where all the metadata is available
All of these

Answer: All of these

Question.
Which of the following is the essential module of HDFS?
Node Manager
Resource Manager
DataNode
All of the above

Answer: DataNode

Question.
Which of the below is NOT a kind of metadata in NameNode?

Block locations of files
List of files
File access control information
No. of file records

Answer: No. of file records

Question.
Which statement is true about DataNode?

It is the actual worker node that saves and stores metadata.
It is the slave node that saves and stores metadata.
It is the Master node that saves and stores actual data.
It is the slave node that saves and stores actual data.


Answer: It is the slave node that saves and stores actual data.

Question.
Is the Secondary NameNode the backup node?
TRUE
FALSE

Answer: FALSE

Question.
Which of the following is a programming model designed for processing large volumes of data in parallel by dividing the work into a set of independent tasks?

MapReduce
Hive
Pig
HDFS

Answer: MapReduce

Question.
The mapper's sorted output is input to the:
Reducer
Mapper
Shuffle
All of the mentioned

Answer: Reducer


Question.
Which of the following generates intermediate key-value pairs?
Reducer
Mapper
Combiner
Partitioner

Answer: Mapper
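
The flow covered by the last two questions (the mapper emits intermediate key-value pairs, which are sorted and then fed to the reducer) can be sketched in plain Python. This is a single-process simulation under simplifying assumptions, not Hadoop's API; the names mapper, reducer, and run_job are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    # Emit one intermediate (word, 1) pair per word in the input line.
    for word in line.split():
        yield (word.lower(), 1)

def reducer(key, values):
    # Receive one key with all of its values and aggregate them.
    return (key, sum(values))

def run_job(lines):
    intermediate = [pair for line in lines for pair in mapper(line)]
    # Sorting by key stands in for the shuffle-and-sort phase.
    intermediate.sort(key=itemgetter(0))
    return [
        reducer(key, (value for _, value in group))
        for key, group in groupby(intermediate, key=itemgetter(0))
    ]

result = run_job(["big data", "big cluster"])
print(result)  # [('big', 2), ('cluster', 1), ('data', 1)]
```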

Question.
What is the major advantage of storing data in 128MB blocks?
It saves disk seek time
It saves disk processing time
It saves disk access time
It saves disk latency time

Answer: It saves disk seek time

Question.
The role of the Partitioner in a MapReduce job is:

a) To partition input data into equal parts
b) Distribute data among available reducers
c) To partition data and send to each mapper
d) Distribute data among available mappers

Answer: Distribute data among available reducers
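
A hash-based partitioner can be sketched as follows. Hadoop's default HashPartitioner takes the key's hash modulo the number of reduce tasks; this sketch uses zlib.crc32 as a stand-in for Java's hashCode(), which is an assumption made only to keep the example deterministic in Python. The function name partition and the sample pairs are hypothetical.

```python
import zlib

def partition(key, num_reducers):
    # crc32 is a stand-in for the key's hashCode(); it is an assumption
    # of this sketch, not what Hadoop itself uses.
    return zlib.crc32(key.encode("utf-8")) % num_reducers

num_reducers = 3
pairs = [("big", 1), ("data", 1), ("big", 1), ("cluster", 1)]

# Route every intermediate pair to the bucket of its reducer.
buckets = {r: [] for r in range(num_reducers)}
for key, value in pairs:
    buckets[partition(key, num_reducers)].append((key, value))

for reducer_id, assigned in buckets.items():
    print(reducer_id, assigned)
```

Because the reducer index depends only on the key, every pair with the same key lands on the same reducer.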

Question.
Which of the following is a single point of failure (in a cluster without High Availability)?
NameNode
Secondary NameNode
DataNode
None of above

Answer: NameNode

Question.
Apache HBase is

a) Column family oriented NoSQL database
b) Relational Database
c) Document oriented NoSQL database
d) Not part of Hadoop eco system

Answer: Column family oriented NoSQL database

Question.
Which of the following is a Table Type in Hive ?

a) Managed Table
b) Local Table
c) Persistent Table
d) Memory Table

Answer: Managed Table

Question.
Which of the following is a daemon process in Hadoop?

a) NameNode
b) JobNode
c) taskNode
d) mapreducer

Answer: NameNode

Question.
Information about locations of the blocks of a file is stored at ________

a) data nodes
b) name node
c) secondary name node
d) job tracker

Answer: name node

Question.
Apache Sqoop is used to

a) Move data from local file system to HDFS
b) Move data from streaming sources to HDFS
c) Move data from RDBMS to HDFS
d) Move data between Hadoop Clusters

Answer: Move data from RDBMS to HDFS

Question.
In a MapReduce program, the role of the combiner is

a) To combine output from multiple map tasks
b) To combine output from multiple reduce tasks
c) To merge data and create a single output file
d) To aggregate the output of each map task

Answer: To aggregate the output of each map task
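
The combiner's local aggregation can be sketched as a small function applied to the output of one map task before anything is sent over the network. This is a plain-Python illustration under that assumption, not Hadoop code; the name combine and the sample data are hypothetical.

```python
from collections import Counter

def combine(map_output):
    """Locally aggregate the (key, value) pairs of a single map task."""
    totals = Counter()
    for key, value in map_output:
        totals[key] += value
    # Fewer pairs leave the map side, which reduces shuffle traffic.
    return sorted(totals.items())

map_task_output = [("big", 1), ("data", 1), ("big", 1)]
print(combine(map_task_output))  # [('big', 2), ('data', 1)]
```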

Question.
Hive External tables store data in

a) default Hive warehouse location in HDFS
b) default Hive warehouse location in Local file system
c) a custom location in HDFS
d) a custom location in local file system

Answer: a custom location in HDFS

Question.
MapReduce programming model is ________

a) Platform dependent but not language-specific
b) Neither platform- nor language-specific
c) Platform independent but language-specific
d) Platform dependent and language-specific

Answer: Neither platform- nor language-specific

Question.
Hive generates results using

a) DAG of Map Reduce Jobs
b) sequential processing of files
c) MySQL query engine
d) List processing

Answer: DAG of Map Reduce Jobs

Question.
Clients access the blocks directly from ________ for read and write

a) data nodes
b) name node
c) secondary name node
d) primary node

Answer: data nodes

Question.
In Apache Pig, a Data Bag stores

a) Set of columns
b) Set of columns with the same data type
c) Set of columns with different data types
d) Set of tuples

Answer: Set of tuples

Question.
You can execute a Pig Script in local mode using the following command

a) pig -mode local
b) pig -x local
c) pig -run local
d) pig -f

Answer: pig -x local

Question.
Default block size in HDFS is ____________

a) 128 KB
b) 64 KB
c) 32 MB
d) 128 MB

Answer: 128 MB

Question.
Apache Flume is used to

a) Move data from RDBMS to HDFS
b) Move data from HDFS to RDBMS
c) Move data from One HDFS Cluster to another
d) Move data from Streaming source to HDFS

Answer: Move data from Streaming source to HDFS

Question.
Default data field delimiter used by Hive is

a) Ctrl-a character
b) Tab
c) Ctrl-b character
d) Space

Answer: Ctrl-a character

Question.
What are the characteristics of Big Data?

a) volume, quality, variety
b) volume, velocity, variety
c) volume, quality, quantity
d) quantity and quality only

Answer: volume, velocity, variety

Question.
Which is optional in a MapReduce program?

a) Mapper
b) Reducer
c) both are optional
d) both are mandatory

Answer: Reducer

Question.
In Hive tables, each partition's data is stored as?

a) files in separate folders
b) multiple files in same folder
c) a single file
d) multiple xml files

Answer: files in separate folders
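
The folder layout can be sketched by building the directory path for one partition. Hive names each partition folder column=value under the table's location; the table name sales and the partition columns year and month are hypothetical, and /user/hive/warehouse is assumed to be the warehouse location.

```python
from pathlib import PurePosixPath

def partition_dir(table_location, partition_spec):
    """Build the folder path that holds the files of one partition."""
    path = PurePosixPath(table_location)
    # Each partition column adds one nested "column=value" folder.
    for column, value in partition_spec:
        path = path / f"{column}={value}"
    return str(path)

print(partition_dir("/user/hive/warehouse/sales",
                    [("year", "2018"), ("month", "07")]))
# /user/hive/warehouse/sales/year=2018/month=07
```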

Question.
What is the default storage class in Pig called?

a) TextStorage
b) DefaultStorage
c) PigStorage
d) BinaryStorage

Answer: PigStorage