August 01, 2018

Object Oriented Analysis and Design Importance - Comprehensive


Text Book(s)
T1 Larman, C., Applying UML and Patterns, Pearson Education, 3rd Ed., 2005.
T2 Erich Gamma et al., Design Patterns: Elements of Reusable Object-Oriented Software, 1994

https://www.waseian.com/2018/08/object-oriented-analysis-and-design.html

Question 3)(i)
Online Mobile Recharge
Problem Statement:
   The case study 'Online Mobile Recharge' provides information about all the mobile service providers. The application offers complete details of any mobile service provider in terms of its plans, options, benefits, etc. Suppose an Airtel customer wants information on all the schemes and services provided by the company: he/she can look it up and, according to suitability, recharge the mobile from the same application. The major benefit of the proposed system is that the recharge facility of any service provider is available under the same roof.
End users
Service Provider:
  The Service Provider is the mobile service provider: all the companies that provide mobile connections come under this module. The function of this module is to perform the mobile recharge for the provider's own customers, based on the availability of balance in the admin account. A request comes from the user, is verified by the admin for availability of balance, and is then forwarded to the service provider to make the mobile recharge.
Third-party System Administrator:
The administrator monitors all users and user transactions. The admin also monitors all the Service Providers, all the user accounts, the amounts paid by users, and the amounts paid to Service Providers. When a user submits a request, the admin checks the available balance in the user's account; the request is then forwarded to the Service Provider, where it is processed. The admin has complete information about the users, as well as about the plans and the different recharge coupons offered by the Service Providers. All the data is maintained at the admin level. The admin has the privilege to restrict any user.
User:
  There are 2 categories in the user Module:
Registered User
Visitor
  Any person who wants to use the services of Online Mobile Recharge at any time and from anywhere should register in this application. After registering, the user can recharge the mobile at any time and from anywhere. A visitor (guest) is someone who visits the Online Mobile Recharge application, can view the complete information about the Service Providers, and can make a mobile recharge by entering bank details or credit card details.
Draw a Collaboration Diagram for the above case study, clearly showing the classes, the numbering of messages, and the relationships.
Answer)

IMPORTANT NOTE: In the question, only Collaboration Diagram is asked. Other diagrams are for reference.
Use-Case Diagram
Actors vs Use Cases:
User
•Register.
•Recharge.
•Select Payment Gateway.
•Select Service Provider.
•Make payment.

Third Party Administrator
•Forward User request to Service Provider.
•Track Complaints.

Third Party Server/ Database
•Authenticate the Registered users.
•Maintain the Log.

Service Provider
•Recharge the user's account as requested, either directly or through the third party system.
•Provide various plans to the user.

Online Mobile Recharge UML Use case Diagram:


Sequence Diagram
Sequence Diagram for a user to recharge his account through the third-party site:

Collaboration Diagram
Collaboration diagram for a user to recharge his account through the third-party site:

Class Diagram
Classes Identified:
  • User: Registered, Visitor
  • Third Party System Administrator
  • Third Party Server/ Database
  • Service Provider
  • Direct or Non- Third Party User (Direct access through Service Provider Site)

Activity Diagram
Activities:
  • User login and authentication for the Registered user.
  • Forward the request to service provider if logged in as an Administrator.
  • Enter service provider site for a direct user.
  • Enter recharge amount.
  • Select Payment Gateway.
  • Login and authenticate Bank Account.
  • Make payment.
  • Check whether the recharge was processed successfully.
Question3.(ii) 
Draw a UML Activity diagram clearly showing the interaction of Customer, Customer Browser, Google App, Google ACS Service, Identity provider for the Google Apps described below:
 Single Sign-On (SSO) for Google Apps
Purpose:
  An example of a UML activity diagram which describes Single Sign-On (SSO) to Google Apps for customers using some hosted Google application, such as Gmail.
Summary:

When a user tries to use a hosted Google application, such as Gmail, Google generates a SAML authentication request and sends a redirect back to the user's browser. The redirect points to the specific identity provider. The SAML authentication request contains the encoded URL of the Google application that the user is trying to reach.
  Google acts as the service provider, offering services such as Gmail or Start Pages. Partner companies act as identity providers and control the user names, passwords, and other information used to identify, authenticate, and authorize users for Google's web applications. Each partner also provides Google with its SSO URL as well as the public key that Google uses to verify SAML responses.
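The redirect step described above can be sketched in Python. This is a minimal illustration, assuming the SAML 2.0 HTTP-Redirect binding (raw DEFLATE, then Base64, then URL encoding); the summary does not say which binding is used. The identity provider URL, the request XML, and the function names are hypothetical placeholders, and signing is left out entirely.

```python
import base64
import zlib
from urllib.parse import parse_qs, urlencode, urlparse


def build_redirect_url(idp_sso_url, saml_request_xml, relay_state):
    """Build the URL the browser is redirected to (hypothetical helper)."""
    # Raw DEFLATE (no zlib header), which wbits=-15 selects.
    compressor = zlib.compressobj(wbits=-15)
    deflated = compressor.compress(saml_request_xml.encode("utf-8"))
    deflated += compressor.flush()
    query = urlencode({
        "SAMLRequest": base64.b64encode(deflated).decode("ascii"),
        # RelayState carries the URL of the application the user wanted.
        "RelayState": relay_state,
    })
    return f"{idp_sso_url}?{query}"


def decode_saml_request(url):
    """Reverse the encoding, as the identity provider would on receipt."""
    params = parse_qs(urlparse(url).query)
    deflated = base64.b64decode(params["SAMLRequest"][0])
    return zlib.decompress(deflated, wbits=-15).decode("utf-8")


if __name__ == "__main__":
    # All values below are made up for illustration.
    url = build_redirect_url(
        "https://idp.example.com/sso",
        "<samlp:AuthnRequest ID='_1'/>",
        "https://mail.google.com/a/example.com",
    )
    print(url)
    print(decode_saml_request(url))
```

In an activity diagram, `build_redirect_url` would sit in the service provider's swimlane and `decode_saml_request` in the identity provider's swimlane, with the browser carrying the URL between them.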

Answer: UML Activity diagram:
UML describes the structure, boundaries, and behavior of a system and of the objects within it. It is not a programming language, but there are tools that can generate code in various languages from UML diagrams.

Question4)(i) 
Factory Pattern: Explain through examples.
Answer )
Factory Method is to creating objects as Template Method is to implementing an algorithm. A superclass specifies all standard and generic behavior (using pure virtual "placeholders" for the creation steps) and then delegates the creation details to subclasses that are supplied by the client.
  Factory Method makes a design more customizable and only a little more complicated. Other design patterns require new classes, while Factory Method only requires a new operation. Creating an object often requires complex processes that are not appropriate to include within a composing object.
  The Factory Method design pattern handles these problems by defining a separate method for creating the objects, which subclasses can then override to specify the derived type of product that will be created.
Example:
The Factory Method defines an interface for creating objects, but lets the subclasses decide which classes to instantiate. Injection molding presses demonstrate this pattern. Manufacturers of plastic toys process plastic molding powder and inject the plastic into molds of the desired shapes.
  The class of toy (car, action figure, etc.) is determined by the mold.
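The molding press example can be sketched in Python as follows. This is only an illustration of the pattern; all class and method names (`MoldingPress`, `create_toy`, and so on) are made up for this sketch.

```python
from abc import ABC, abstractmethod


class Toy(ABC):
    """Product interface: every toy can describe itself."""

    @abstractmethod
    def describe(self) -> str:
        ...


class Car(Toy):
    def describe(self) -> str:
        return "plastic car"


class ActionFigure(Toy):
    def describe(self) -> str:
        return "plastic action figure"


class MoldingPress(ABC):
    """Creator: owns the common process, defers the product choice."""

    @abstractmethod
    def create_toy(self) -> Toy:
        """Factory method: subclasses decide which Toy to instantiate."""

    def run(self) -> str:
        # Standard behavior shared by every press; only creation varies.
        toy = self.create_toy()
        return f"Injected plastic into mold, produced a {toy.describe()}"


class CarMoldPress(MoldingPress):
    def create_toy(self) -> Toy:
        return Car()


class ActionFigureMoldPress(MoldingPress):
    def create_toy(self) -> Toy:
        return ActionFigure()


if __name__ == "__main__":
    for press in (CarMoldPress(), ActionFigureMoldPress()):
        print(press.run())
```

Adding a new kind of toy needs one new product class and one overridden `create_toy`; `run` itself does not change, which is the "new operation rather than new process" point made above.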

Question4)(ia)
How does the Factory Method Pattern fit in with other patterns/methods?
Answer ) 
  Factory classes are useful when building the object requires a complicated process, when construction needs a dependency that you do not want in the actual class, when you need to construct different kinds of objects, and so on.
  It is a good idea to use factory methods inside the object when:
  • Object's class doesn't know what exact sub-classes it has to create
  • Object's class is designed so that the objects it creates were specified by sub-classes
  • Object's class delegates its duties to auxiliary sub-classes and doesn't know what exact class will take these duties
  • The Factory Method allows these other patterns to defer instantiation to subclasses. One uses the Factory Method to defer responsibility to subclass objects.
Question4)(ib)
Factories increase cohesion. What is their rationale for saying so? 
Answer)
Cohesion means 'sticking together', and a factory is a method, an object, or anything else that is used to instantiate other objects. A factory keeps together both the functionality and the rules that determine which objects should be built and/or managed under different circumstances.

Question4)(ic) 
Factories also help in testing. In what ways is this true?
Answer )
  • The using objects should behave in exactly the same way with any set of derivatives present. There is no need to test every possible combination, because each piece can be tested individually. No matter how the pieces are combined, the system will work in the same manner.
  • A further advantage is that a factory can return the same instance multiple times, or can return a subclass rather than an object of that exact type.
Question4.(ii)
Observer Pattern: Explain through examples
Answer)
  • The Observer pattern (also called the publish-subscribe pattern) is a behavioral design pattern that defines a one-to-many relationship between objects: when one object changes its state, all dependent objects are notified and updated automatically.
  • The Observer pattern is a basic building block of decoupling: it separates objects that depend on each other.
Example:
  • The Observer pattern is used in the model-view-controller (MVC) architectural pattern. In MVC, the Observer pattern is used to decouple the model from the view. The view represents the Observer and the model is the Observable object.
  • Event management: The Observer pattern is widely used for this scenario. Swing and .NET make extensive use of it to implement their event mechanisms.
Question4)(iia)
What is the intent of the Observer pattern? Under what circumstances should an Observer pattern not be used?
Answer)
The intent of the Observer Pattern:
  • Defines a one-to-many dependency between objects so that when one object changes state, all its dependents are notified and updated automatically.
  • It should be used when a change of state in one object must be reflected in another object without keeping the objects tightly coupled.
  • It is also used when the framework we are writing needs to be enhanced in the future with new observers with minimal changes.
  • When a change to one object requires the change of a variable number of other objects (not necessarily known at compile-time).
  • When an object should be able to talk to another object, but you don't want them essentially dependent on each other.
Question4)(iib)
One example of the Observer pattern from outside of software is a radio station: It broadcasts its signal; anyone who is interested can tune in and listen when they want to. Give another example from "real-life" with explanation?
Answer)
  • Splitwise group: If anyone adds or updates an entry in the group for any amount, all members of the group get a notification about the update.
  • Facebook: If a user follows a post, the user is added as an observer; when another comment is received on the same post, a notification is sent to all the other observers. Twitter and other social media work the same way.
  • Cricket display: The scoreboard shows the total, the average score, and other information according to the current status of the match. Whenever the score changes, the display board is refreshed. So the display board is the observer here, and the subject is the panel that sends the current score status to the board.
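The cricket display example can be sketched in Python as below. The names `ScoreFeed` and `DisplayBoard` are invented for this illustration; the subject keeps a list of observers and calls `update` on each one whenever the score changes.

```python
class ScoreFeed:
    """Subject: holds the match total and notifies registered observers."""

    def __init__(self):
        self._observers = []
        self._runs = 0

    def attach(self, observer):
        self._observers.append(observer)

    def detach(self, observer):
        self._observers.remove(observer)

    def add_runs(self, runs):
        # A state change in the subject triggers a notification.
        self._runs += runs
        self._notify()

    def _notify(self):
        # Iterate over a copy so observers may detach during update.
        for observer in list(self._observers):
            observer.update(self._runs)


class DisplayBoard:
    """Observer: refreshes what it shows whenever it is notified."""

    def __init__(self, name):
        self.name = name
        self.shown = None

    def update(self, runs):
        self.shown = f"{self.name}: total {runs}"


if __name__ == "__main__":
    feed = ScoreFeed()
    main_board = DisplayBoard("Main")
    side_board = DisplayBoard("Side")
    feed.attach(main_board)
    feed.attach(side_board)
    feed.add_runs(4)
    print(main_board.shown, "|", side_board.shown)
```

The subject never needs to know what kind of display it is talking to, only that it has an `update` method, which is the loose coupling described above.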
Question5.(a)
Name three Design Patterns which use the Singleton design pattern in their implementation.
Answer)
The Singleton design pattern is used in the implementation of the following three patterns.
  • Abstract Factory: In the Abstract Factory pattern, an interface is responsible for creating a factory of related objects without explicitly specifying their classes.
  • Builder: The Builder pattern builds a complex object from simple objects, using a step-by-step approach.
  • Prototype: This pattern involves implementing a prototype interface which tells an object to create a clone of itself.
Question5.(b)
When is use of the Singleton pattern unnecessary?
Answer)
  • Most of the time it is unnecessary.
  • When it is easier to pass an object resource as a reference to the objects that need it, rather than letting objects access the resource globally.
  • Everything that can be done with a singleton can be done with a class variable or method.
  •  And if something has to be unique, it has to belong to the class and not to the objects.
  • Therefore, good or bad, singletons are conceptually wrong.
Question5.(c)
Compare the operation of Adapter and Bridge Design Patterns.
Answer)
Adapter Design Pattern
  • The adapter makes things work after they're designed.
  • The Adapter pattern is more about getting your existing code to work with a newer system or interface.
  • It is useful to work with two incompatible interfaces.
  • Example: A card reader acts as an adapter between a memory card and a laptop. You plug the memory card into the card reader and the card reader into the laptop, so that the memory card can be read via the laptop.
Bridge Design Pattern
  • Bridge makes things work before they are designed.
  • The Bridge pattern lets you have alternative implementations of an algorithm or system.
  • It decouples an abstraction from its implementation so that both can vary independently.
  • Example: A circle can be drawn in various colors using the same abstract class method but different bridge implementer classes.
Question5.(d)
Why Decorator Design Pattern is better than Adapter Design Pattern while handling Interfaces?
Answer)
  • The Adapter pattern alters the interface. The Decorator pattern does not alter the interface; it implements the original object's interface, so that it can be passed to any method that accepts the original object.
  • Adapter provides a different interface to its subject. Decorator provides an enhanced interface.
  • An adapter is meant to change the interface of an existing object. Decorator enhances another object without changing its interface. A decorator is thus more transparent to the application than an adapter is. As a consequence, Decorator supports recursive composition, which isn't possible with pure Adapters.
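The difference can be sketched in a few lines of Python. All names here (`TextSource`, `LegacyReader`, and so on) are hypothetical; the sketch only shows that decorators keep the `read()` interface and therefore nest freely, while the adapter exists to translate a different interface into `read()`.

```python
class TextSource:
    """The interface clients already expect: read() returns a string."""

    def read(self) -> str:
        return "hello"


class UpperCase:
    """Decorator: same read() interface, adds behavior around any source."""

    def __init__(self, inner):
        self._inner = inner

    def read(self) -> str:
        return self._inner.read().upper()


class Exclaim:
    """A second decorator, to show recursive composition."""

    def __init__(self, inner):
        self._inner = inner

    def read(self) -> str:
        return self._inner.read() + "!"


class LegacyReader:
    """Existing class with an incompatible interface."""

    def fetch_bytes(self) -> bytes:
        return b"legacy"


class LegacyReaderAdapter:
    """Adapter: converts fetch_bytes() into the read() interface."""

    def __init__(self, legacy):
        self._legacy = legacy

    def read(self) -> str:
        return self._legacy.fetch_bytes().decode("ascii")


if __name__ == "__main__":
    # Decorators stack because each one exposes the interface it wraps.
    print(Exclaim(UpperCase(TextSource())).read())
    # Once adapted, the legacy object can be decorated like any source.
    print(UpperCase(LegacyReaderAdapter(LegacyReader())).read())
```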
Question5.(e) 
How does the Façade and Adapter Design Pattern make use of Interfaces?
Answer:
  • Facade defines a new interface, whereas Adapter reuses an old interface.
  • The Facade design pattern neither translates interfaces nor adds new functionality; it just provides a simpler interface. So instead of the client directly accessing individual components of a system, it uses the facade. The Facade design pattern lets the client interact with a complex system through a much simpler interface and with less work.
  • The facade will then call individual components.

AUTOMATION TESTING - Technology SELENIUM L1

Question.
While using the accessor commands, if you want to initialize a variable and assign a value onto it, then we could use the
a) store command
b) init command
c) echo command
d) create command

Answer: store command

Question.
Which pane of the Selenium IDE provides insight into the current execution in the form of messages and helps us debug issues if test case execution fails?
a) Address Bar Pane
b) Test Case Pane
c) Log Pane
d) Error Pane

Answer: Log Pane

Question.
Each Selenium IDE test step can be divided into 3 components. What are they?
a) Command, Value, and Type
b) Command, Target, and Type
c) Command, Type, and Value
d) Command, Target, and Value

Answer: Command, Target, and Value

Question.
What would be correct selenese command, target and value if we want to enter the value 'steve@gmail.com' for id=email ?
a) Command -> enter Target -> id=email Value -> steve@gmail.com
b) Command -> enter Target -> steve@gmail.com Value -> id=email
c) Command -> type Target -> steve@gmail.com Value -> id=email
d) Command -> type Target -> id=email Value -> steve@gmail.com

Answer: Command -> type Target -> id=email Value -> steve@gmail.com

Question.
Every action/command that we use in the Selenium IDE is internally designed and developed as a function of
a) HTML function
b) Java function
c) XML function
d) JavaScript function

Answer: JavaScript function

Question.
Selenium IDE has a color coding component for reporting purpose. When the execution is done, what color is the test case marked in to signify the successful run of it?
a) Red
b) Purple
c) Blue
d) Green

Answer: Green

Question.
Which option of Selenium IDE can be used to ascertain that the locator value provided in the Target text box is indeed correct and identifies that designated web element on the GUI?
a) Search button
b) Find button
c) Select button
d) View button

Answer: Find button

Question.
Which of the statements describe assertTitle and verifyTitle commands correctly?
a) assertTitle will check that the title value has to be correct. Otherwise, it fails and it ends the play of other added steps. On the other side verifyTitle will check that the title value has to be correct, but if incorrect, it marks it as failed but proceeds with other steps

b) assertTitle or verifyTitle works the same way to check whether the title value is correct. There is no difference between them.

c) assertTitle will check that the title value has to be correct. Otherwise, it marks it as failed but proceeds with other steps. On the other side, verifyTitle will check that the title value has to be correct. Otherwise, it terminates the play of other steps.

d) assertTitle checks whether web page title is correct whereas verifyTitle checks whether the web application title is correct

Answer: assertTitle will check that the title value has to be correct. Otherwise, it fails and it terminates the play of other steps. On the other side verifyTitle will check that the title value has to be correct, but if incorrect, it marks it as failed but proceeds with other steps
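
The difference between the two commands can be illustrated with a small simulation. This is plain Python written for this note, not Selenium IDE or WebDriver code; the function `run_steps` and the step tuples are hypothetical and only model the stop-versus-continue behavior described in the answer.

```python
def run_steps(steps, actual_title):
    """Simulate a test run; each step is (command, expected_title)."""
    log = []
    for index, (command, expected) in enumerate(steps, start=1):
        passed = expected == actual_title
        status = "passed" if passed else "failed"
        log.append(f"step {index} {command}: {status}")
        if not passed and command == "assertTitle":
            # A failed assert stops the remaining steps.
            log.append("test aborted")
            break
        # A failed verify is only logged; execution continues.
    return log


if __name__ == "__main__":
    steps = [("verifyTitle", "Wrong"), ("assertTitle", "Wrong"),
             ("verifyTitle", "Home")]
    for line in run_steps(steps, "Home"):
        print(line)
```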

Question.
Which of the below is NOT an advantage of automated testing?
a) High ROI
b) Improves accuracy
c) Attended execution
d) Reduces human generated errors

Answer: Attended execution

Question.
Which of the options given is an 'Action' type of command of Selenese?
a) assertTitle
b) click
c) verifyTitle
d) waitFor

Answer: click

Question.
The technique or method used by the Selenium IDE to find and access a particular element on your web page is called
a) Recording technique
b) Playback technique
c) Locator strategy
d) User Experience strategy

Answer: Locator strategy

Question.
Custom-made commands that we create as JavaScript functions can extend the Selenium IDE by adding
a) Parameters
b) User Extension
c) File Logging Extension
d) Firebug Extension

Answer: User Extension

Question.
Which of the below is NOT a feature of Selenium?
a) Commercial
b) Has different products in its suite
c) Supports multiple language implementations
d) Supports multiple platforms

Answer: Commercial

Question.
Selenium IDE is implemented as a plug-in of
a) Google Chrome browser
b) Opera browser
c) Microsoft Edge browser
d) Firefox browser

Answer: Firefox browser

Question.
Which of the statements about the Selenese commands is FALSE?
a) All the selenese commands take mandatorily 2 arguments
b) Commands can be of action type, accessor type or assertion type
c) Commands of Selenium remain internally applied as JavaScript functions
d) Commands help us to perform some test steps

Answer: All the selenese commands take mandatorily 2 arguments

Question.
Which of the options is NOT a disadvantage of Selenium IDE?
a) No support for iterations and conditional statements
b) No logging capabilities
c) No test script dependencies and grouping possible
d) No database testing

Answer: No logging capabilities

Question.
The Selenium IDE commands can be categorized mainly into 3 categories. What are they?
a) Actions, Renderers, Viewers
b) Actions, Accessors, Assertions
c) Actions, Recorders, Playback
d) Do, view, get

Answer: Actions, Accessors, Assertions

Question.
Which Selenium IDE plug-in is required to enable the tester to save log messages into an external file?
a) Debugger plugin
b) File Logging plugin
c) Log Metrics plugin
d) Error Metrics plugin

Answer: File Logging plugin

Question.
Which of the choices provided is NOT true of Selenium IDE?
a) Selenium IDE provides a recording and playback feature for ease of creating test cases
b) Selenium IDE provides several selenese commands to help create test cases easily
c) Selenium IDE requires us to write functionalities in Java or any other programming language
d) Selenium IDE supports logging execution messages to external log files if required

Answer: Selenium IDE requires us to write functionalities in Java or any other programming language

Question.
The Selenese command that can be used to check if the application title is correct or not is
a) assertTitle
b) assertAppTitle
c) assertValidTitle
d) assertValidAppTitle

Answer: assertTitle

Brief about Book Library Data Warehouse System

Topic: BOOK LIBRARY
Subject: DATA WAREHOUSE
Prepared by: Sumit

Q1) Identify the business processes of interest to senior management in the industry (domain) allocated to your group.
Answer)
Major libraries have large collections and high circulation. Managing libraries electronically has resulted in the creation and management of large library databases, which serve the students and teachers who collaborate in this e-learning environment.

Below are some of the business processes of interest to senior management:
  • Variety of Books: Need to better understand what books customers want and are willing to pay for.
  • Fund the Books: Need to manage costs and cash flow so that the book library can continue to operate.
  • Make Library Reliable: The library has to get its customers the books they want on time.
  • Book Borrowing
A crucial part of a library is the human intermediary, the librarian. This intermediary connects the users to the information they need and can advise them on using the information retrieval systems and on working with information.

Q2) List some questions that would be raised by senior management for improving the business process.
Answer)
There are many questions that can be asked by senior management for improving the above business process.
Some of the questions that will be asked are :
  • When was the item collected?
  • Which librarian registered it?
  • What is the item about?
  • At which branch library was the item registered?
Q3) To address the above-mentioned questions; propose a DW design (schema diagram).
Answer)
In general, a DW design follows four main steps:
Step 1: Identify the Business Process
Step 2: Declare the Grain
Step 3: Identify the Dimensions
Step 4: Identify the Facts

For our Book Library case, the steps are as follows:
  1. Business Process: Book borrowing is the business process.
  2. Declare the Grain: The second step is to declare the grain of the business process. In the book borrowing process, we declare a transaction issued in the library automation system as the grain; that is, one row means an item is borrowed by a patron.
  3. Identify the Dimensions: The third step is to choose the dimensions. Dimensions represent how people describe and inspect the data from the process. The following dimension tables are used:
    • The Patron-Dimension describes the library patron’s characteristics. The attributes of Patron-Dimension include the name of the patron, gender, occupation, patron type, department, college, and so on.
    • The Item-Dimension describes every item belonging to the library; its attributes describe the item, including call number, title, author, subject, classification, language, location, MARC, collecting source, and so on.
    • The Location-Dimension describes the branch libraries supervised by the city library; its attributes include the name of the branch library, the name of the district in which it is located, and the name of the region library.
    • The Date-Dimension describes every hour of one day, and its attributes include hour, date, week, month and year. 
  4. Identify the Facts: The fourth step is to identify the facts. In the case of book borrowing, we identify the fact to measure the number of books borrowed. We declared a transaction that an item was borrowed by a patron as the grain in the prior step. Thus, the number of books borrowed here is equal to one.
  • The star schema is perhaps the simplest data warehouse schema.
  • It is called a star schema because the entity-relationship diagram of this schema resembles a star, with points radiating from a central table. 
  • The center of the star consists of a large fact table and the points of the star are the dimension tables.
Star Schema for Library Book Borrowing:
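A minimal sketch of this star schema, written with Python's built-in `sqlite3` module, is given here. The table and column names are assumptions derived from the dimension descriptions above (only a subset of the listed attributes is included), and the sample rows in the demo are made up.

```python
import sqlite3

# One central fact table surrounded by four dimension tables.
SCHEMA = """
CREATE TABLE patron_dim (
    patron_key INTEGER PRIMARY KEY, name TEXT, gender TEXT,
    occupation TEXT, patron_type TEXT, department TEXT, college TEXT);
CREATE TABLE item_dim (
    item_key INTEGER PRIMARY KEY, call_number TEXT, title TEXT,
    author TEXT, subject TEXT, language TEXT);
CREATE TABLE location_dim (
    location_key INTEGER PRIMARY KEY, branch_name TEXT,
    district TEXT, region_library TEXT);
CREATE TABLE date_dim (
    date_key INTEGER PRIMARY KEY, hour INTEGER, full_date TEXT,
    week INTEGER, month INTEGER, year INTEGER);
CREATE TABLE borrowing_fact (
    patron_key INTEGER REFERENCES patron_dim(patron_key),
    item_key INTEGER REFERENCES item_dim(item_key),
    location_key INTEGER REFERENCES location_dim(location_key),
    date_key INTEGER REFERENCES date_dim(date_key),
    -- Grain: one row per item borrowed by a patron, so the fact is 1.
    books_borrowed INTEGER NOT NULL DEFAULT 1);
"""


def build_warehouse():
    """Create an empty in-memory warehouse with the star schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def borrowed_per_branch(conn):
    """Answer 'which branch?' by joining the fact to one dimension."""
    cursor = conn.execute(
        """
        SELECT l.branch_name, SUM(f.books_borrowed)
        FROM borrowing_fact AS f
        JOIN location_dim AS l ON l.location_key = f.location_key
        GROUP BY l.branch_name
        ORDER BY l.branch_name
        """
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = build_warehouse()
    conn.execute("INSERT INTO location_dim VALUES (1, 'Central', 'D1', 'R1')")
    conn.execute(
        "INSERT INTO borrowing_fact (patron_key, item_key, location_key, date_key)"
        " VALUES (1, 1, 1, 1)"
    )
    print(borrowed_per_branch(conn))
```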


Q4) List aggregations to improve the DW performance. Justify.
Answer)
  • Aggregates provide improvements in performance because of the significantly smaller number of records.
  • Aggregates allow quick access to Book Dimension data during reporting. Similar to database indexes, they serve to improve performance.
  • Aggregates are particularly useful in the following cases:
    • Executing and navigating in query data leads to delays if you have a group of queries
    • You want to speed up the execution and navigation of a specific query
    • You often use attributes in queries
    • You want to speed up reporting with specific hierarchies by adding a level of a specific hierarchy.
  • If the aggregate contains data that is to be evaluated by a query, the query data is read automatically from the aggregate.
  • Query: Total sales for books during the first week of December 2000 for location Mumbai.
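The idea of answering such a query from an aggregate can be sketched with Python's `sqlite3` module. The table names, the sample rows, and the definition of "first week" as days 1 to 7 of the month are all assumptions made for this illustration.

```python
import sqlite3


def build_example():
    """Create a detail fact table and a pre-computed aggregate from it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sales_fact (location TEXT, sale_date TEXT, amount INTEGER);
        INSERT INTO sales_fact VALUES
            ('Mumbai', '2000-12-01', 300),
            ('Mumbai', '2000-12-05', 200),
            ('Mumbai', '2000-12-20', 500),
            ('Pune',   '2000-12-02', 100);
        -- Aggregate: one row per location, month, and week of the month.
        CREATE TABLE sales_by_location_week AS
        SELECT location,
               substr(sale_date, 1, 7) AS year_month,
               (CAST(substr(sale_date, 9, 2) AS INTEGER) - 1) / 7 + 1
                   AS week_of_month,
               SUM(amount) AS total_amount
        FROM sales_fact
        GROUP BY location, year_month, week_of_month;
        """
    )
    return conn


def weekly_total(conn, location, year_month, week_of_month):
    """Read the answer from the small aggregate, not the detail rows."""
    row = conn.execute(
        "SELECT total_amount FROM sales_by_location_week"
        " WHERE location = ? AND year_month = ? AND week_of_month = ?",
        (location, year_month, week_of_month),
    ).fetchone()
    return row[0] if row else 0


if __name__ == "__main__":
    conn = build_example()
    print(weekly_total(conn, "Mumbai", "2000-12", 1))
```

The aggregate holds one row per location and week instead of one row per sale, which is where the performance gain comes from as the detail table grows.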

Q5) List and justify any 5 metadata items that will be of interest to various stakeholders.
Answer)
  • Metadata means "data about data". 
  • Metadata is defined as data that provides information about one or more aspects of the data; it is used to summarize basic information about data, which makes it easier to track and work with specific data.
  • Below are metadata items of interest to various stakeholders:
    • Purpose of the book
    • Time and date of issuing the book
    • Creator or author of the book
    • Location on a computer network where the book was issued.
    • Book quantity
    • Book quality
Types of Meta Data:
  • Descriptive metadata is usually used for discovery and identification, that is, for searching for and locating an object; examples are title, author, subject, keyword, and publisher.
  • Administrative metadata provides information to help manage the source. Administrative metadata refers to the technical information, including file type, or when and how the file was created.
  • Structural metadata describes how components of an object are organized. An example of structural metadata will be how the pages are ordered to make chapters of a book.
Following are some key points to be included in metadata:

Definition of data warehouse − It includes the description of the structure of the data warehouse. The description is defined by schema, views, hierarchies, derived data definitions, and data mart locations and contents.

Operational metadata − It includes currency of data and data lineage. Currency of data means whether the data is active, archived, or purged. Lineage of data means the history of the migrated data and the transformations applied to it.

Business metadata − It has the data ownership information, business definitions, and changing policies.

July 29, 2018

Analytics Skills - Technology DataStage-L1


Question.
Drop action if a lookup on a link fails

a) Drops the row and Job fails
b) Drops the row and will skip next lookup
c) Drops the row and will skip all further lookups
d) Drops the row and continues with the next lookup

Answer: Drops the row and continues with the next lookup

Question.
Fail action if a lookup on a link fails

a) Causes the job to reject records
b) Causes the job to issue a fatal error and continues with next lookup
c) Causes the job to issue a fatal error and stop
d) Causes the job to issue a fatal error and falls records from subsequent lookup

Answer: Causes the job to issue a fatal error and stop

Question.
Which stage is used to compute the sum of salary grouped by deptno?
a)Join
b)Aggregate
c)Merge
d) Copy

Answer: Aggregate

Question.
Which activity is used to execute shell scripts or bat files?

a) Wait for File activity
b) Execute Command activity
c) Program activity
d) Run Program activity

Answer: Execute Command activity

Question.
Change_Code value of three of Change Capture Stage in DataStage represents

a) Copy
b) Delete
c) New
d) Edit

Answer: Edit

Question.
Change_Code value of zero of Change Capture Stage in DataStage represents

a) Edit
b) Delete
c) New
d) Copy

Answer: Copy

Question.
Change_Code value of one of Change Capture Stage in DataStage represents
a) Copy
b) Delete
c) Edit
d) New

Answer: New

Question.
Which one is used to remove duplicates in data?

a) Unique property set in Join Stage
b)Unique property set in Merge Stage
c)Remove Duplicate Stage
d)Dedup Sort Stage

Answer: Remove Duplicate Stage

Question.
Which of the below stages are used to achieve Union all operation on input data sources?

a)Join
b) Funnel
c) Lookup
d) Filter

Answer: Funnel

Question.
Which of the following is not a type of view in DataStage Director?

a)Job View
b)Log View
c)Status View
d) Parallel View

Answer: Parallel View

Question.
Which one of the following can be used to schedule jobs

a) DataStage Director
b)DataStage Designer
c)DataStage Administrator
d) Data Stage Exporter

Answer: DataStage Director

Question.
Which one is used to create workflows in DataStage?

a) Parallel Jobs
b) Sequence Jobs
c) Server Jobs
d) Workflow Activity

Answer: Sequence Jobs

Question.
Change_Code value of two of Change Capture Stage in DataStage represents

a) Copy
b) Delete
c) New
d) Edit

Answer: Delete

Question.
Change Capture stage is

a) File Stage
b) Database Stage
c) Processing Stage
d) Miscellaneous Stage

Answer: Processing Stage

Question.
Which Stage allows you to specify several Reject links?

a) Lookup
b)Merge
c)Join
d) filter

Answer: Merge

Question.
If two rows in Change Capture have the same key columns, which columns can you compare to determine whether one is a modified copy of the other?

a) Key Column
b) Value Columns
c)After Data Column
d)Before Data Column

Answer: Value Columns

Question.
Which activity will wait for a file to appear in a folder?

a)Wait for file activity
b)Job Activity
c)Execute Command Activity
d)Sequential file Stage activity

Answer: Wait for file activity

Question.
Which of the stages below is used to restrict data based on WHERE clause predicates?

a)Join
b)Funnel
c) Filter
d) Copy

Answer: Filter

Question.
Using the Copy stage

a)Order of columns can be changed but data type of columns cannot be changed
b)Order of columns cannot be changed but data type of columns can be changed
c)Order of columns can be changed and data type of columns can be changed
d) both the Order of columns and data type of columns cannot be changed

Answer: Order of columns can be changed but data type of columns cannot be changed

Question.
Which option will send record with null values when a lookup failure happens

a) Reject
b) Drop
c) Fail
d) Continue

Answer: Continue

July 25, 2018

Analytics - Hadoop L1


Question.
Which of the following are true about Hadoop?
Open Source
Distributed Processing Framework
Distributed Storage Framework
All of these

Answer: All of these

Question.
Which of the following are false about Hadoop?
Hadoop works in Master-Slave fashion
Master & Slave both are worker nodes
The user submits work to the master, which distributes it to the slaves
Slaves are the actual worker nodes

Answer: Master & Slave both are worker nodes

Question.
What is a Metadata in Hadoop?
Data stored by user
Information about the data stored in datanodes
User information
None of these

Answer: Information about the data stored in datanodes

Question.
What is a Daemon?
Process or service that runs in background
Applications submitted by user
Web application running on web server
None of these

Answer: Process or service that runs in background

Question.
All of the following accurately describe Hadoop EXCEPT?
a. Batch processing
b.Open-source
c. Distributed computing
d. Real-time

Answer: Real-time

Question.
All of the following are core components of Hadoop EXCEPT?
a. Hive
b. HDFS
c. MapReduce
d. YARN

Answer: Hive

Question.
Hadoop is a framework that uses a variety of related tools. Common tools in a typical implementation include:
a. MapReduce, HDFS, Spool
b. MapReduce, MySQL, Google Apps
c. Cloudera, HortonWorks, MapR
d. MapReduce, Hive, Hbase

Answer: MapReduce, Hive, Hbase

Question.
Which of the following can be used to create workflows when multiple MapReduce and Pig programs need to be executed?
a. Sqoop
b. Zookeeper
c. Oozie
d. Hbase

Answer: Oozie

Question.
Which of the following can be used to transfer bulk data between Hadoop and structured databases?
a. Sqoop
b. Hive
c. Pig
d. Spark

Answer: Sqoop

Question.
How many single points of failure does a High Availability HDFS architecture have?
a. 0
b. 1
c. 2
d. 3

Answer: 0

Question.
If a file of size 300MB needs to be stored in the HDFS (block size=64MB, replication factor=2), how many blocks are created for this file in the HDFS?
a. 10
b. 11
c. 12
d. 15

Answer: 10
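
As a worked check of the answer above, the arithmetic can be sketched in Python. The function name hdfs_block_count is hypothetical and used only for illustration; the sketch assumes the question counts every replica as a stored block.

```python
import math

def hdfs_block_count(file_size_mb, block_size_mb, replication_factor):
    """Return (logical blocks, total stored block replicas)."""
    # The last block may be only partly filled, so round up.
    logical_blocks = math.ceil(file_size_mb / block_size_mb)
    # Each logical block is stored once per replica.
    return logical_blocks, logical_blocks * replication_factor

logical, stored = hdfs_block_count(300, 64, 2)
print(logical, stored)  # 5 10
```

The file splits into 5 blocks (four full 64MB blocks plus one 44MB block), and a replication factor of 2 doubles that to 10 stored blocks.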

Question.
What is not a default value for a data block size in the HDFS?
a. 64MB
b. 128MB
c. 512MB
d. 256MB

Answer: 512MB

Question.
Which of the following architectures best describes the HDFS architecture?
a. High Availability
b. Master-Slave
c. Connected
d. Peer

Answer: Master-Slave

Question.
Which of the following is a master process in the HDFS architecture?
a. Datanode
b. JobTracker
c. Namenode
d. Secondary Namenode

Answer: Namenode

Question.
Which of the following is true about Hadoop?

Before storing data we need to specify the schema
We will lose data if one data node crashes
We can add n no of nodes in cluster on the fly (n ~ 15000)
Data is firstly processed on master then on slaves

Answer: We can add n no of nodes in cluster on the fly (n ~ 15000)

Question.
Choose the correct statement?

Master assigns work to all the slaves
We cannot edit data once written in Hadoop
The client needs to interact with the master first, as it is the single place where all the metadata is available
All of these

Answer: All of these

Question.
Which of the following is the essential module of HDFS?
Node Manager
Resource Manager
DataNode
ALL of the above

Answer: DataNode

Question.
Which of the below is NOT a kind of metadata in NameNode?

Block locations of files
List of files
File access control information
No. of file records

Answer: No. of file records

Question.
Which statement is true about DataNode?

It is the actual worker node that saves and stores meta data.
It is the slave node that saves and stores metadata.
It is the Master node that saves and stores actual data.
It is the slave node that saves and stores actual data.


Answer: It is the slave node that saves and stores actual data.

Question.
Is the Secondary NameNode the backup node?
TRUE
FALSE

Answer: FALSE

Question.
Which of the following is a programming model designed for processing large volumes of data in parallel by dividing the work into a set of independent tasks?

MapReduce
Hive
Pig
HDFS

Answer: MapReduce

Question.
The mapper's sorted output is input to the:
Reducer
Mapper
Shuffle
All of the mentioned

Answer: Reducer


Question.
Which of the following generates intermediate key-value pairs?
Reducer
Mapper
Combiner
Partitioner

Answer: Mapper
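
The flow behind the last two answers (the mapper emits intermediate key-value pairs, which are sorted and handed to the reducer) can be illustrated with a small in-memory word count. This is a plain-Python simulation under simplifying assumptions, not the Hadoop API; every function name here is hypothetical.

```python
from collections import defaultdict

def mapper(line):
    # Emit one intermediate (word, 1) pair per word in the input line.
    for word in line.split():
        yield word.lower(), 1

def shuffle_and_sort(pairs):
    # Group values by key and sort by key, as happens between map and reduce.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())

def reducer(key, values):
    # Collapse all values for one key into a single result.
    return key, sum(values)

def run_job(lines):
    intermediate = [pair for line in lines for pair in mapper(line)]
    return [reducer(key, values) for key, values in shuffle_and_sort(intermediate)]

result = run_job(["big data big cluster", "data node"])
print(result)  # [('big', 2), ('cluster', 1), ('data', 2), ('node', 1)]
```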

Question.
What is the major advantage of storing data in 128MB blocks?
It saves disk seek time
It saves disk processing time
It saves disk access time
It saves disk latency time

Answer: It saves disk seek time

Question.
The role of the Partitioner in a MapReduce job is:

a) To partition input data into equal parts
b) Distribute data among available reducers
c) To partition data and send to each mapper
d) Distribute data among available mappers

Answer: Distribute data among available reducers
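
A minimal sketch of how a partitioner might route map output to reducers: hash the key, then take the result modulo the number of reducers. The sketch assumes CRC32 as a stand-in hash so that results are repeatable across runs; Hadoop's own partitioner works on the Java key object instead, and the names partition and distribute are hypothetical.

```python
import zlib

def partition(key, num_reducers):
    # Stand-in hash (CRC32); the same key always maps to the same reducer index.
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def distribute(pairs, num_reducers):
    # One bucket per reducer; each pair goes to the bucket chosen by its key.
    buckets = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition(key, num_reducers)].append((key, value))
    return buckets

pairs = [("big", 1), ("data", 1), ("big", 1), ("node", 1)]
buckets = distribute(pairs, 3)
for index, bucket in enumerate(buckets):
    print(index, bucket)
```

Because the reducer index depends only on the key, all values for one key reach the same reducer, which is what lets each reducer produce a complete result for its keys.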

Question.
Which of the following is a single point of failure?
NameNode
Secondary NameNode
DataNode
None of above

Answer: NameNode

Question.
Apache HBase is

a) Column family oriented NoSQL database
b) Relational Database
c) Document oriented NoSQL database
d) Not part of Hadoop eco system

Answer: Column family oriented NoSQL database

Question.
Which of the following is a Table Type in Hive ?

a)Managed Table
b)Local Table
c)Persistent Table
d)Memory Table

Answer: Managed Table

Question.
Which of the following is a daemon process in Hadoop?

a) NameNode
b) JobNode
c) taskNode
d) mapreducer

Answer: NameNode

Question.
Information about locations of the blocks of a file is stored at ________

a)data nodes
b)name node
c)secondary name node
d)job tracker

Answer: name node

Question.
Apache Sqoop is used to

a) Move data from local file system to HDFS
b) Move data from streaming sources to HDFS
c) Move data from RDBMS to HDFS
d) Move data between Hadoop Clusters

Answer: Move data from RDBMS to HDFS

Question.
In a Map Reduce Program, role of combiner is

a) To combine output from multiple map tasks
b) To combine output from multiple reduce tasks
c) To merge data and create a single output file
d) To aggregate the output of each map task

Answer: To aggregate the output of each map task
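
To illustrate the answer above, here is a small plain-Python sketch of a combiner doing local aggregation on the output of a single map task before anything is sent to the reducers. It is a simulation, not the Hadoop API; map_task and combine are hypothetical names.

```python
from collections import Counter

def map_task(lines):
    # Raw map output: one (word, 1) pair per word seen by this map task.
    return [(word, 1) for line in lines for word in line.split()]

def combine(map_output):
    # Sum values per key locally, so fewer pairs leave the map task.
    totals = Counter()
    for key, value in map_output:
        totals[key] += value
    return sorted(totals.items())

raw = map_task(["a b a", "a"])
combined = combine(raw)
print(len(raw), len(combined))  # 4 2
print(combined)  # [('a', 3), ('b', 1)]
```

Four raw pairs shrink to two, which is the point of a combiner: less data to shuffle across the network.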

Question.
Hive External tables store data in

a) default Hive warehouse location in HDFS
b) default Hive warehouse location in Local file system
c) a custom location in HDFS
d) a custom location in local file system

Answer: a custom location in HDFS

Question.
MapReduce programming model is ________

a)Platform Dependent but not language-specific
b)Neither platform- nor language-specific
c)Platform independent but language-specific
d)Platform Dependent and language-specific

Answer: Neither platform- nor language-specific

Question.
Hive generates results using

a) DAG of Map Reduce Jobs
b) Sequential processing of files
c) MySQL query engine
d) List processing

Answer: DAG of Map Reduce Jobs

Question.
Clients access the blocks directly from ________ for read and write

a)data nodes
b)name node
c)secondary name node
d)primary node

Answer: data nodes

Question.
In Apache Pig, a Data Bag stores

a) Set of columns
b) set of columns with the same data type
c) set of columns with different data type
d) Set of tuples

Answer: Set of tuples

Question.
You can execute a Pig Script in local mode using the following command

a) pig -mode local
b) pig -x local
c) pig -run local
d) pig -f

Answer: pig -x local

Question.
Default block size in HDFS is ____________

a)128 KB
b)64 KB
c)32 MB
d)128MB

Answer: 128MB

Question.
Apache Flume is used to

a) Move data from RDBMS to HDFS
b) Move data from HDFS to RDBMS
c) Move data from One HDFS Cluster to another
d) Move data from Streaming source to HDFS

Answer: Move data from Streaming source to HDFS

Question.
Default data field delimiter used by Hive is

a) Ctrl-a character
b) Tab
c) Ctrl-b character
d) Space

Answer: Ctrl-a character
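
Ctrl-a is the character with code 1, written "\x01" in Python. As a small sketch, a line from a text-format Hive table that uses the default delimiter could be split like this; the sample row values are made up for illustration.

```python
# Ctrl-a (ASCII code 1) as a Python string.
HIVE_DEFAULT_FIELD_DELIM = "\x01"

# Hypothetical row with three fields, joined the way a text-format table stores it.
row = HIVE_DEFAULT_FIELD_DELIM.join(["101", "Airtel", "199"])

# Splitting on the same character recovers the individual fields.
fields = row.split(HIVE_DEFAULT_FIELD_DELIM)
print(fields)  # ['101', 'Airtel', '199']
```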

Question.
What are the characteristics of Big Data?

a) volume, quality, variety
b) volume, velocity, variety
c) volume, quality, quantity
d) quantity and quality only

Answer: volume, velocity, variety

Question.
Which is optional in a MapReduce program?

a)Mapper
b)Reducer
c)both are optional
d)both are mandatory

Answer: Reducer

Question.
In Hive, the data for each table partition is stored as:

a) files in separate folders
b) multiple files in same folder
c) a single file
d) multiple xml files

Answer: files in separate folders
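
Hive names each partition folder column=value under the table's directory. The sketch below builds such a path in Python; the table directory, the partition columns, and the helper name partition_dir are assumptions made for illustration.

```python
import posixpath

def partition_dir(table_dir, partition_spec):
    # One nested folder per partition column, each named "column=value".
    parts = [f"{column}={value}" for column, value in partition_spec]
    return posixpath.join(table_dir, *parts)

# Hypothetical table "sales" partitioned by year and month.
path = partition_dir("/user/hive/warehouse/sales", [("year", "2018"), ("month", "07")])
print(path)  # /user/hive/warehouse/sales/year=2018/month=07
```

Data files for that partition then sit inside the innermost folder, so a query that filters on year and month only has to read the matching folders.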

Question.
What is the default storage class in Pig called?

a)TextStorage
b)DefaultStorage
c)PigStorage
d)BinaryStorage

Answer: PigStorage