August 04, 2018

Data Warehouse Reference - QnA

Question.
How can you apply the data to the warehouse? What are the modes?
Answer:
Data may be applied in four different modes: load, append, destructive merge, and constructive merge. Let us look at the effect of applying data in each of these four modes:

Load: If the target table to be loaded already exists and data exists in the table, the load process wipes out the existing data and applies the data from the incoming file. If the table is already empty before loading, the load process simply applies the data from the incoming file.

Append: You may think of the append as an extension of the load. If data already exists in the table, the append process unconditionally adds the incoming data, preserving the existing data in the target table. When an incoming record is a duplicate of an already existing record, you may define how the duplicate is handled: the incoming record may be added as a duplicate, or it may be rejected during the append process.

Destructive Merge: In this mode, you apply the incoming data to the target data. If the primary key of an incoming record matches the key of an existing record, update the matching target record. If the incoming record is a new record without a match with any existing record, add the incoming record to the target table.

Constructive Merge: This mode is slightly different from the destructive merge. If the primary key of an incoming record matches with the key of an existing record, leave the existing record, add the incoming record, and mark the added record as superseding the old record.
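
The four modes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: tables are modeled as lists of dictionaries, the primary key is a hypothetical column named "id", a "duplicate" means a row with the same key, and the "supersedes_old" flag is an invented marker for the constructive merge.

```python
from typing import Dict, List

Row = Dict[str, object]


def load(target: List[Row], incoming: List[Row]) -> List[Row]:
    """Load: discard whatever the target holds and apply the incoming rows."""
    return [dict(r) for r in incoming]


def append(target: List[Row], incoming: List[Row],
           key: str = "id", allow_duplicates: bool = False) -> List[Row]:
    """Append: keep existing rows and add the incoming ones.

    A duplicate (same key, an assumption of this sketch) is either added
    anyway or rejected, depending on allow_duplicates.
    """
    seen = {r[key] for r in target}
    result = [dict(r) for r in target]
    for r in incoming:
        if r[key] in seen and not allow_duplicates:
            continue  # reject the incoming duplicate
        result.append(dict(r))
        seen.add(r[key])
    return result


def destructive_merge(target: List[Row], incoming: List[Row],
                      key: str = "id") -> List[Row]:
    """Destructive merge: update rows whose key matches, add the rest."""
    by_key = {r[key]: dict(r) for r in target}
    for r in incoming:
        by_key[r[key]] = dict(r)  # overwrite a match or add a new row
    return list(by_key.values())


def constructive_merge(target: List[Row], incoming: List[Row],
                       key: str = "id") -> List[Row]:
    """Constructive merge: keep the old row, add the new one, flag the new one."""
    result = [dict(r) for r in target]
    seen = {r[key] for r in target}
    for r in incoming:
        new_row = dict(r)
        if r[key] in seen:
            new_row["supersedes_old"] = True  # hypothetical marker column
        result.append(new_row)
        seen.add(r[key])
    return result


# Hypothetical sample data
target = [{"id": 1, "name": "pen"}, {"id": 2, "name": "ink"}]
incoming = [{"id": 2, "name": "blue ink"}, {"id": 3, "name": "pad"}]
```

With this sample data, the destructive merge ends with three rows (row 2 updated), while the constructive merge ends with four rows, two of which share key 2.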

Question.
Suppose that the data warehouse for Big_University consists of four dimensions (student, course, semester, and trainer) and two measures (count and avg_grade). At the lowest conceptual level (e.g., for a given student, course, semester, and trainer combination), avg_grade stores the student's actual course grade. At higher conceptual levels, avg_grade stores the average grade for the given combination. Draw a snowflake schema diagram.

Answer:
http://www.waseian.com/2018/08/data-warehouse-comprehensive2015-16.html
Question.
Current trends in technology require us to design informational systems. Explain the points to be taken care of with respect to traditional operational systems and the newer informational systems that need to be built.
Answer:
The essential reason for the inability to provide strategic information is that we have been trying all along to provide it from the operational systems. These operational systems, such as order processing, inventory control, dues and claims processing, casualty billing, and so on, are not designed or intended to deliver strategic information. If we need the ability to provide strategic information, we must get it from altogether different types of systems. Only specially designed decision support systems or informational systems can deliver strategic information.
  We find that in order to provide strategic information we need to build informational systems that are different from the operational systems we have been building to run the basic business. It will be futile to continue to dip into the operational systems for strategic information as we have been doing in the past. As companies face fiercer competition and businesses become more complex, continuing the past practices will only lead to disaster.
  Informational systems are about watching the wheels of business turn, with requests such as:
  • Show me the top-selling products
  • Show me the problem regions
  • Tell me why (drill down)
  • Let me see other data (drill across)
  • Show the highest margins
  • Alert me when a district sells below target
We need to design and build informational systems
  • That serve different purposes
  • Whose scopes are different
  • Whose data content is different
  • Where the data usage patterns are different
  • Where the data access types are different
Question.
The following 2-D data has been pulled out of a data cube.

Product ID    Location ID    Number Sold
1             1              10
1             3              6
2             1              5
2             2              22

Represent the above data in 3-D format, focusing mainly on product ID and sales.


Answer:
Product ID    Location 1    Location 2    Location 3    Total Sold
1             10            -             6             16
2             5             22            -             27
Total         15            22            6             43
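
The cross-tabulation above can be reproduced with a short Python sketch. The function name cross_tab and the tuple layout (product, location, sold) are assumptions made for this illustration; the numbers are the ones given in the question.

```python
from collections import defaultdict

# (product_id, location_id, number_sold) rows from the question
sales = [(1, 1, 10), (1, 3, 6), (2, 1, 5), (2, 2, 22)]


def cross_tab(rows):
    """Build a product x location table plus the marginal totals."""
    table = defaultdict(dict)
    row_totals = defaultdict(int)   # total sold per product
    col_totals = defaultdict(int)   # total sold per location
    for product, location, sold in rows:
        table[product][location] = table[product].get(location, 0) + sold
        row_totals[product] += sold
        col_totals[location] += sold
    return dict(table), dict(row_totals), dict(col_totals)


table, by_product, by_location = cross_tab(sales)
grand_total = sum(by_product.values())
```

Cells that never appear in the input (for example product 1 at location 2) are simply absent from the table, which corresponds to the "-" entries.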











Question.
What is an OLAP cube?


Answer
An OLAP data cube is a representation of data in multiple dimensions, using facts and dimensions. It is characterized by the combination of information according to its relationships. It can consist of a collection of zero to many dimensions, each representing a specific aspect of the data.
There are five basic operations that can be performed on this kind of data cube:
  1. Slicing
  2. Dicing
  3. Roll-Up (also called Drill-Up)
  4. Drill-Down
  5. Pivoting
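
As a rough illustration of these operations, the sketch below models a cube as a Python dictionary keyed by coordinates. The cube contents, the dimension names (product, region, quarter), and the function names are all hypothetical. Drill-down is the inverse of roll-up, so it amounts to going back to the more detailed cube.

```python
from collections import defaultdict

# Hypothetical cube: (product, region, quarter) -> units sold
DIMS = ("product", "region", "quarter")
cube = {
    ("pen", "north", "Q1"): 10,
    ("pen", "south", "Q1"): 4,
    ("pen", "north", "Q2"): 7,
    ("ink", "north", "Q1"): 3,
    ("ink", "south", "Q2"): 8,
}


def slice_(cube, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return {k: v for k, v in cube.items() if k[i] == value}


def dice(cube, **allowed):
    """Dice: keep cells whose coordinates fall in the allowed sets."""
    return {k: v for k, v in cube.items()
            if all(k[DIMS.index(d)] in vals for d, vals in allowed.items())}


def roll_up(cube, keep):
    """Roll-up: sum away every dimension that is not listed in keep."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for k, v in cube.items():
        out[tuple(k[i] for i in idx)] += v
    return dict(out)


def pivot(cube, order):
    """Pivot: reorder the axes without changing any cell value."""
    idx = [DIMS.index(d) for d in order]
    return {tuple(k[i] for i in idx): v for k, v in cube.items()}
```

For example, roll_up(cube, ["product"]) collapses region and quarter and leaves one total per product.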
Question
Why is dimensional normalization not required?
Answer
Dimensional normalization is used to remove the redundant attributes that appear in de-normalized dimensions. In a normalized design, dimensions are joined together with their sub-dimensions. For this reason dimensional normalization is usually not applied:
  • The data structure becomes more complex, which can degrade performance because tables have to be joined and the relationships between them maintained.
  • Query performance suffers when many dimensional values are joined, aggregated, or retrieved, which analysis and operational reports require.
  • Space is not used efficiently, and more space is needed.
Question.
What are the steps involved in the dimensional modeling process?
Answer:
The dimensional modeling process includes the following steps:

(a) Choose the Business Process: The 4-step design method is followed, which helps ensure the usability of the dimensional model. The business process is described systematically, which also makes it easier to explain. This can include the use of Business Process Modelling Notation (BPMN) or Unified Modelling Language (UML).

(b) Declare the Grain: After choosing the business process, declare the grain of the model. The grain gives the exact description of what the dimensional model records, and it is where the focus of the design should be.

(c) Identify the Dimensions: In this phase, the dimensions of the dimensional model are identified. Dimensions are defined within the grain declared in the previous step. Dimensions act as the foundation of the fact table, where the data that belongs to the fact is collected.

(d) Identify the Facts: Defining the dimensions provides a way to create a table in which the fact data can be stored. The facts are populated with numeric measures.

Question.
Consider a data warehouse, where the fact data is calculated to be 36GB of data per year, and 4 years’ worth of data are to be kept online. The data is to be partitioned by month and four concurrent queries are to be allowed.
Compute the partition size, Temporary Space and Space Required for this scenario. 
Answer:
Partition size P = 36 GB per year / 12 months = 3 GB
Temporary space T = (2n + 1)P = [(2 x 4) + 1] x 3 = 27 GB, where n = 4 is the number of concurrent queries
Fact data F = 36 GB x 4 years = 144 GB
Space Required = 3.5F + T = (3.5 x 144) + 27 = 531 GB
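
The same arithmetic can be checked with a small Python function. The function and parameter names are made up for this sketch; the formulas (T = (2n + 1)P and 3.5F + T) are the ones used in the answer above.

```python
def warehouse_space(fact_gb_per_year, years_online, concurrent_queries,
                    partitions_per_year=12, overhead_factor=3.5):
    """Return (partition size, temporary space, fact data, space required) in GB."""
    partition = fact_gb_per_year / partitions_per_year      # P
    temp = (2 * concurrent_queries + 1) * partition         # T = (2n + 1)P
    fact = fact_gb_per_year * years_online                  # F
    total = overhead_factor * fact + temp                   # 3.5F + T
    return partition, temp, fact, total


P, T, F, space_required = warehouse_space(36, 4, 4)
```

Changing the number of concurrent queries only affects the temporary space term, while the retention period scales the fact data term.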

Question.
Discuss the merits and demerits of using views from the perspective of security of data warehouse.
Answer:
Views are an easy way to define security initially, but they cause challenges later.
Some of the common restrictions that may apply to the handling of views are:
  •     restricted data manipulation language (DML) operations,
  •     lost query optimization paths,
  •     restrictions on parallel processing of view projections.
The use of views to enforce security imposes a maintenance overhead. In particular, if views are used to enforce restricted access to data tables and aggregations, then as these tables and aggregations change, the views may also have to change.
Question.
 For following statements, indicate True or False with proper justification:

A.    It is a good practice to drop the indexes before the initial load.
True. Creating index entries during mass loads can be too time-consuming, so drop the indexes prior to the loads to make the loads go quicker. You may rebuild or regenerate the indexes when the loads are complete.

B.    The choice of index type depends on cardinality.
True. Bitmap indexes are suited to low-cardinality data, whereas B-tree indexes suit high-cardinality data.

C.    The importance of metadata is the same for data warehouse and an operational system.
False. In an operational system, users get information through predefined screens and reports. In a DW, users seek information through ad-hoc queries, so they depend far more on metadata.

D.    Backing up the data warehouse is not necessary because you can recover data from the source systems.
False. Information in the DW is accumulated over long periods and elaborately preprocessed.

E.    MPP is a shared-memory parallel hardware configuration.
False. MPP is a shared-nothing hardware architecture.

August 01, 2018

Object Oriented Analysis and Design Importance - Comprehensive


Text Book(s)
T1 Larman, C., Applying UML and Patterns, Pearson Education, 3rd Ed., 2005.
T2 Erich Gamma et al., Design Patterns: Elements of Reusable Object-Oriented Software, 1994

https://www.waseian.com/2018/08/object-oriented-analysis-and-design.html

Question 3)(i)
Online Mobile Recharge
Problem Statement:
   The case study 'Online Mobile Recharge' gives us information about all the mobile service providers. This application offers complete information about any mobile service provider in terms of their plans, options, benefits, etc. Suppose an Airtel customer needs information on all the plans and services provided by the company: he/she can get that information and, according to his/her convenience, recharge the mobile from the same application. The major benefit of this proposed system is to have the recharging facility of any service provider under the same roof.
End users
Service Provider:
  The Service Provider module covers the mobile service providers, that is, all the companies that offer mobile connections. The function of this module is to perform the mobile recharge for their company, based on the availability of balance in the admin account. The request comes from the user, is verified by the admin for availability of balance, and is then forwarded to the service provider to make the mobile recharge.
Third-party System Administrator:
The administrator is the one who monitors all users and user transactions. The admin also monitors all the Service Providers, all the user accounts, the amounts paid by the users, and the amounts paid to the Service Providers. When a request is made by the user, the admin checks the available balance in the user account, and the request is then forwarded to the Service Provider, where the user request gets handled. The admin has the complete information related to the users, and all the information related to the plans and the different recharge coupons provided by the Service Providers. All the data is maintained at the admin level. The admin has the privileges to restrict any user.
User:
  There are 2 categories in the User module:
  • Registered User
  • Visitor
  Any person who wants to use the services of Online Mobile Recharge at any time and from anywhere should register in this application. After registering, the user can recharge the mobile at any time and from anywhere. A visitor (guest) is one who visits the Online Mobile Recharge application, can view the complete information related to the Service Providers, and can make a mobile recharge by entering bank details or credit card details.
Draw a Collaboration Diagram for the above case study, clearly showing the classes, the numbering of messages, and the relationships.
Answer)

IMPORTANT NOTE: In the question, only Collaboration Diagram is asked. Other diagrams are for reference.
Use-Case Diagram
Actors vs Use Cases:
User
•Register.
•Recharge.
•Select Payment Gateway.
•Select Service Provider.
•Make payment.

Third Party Administrator
•Forward User request to Service Provider.
•Track Complaints.

Third Party Server/ Database
•Authenticate the Registered users.
•Maintain the Log.

Service Provider
•Process the recharge the user requested, either directly or through the third-party system.
•Provide various plans to the user.

Online Mobile Recharge UML Use case Diagram:


Sequence Diagram
Sequence Diagram for a user to recharge his account through the third-party site:

Collaboration Diagram
Collaboration diagram for a user to recharge his account through the third-party site:

Class Diagram
Classes Identified:
  • User: Registered, Visitor
  • Third Party System Administrator
  • Third Party Server/ Database
  • Service Provider
  • Direct or Non- Third Party User (Direct access through Service Provider Site)

Activity Diagram
Activities:
  • User login and authentication for the Registered user.
  • Forward the request to service provider if logged in as an Administrator.
  • Enter service provider site for a direct user.
  • Enter recharge amount.
  • Select Payment Gateway.
  • Login and authenticate Bank Account.
  • Make payment.
  • Check whether the recharge was processed successfully.
Question3.(ii) 
Draw a UML Activity diagram clearly showing the interaction of Customer, Customer Browser, Google App, Google ACS Service, Identity provider for the Google Apps described below:
 Single Sign-On (SSO) for Google Apps
Purpose:
  An example of a UML activity diagram which describes Single Sign-On (SSO) to Google Apps for customers using some hosted Google application, such as Gmail.
Summary:

When a user tries to use some hosted Google app, such as Gmail, Google generates a SAML authentication request and sends a redirect back to the user's browser. The redirect points to the specific identity provider. The SAML authentication request contains the encoded URL of the Google application that the user is trying to reach.
  Google acts as a service provider with services such as Gmail or Start Pages. Partner companies act as identity providers and control the user names, passwords, and other information used for identifying, authenticating, and authorizing users for Google's web applications. Each partner also provides Google with its SSO URL as well as a public key that Google will use to verify SAML responses.

Answer: UML Activity diagram:
UML describes the structure, boundary, and behavior of a system as well as the objects within it. It is not a programming language, but there are tools that can generate code in various languages from UML diagrams.

Question4)(i) 
Factory Pattern: Explain through examples.
Answer )
Factory Method is to creating objects as Template Method is to implementing an algorithm. A superclass specifies all standard and common behavior (using pure virtual "placeholders" for the creation steps) and then delegates the creation details to subclasses that are supplied by the client.
  Factory Method makes a design more customizable and only a little more complicated. Other design patterns need new classes, while Factory Method only requires a new operation. Creating an object often requires complex processes that are not appropriate to include within a composing object.
  The Factory Method design pattern handles these problems by defining a separate method for creating the objects, which subclasses can then override to specify the derived type of product that will be created.
Example:
Factory Method defines an interface for creating objects but lets the subclasses decide which classes to instantiate. Injection molding presses demonstrate this pattern. Manufacturers of plastic toys process plastic molding powder and inject the plastic into molds of the desired shapes.
  The class of toy (car, action figure, etc.) is determined by the mold.
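
The molding-press analogy can be sketched in Python as follows. All class and method names (Toy, MoldingPress, make_mold, and so on) are invented for this illustration; it is one possible shape of the pattern, not a canonical implementation.

```python
from abc import ABC, abstractmethod


class Toy(ABC):
    @abstractmethod
    def describe(self) -> str:
        """Name of the finished toy."""


class Car(Toy):
    def describe(self) -> str:
        return "toy car"


class ActionFigure(Toy):
    def describe(self) -> str:
        return "action figure"


class MoldingPress(ABC):
    """Creator: fixes the common process, defers the product choice."""

    @abstractmethod
    def make_mold(self) -> Toy:
        """Factory method, overridden by subclasses to pick the product."""

    def produce(self) -> str:
        toy = self.make_mold()  # the subclass decides which Toy this is
        return f"injected plastic into mold -> {toy.describe()}"


class CarPress(MoldingPress):
    def make_mold(self) -> Toy:
        return Car()


class ActionFigurePress(MoldingPress):
    def make_mold(self) -> Toy:
        return ActionFigure()
```

The produce method never names a concrete toy class, so adding a new toy means adding one product class and one press subclass, without touching the existing code.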

Question4)(ia)
How does the Factory Method Pattern fit in with other patterns/methods?
Answer ) 
  Factory classes are useful when you need a complicated process for constructing the object, when the construction needs a dependency that you do not want in the actual class, when you need to construct different objects, and so on.
  It is a good idea to use factory methods inside the object when:
  • The object's class doesn't know which exact subclasses it has to create
  • The object's class is designed so that the objects it creates are specified by subclasses
  • The object's class delegates its duties to auxiliary subclasses and doesn't know which exact class will take on these duties
  • The Factory Method allows these other patterns to defer instantiation to subclasses. One uses the Factory Method to defer responsibility to subclass objects.
Question4)(ib)
Factories increase cohesion. What is their rationale for saying so? 
Answer)
Cohesion means 'sticking together', and a factory is a method, an object, or anything else that is used to instantiate other objects. It helps to keep together both the functionality and the rules that determine which objects should be built and/or managed under different circumstances.

Question4)(ic) 
Factories also help in testing. In what ways is this true?
Answer )
  • The using objects should behave in exactly the same way with any set of derivatives present. There is no need to test every possible combination, because each piece can be tested individually. No matter how they are combined, the system will work in the same manner.
  • A further advantage is that a factory can return the same instance multiple times, or can return a subclass rather than an object of that exact type.
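
One way to picture the testing benefit is a class that receives its collaborator through a factory, so a test can substitute a recording stand-in. The names Notifier, SmtpMailer, and RecordingMailer are hypothetical and exist only for this sketch.

```python
class SmtpMailer:
    """Production collaborator; a real one would contact a mail server."""

    def send(self, to, body):
        raise RuntimeError("would contact a real mail server")


class RecordingMailer:
    """Test stand-in with the same interface that only records calls."""

    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))


class Notifier:
    """Using object: it never names a concrete mailer class itself."""

    def __init__(self, mailer_factory=SmtpMailer):
        self._mailer = mailer_factory()  # the factory decides what is built

    def notify(self, user):
        self._mailer.send(user, f"Hello {user}")
        return self._mailer


# In a test, the factory is swapped and Notifier is exercised in isolation.
mailer = Notifier(mailer_factory=RecordingMailer).notify("ann")
```

Because Notifier behaves the same way with either mailer, it can be tested once against the stand-in, and each mailer can be tested separately.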
Question4.(ii)
Observer Pattern: Explain through examples
Answer)
  • The Observer pattern (also called the publish-subscribe pattern) is a behavioral design pattern that defines a one-to-many relationship between objects: when one object changes its state, all dependent objects are notified and updated automatically.
  • The Observer pattern is a basic principle of decoupling, that is, of separating objects which depend on each other.
Example:
  • The Observer pattern is used in the model view controller (MVC) architectural pattern. In MVC, the Observer pattern is used to decouple the model from the view. The view acts as the Observer and the model is the Observable object.
  • Event management: The Observer pattern is widely used for this scenario. Swing and .NET make extensive use of it to implement their event mechanisms.
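
A minimal sketch of the model/view relationship described above, with hypothetical class names (Subject, ScoreModel, ScoreView): the model notifies every attached view when its state changes, and it knows nothing about the views beyond their update method.

```python
class Subject:
    """Observable: keeps a list of observers and notifies them of changes."""

    def __init__(self):
        self._observers = []

    def attach(self, observer):
        self._observers.append(observer)

    def detach(self, observer):
        self._observers.remove(observer)

    def notify(self, state):
        for observer in list(self._observers):
            observer.update(state)


class ScoreModel(Subject):
    """Model: holds the state and publishes every change."""

    def __init__(self):
        super().__init__()
        self.score = 0

    def add_runs(self, runs):
        self.score += runs
        self.notify(self.score)


class ScoreView:
    """View: an observer that simply remembers the last state it was sent."""

    def __init__(self):
        self.shown = None

    def update(self, state):
        self.shown = state


model = ScoreModel()
board, app = ScoreView(), ScoreView()
model.attach(board)
model.attach(app)
model.add_runs(4)    # both views are notified
model.detach(app)
model.add_runs(6)    # only the board is still observing
```

New kinds of views can be added later without changing ScoreModel, which is the decoupling the pattern is meant to provide.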
Question4)(iia)
What is the intent of the Observer pattern? Under what circumstances should an Observer pattern not be used?
Answer)
The intent of the Observer Pattern:
  • Defines a one-to-many dependency between objects so that when one object changes state, all its dependents are notified and updated automatically.
  • It should be used when a change of state in one object must be reflected in another object without keeping the objects tightly coupled.
  • It is also used when the framework we are writing needs to be enhanced in the future with new observers with minimal changes.
  • When a change to one object requires the change of a variable number of other objects (not necessarily known at compile-time).
  • When an object should be able to talk to another object, but you don't want them essentially dependent on each other.
Question4)(iib)
One example of the Observer pattern from outside of software is a radio station: It broadcasts its signal; anyone who is interested can tune in and listen when they want to. Give another example from "real-life" with explanation?
Answer)
  • Splitwise group: If anyone adds or updates an entry in the group for any amount, all members of the group get a notification about the update.
  • Facebook: If a user follows a post, the user is added as an observer; when another comment is received on the same post, a notification is sent to all the observers. It works the same way on Twitter or any other social media.
  • Cricket display: The scoreboard shows the total score, average score, and similar information according to the current status of the match. Whenever the score changes, the display board is refreshed. So the display board is the observer here, and the subject is the panel that sends the current score status to the board.
Question5.(a)
Name three Design Patterns which use the Singleton design pattern in their implementation.
Answer)
The Singleton design pattern can be used in the implementation of the following three patterns:
  • Abstract Factory: In the Abstract Factory pattern, an interface is responsible for creating a factory of related objects without explicitly specifying their classes.
  • Builder: The Builder pattern builds a complex object from simple objects, using a step-by-step approach.
  • Prototype: This pattern involves implementing a prototype interface which tells the object to create a clone of the current object.
Question5.(b)
When is singleton pattern usage unnecessary?
Answer)
  • Most of the time it is unnecessary.
  • When it's easier to pass an object resource as a reference to the objects that need it, rather than letting objects access the resource globally.
  • Everything that can be done with a singleton can be done with a class variable or method.
  •  And if something has to be unique, it has to belong to the class and not to the objects.
  • Therefore, good or bad, singletons are conceptually wrong.
Question5.(c)
Compare the operation of Adapter and Bridge Design Patterns.
Answer)
Adapter Design Pattern
  • The adapter makes things work after they're designed.
  • The Adapter pattern is more about getting your existing code to work with a newer system or interface.
  • It is useful to work with two incompatible interfaces.
  • Example: A card reader acts as an adapter between a memory card and a laptop. You plug the memory card into the card reader and the card reader into the laptop, so that the memory card can be read via the laptop.
Bridge Design Pattern
  • Bridge makes things work before they are designed.
  • The Bridge pattern lets you have alternative implementations of an algorithm or system.
  • It decouples an abstraction from its implementation so that both can vary independently.
  • Example: A circle can be drawn in various colors using the same abstract class method but different bridge implementer classes.
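
The two examples above (the card reader and the colored circle) can be put side by side in a short Python sketch. Every class name here is hypothetical and chosen only to mirror the examples.

```python
# Adapter: make an existing MemoryCard usable where a UsbDevice is expected.
class MemoryCard:
    def read_blocks(self):
        return b"photos"


class UsbDevice:
    """Interface the laptop already knows how to talk to."""

    def read_usb(self):
        raise NotImplementedError


class CardReader(UsbDevice):
    """Adapter: translates read_usb() into the card's read_blocks()."""

    def __init__(self, card):
        self._card = card

    def read_usb(self):
        return self._card.read_blocks()


class Laptop:
    def open(self, device):
        return device.read_usb()


# Bridge: the Shape abstraction and the Color implementation vary separately.
class Color:
    def fill(self):
        raise NotImplementedError


class Red(Color):
    def fill(self):
        return "red"


class Blue(Color):
    def fill(self):
        return "blue"


class Shape:
    def __init__(self, color):
        self.color = color  # the bridge to the implementer

    def draw(self):
        raise NotImplementedError


class Circle(Shape):
    def draw(self):
        return f"circle drawn in {self.color.fill()}"
```

The adapter is added after MemoryCard and Laptop already exist, while the bridge is a split planned up front so that new shapes and new colors can be added independently.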
Question5.(d)
Why is the Decorator Design Pattern better than the Adapter Design Pattern when handling interfaces?
Answer)
  • The Adapter pattern alters the interface. The Decorator pattern doesn't alter the interface: it implements the original object's interface, so that it can be passed to a method which expects the original object.
  • Adapter provides a different interface to its subject. Decorator provides an enhanced interface.
  • An adapter is meant to change the interface of an existing object. A decorator enhances another object without changing its interface, and is thus more transparent to the application than an adapter is. As a consequence, Decorator supports recursive composition, which isn't possible with pure Adapters.
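
Recursive composition can be seen in a small sketch in which each decorator exposes the same render method as the object it wraps. The class names (Text, Bold, Italic) and the show helper are invented for this illustration.

```python
class Text:
    """Original component."""

    def render(self):
        return "hello"


class Bold:
    """Decorator: same interface as Text, adds behavior around it."""

    def __init__(self, inner):
        self._inner = inner

    def render(self):
        return f"<b>{self._inner.render()}</b>"


class Italic:
    """Another decorator with the same interface."""

    def __init__(self, inner):
        self._inner = inner

    def render(self):
        return f"<i>{self._inner.render()}</i>"


def show(component):
    """Client code that only relies on the original render() interface."""
    return component.render()


plain = show(Text())
stacked = show(Italic(Bold(Text())))  # decorators wrapped around decorators
```

Because every wrapper keeps the render interface, the client function accepts a plain object or any stack of decorators without knowing the difference.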
Question5.(e) 
How do the Façade and Adapter Design Patterns make use of Interfaces?
Answer:
  • Facade defines a new interface, whereas Adapter reuses an old interface.
  • The Facade design pattern neither translates interfaces nor adds new functionality; instead, it just provides a simpler interface. So instead of the client directly accessing individual components of a system, it uses the facade. The Facade design pattern lets the client interact with a complex system through a much simpler interface and with less work.
  • The facade will then call individual components.
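
A minimal sketch of a facade, assuming a hypothetical ordering subsystem made of three components (Inventory, Payment, Shipping): the client calls one method, and the facade calls the individual components.

```python
class Inventory:
    def reserve(self, item):
        return f"reserved {item}"


class Payment:
    def charge(self, amount):
        return f"charged {amount:.2f}"


class Shipping:
    def schedule(self, item):
        return f"shipping {item}"


class OrderFacade:
    """New, simpler interface placed in front of the three components."""

    def __init__(self):
        self._inventory = Inventory()
        self._payment = Payment()
        self._shipping = Shipping()

    def place_order(self, item, amount):
        # The facade adds no new functionality; it only sequences the calls.
        return [
            self._inventory.reserve(item),
            self._payment.charge(amount),
            self._shipping.schedule(item),
        ]


steps = OrderFacade().place_order("book", 12.5)
```

The components keep their own interfaces and remain usable directly; the facade only spares the client from knowing their order and details.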

AUTOMATION TESTING - Technology SELENIUM L1

Question.
While using the accessor commands, if you want to initialize a variable and assign a value onto it, then we could use the
a) store command
b) init command
c) echo command
d) create command

Answer: store command

Question.
Which pane of the Selenium IDE provides insight into the current execution in the form of messages and helps us debug issues in case the test case execution fails?
a) Address Bar Pane
b) Test Case Pane
c) Log Pane
d) Error Pane

Answer: Log Pane

Question.
Each Selenium IDE test step can be divided into 3 components. What are they?
a) Command, Value, and Type
b) Command, Target, and Type
c) Command, Type, and Value
d) Command, Target, and Value

Answer: Command, Target, and Value

Question.
What would be correct selenese command, target and value if we want to enter the value 'steve@gmail.com' for id=email ?
a) Command -> enter Target -> id=email Value -> steve@gmail.com
b) Command -> enter Target -> steve@gmail.com Value -> id=email
c) Command -> type Target -> steve@gmail.com Value -> id=email
d) Command -> type Target -> id=email Value -> steve@gmail.com

Answer: Command -> type Target -> id=email Value -> steve@gmail.com

Question.
Every action/command that we use in the Selenium IDE is internally designed and developed as a function of
a) HTML function
b) Java function
c) XML function
d) JavaScript function

Answer: JavaScript function

Question.
Selenium IDE has a color coding component for reporting purpose. When the execution is done, what color is the test case marked in to signify the successful run of it?
a) Red
b) Purple
c) Blue
d) Green

Answer: Green

Question.
Which option of Selenium IDE can be used to ascertain that the locator value provided in the Target text box is indeed correct and identifies that designated web element on the GUI?
a) Search button
b) Find button
c) Select button
d) View button

Answer: Find button

Question.
Which of the statements describe assertTitle and verifyTitle commands correctly?
a) assertTitle will check that the title value has to be correct. Otherwise, it fails and it terminates the play of other steps. On the other side verifyTitle will check that the title value has to be correct, but if incorrect, it marks it as failed but proceeds with other steps

b) assertTitle or verifyTitle works the same way to check whether the title value is correct. There is no difference between them.

c) assertTitle will check that the title value has to be correct. Otherwise, it marks it as failed but proceeds with other steps. On the other side, verifyTitle will check that the title value has to be correct. Otherwise, it terminates the play of other steps.

d) assertTitle checks whether web page title is correct whereas verifyTitle checks whether the web application title is correct

Answer: assertTitle will check that the title value has to be correct. Otherwise, it fails and it terminates the play of other steps. On the other side verifyTitle will check that the title value has to be correct, but if incorrect, it marks it as failed but proceeds with other steps
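
The difference between the two command families can be simulated in plain Python. This is not Selenium code: run_steps is a made-up runner that only mimics the behavior described in the answer, where a failed assert stops the run and a failed verify is logged before the run continues.

```python
def run_steps(steps, actual_title):
    """Simulate a test run; each step is (command, expected_title)."""
    log = []
    for command, expected in steps:
        passed = expected == actual_title
        log.append((command, "passed" if passed else "failed"))
        if not passed and command.startswith("assert"):
            break  # an assert failure aborts the remaining steps
    return log


# A failed verifyTitle is recorded and execution continues ...
verify_first = run_steps(
    [("verifyTitle", "Wrong"), ("verifyTitle", "Home")], "Home")

# ... while a failed assertTitle ends the run before the last step.
assert_first = run_steps(
    [("assertTitle", "Wrong"), ("verifyTitle", "Home")], "Home")
```

In the first run both steps appear in the log, while in the second run the log stops at the failed assertTitle step.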

Question.
Which of the below is NOT an advantage of automated testing?
a) High ROI
b) Improves accuracy
c) Attended execution
d) Reduces human generated errors

Answer: Attended execution

Question.
Which of the options given is an 'Action' type of command of Selenese?
a) assertTitle
b) click
c) verifyTitle
d) waitFor

Answer: click

Question.
The technique or method used by the Selenium IDE to find and access a particular element on your web page is called
a) Recording technique
b) Playback technique
c) Locator strategy
d) User Experience strategy

Answer: Locator strategy

Question.
Custom-made commands that we create as JavaScript functions can extend the Selenium IDE by adding
a) Parameters
b) User Extension
c) File Logging Extension
d) Firebug Extension

Answer: User Extension

Question.
Which of the below is NOT a feature of Selenium?
a) Commercial
b) Has different products in its suite
c) Supports multiple language implementations
d) Supports multiple platforms

Answer: Commercial

Question.
Selenium IDE is implemented as a plug-in of
a) Google Chrome browser
b) Opera browser
c) Microsoft Edge browser
d) Firefox browser

Answer: Firefox browser

Question.
Which of the statements about the Selenese commands is FALSE?
a) All the selenese commands take mandatorily 2 arguments
b) Commands can be of action type, accessor type or assertion type
c) Commands of Selenium remain internally applied as JavaScript functions
d) Commands help us to perform some test steps

Answer: All the selenese commands take mandatorily 2 arguments

Question.
Which of the possibilities is NOT a disadvantage of Selenium IDE?
a) No support for iterations and conditional statements
b) No logging capabilities
c) No test script dependencies and grouping possible
d) No database testing

Answer: No logging capabilities

Question.
The Selenium IDE commands can be categorized mainly into 3 categories. What are they?
a) Actions, Renderers, Viewers
b) Actions, Accessors, Assertions
c) Actions, Recorders, Playback
d) Do, view, get

Answer: Actions, Accessors, Assertions

Question.
Which Selenium IDE plug-in is required to enable the tester to save log messages into an external file?
a) Debugger plugin
b) File Logging plugin
c) Log Metrics plugin
d) Error Metrics plugin

Answer: File Logging plugin

Question.
Which of the choices provided is NOT a feature of Selenium IDE?
a) Selenium IDE provides a recording and playback feature for ease of creating test cases
b) Selenium IDE delivers several selenese commands to help create test cases easily
c) Selenium IDE requires us to write functionalities in Java or any other programming language
d) Selenium IDE supports logging execution messages to external log files if required

Answer: Selenium IDE requires us to write functionalities in Java or any other programming language

Question.
The Selenese command that can be used to check if the application title is correct or not is
a) assertTitle
b) assertAppTitle
c) assertValidTitle
d) assertValidAppTitle

Answer: assertTitle

Brief about Book Library Data Warehouse System

Topic: BOOK LIBRARY
Subject: DATA WAREHOUSE
Prepared by: Sumit

Q1) Identify the business processes of interest to senior management in the industry (domain) allocated to your group.
Answer)
Major libraries have large collections and high circulation. Managing libraries electronically has resulted in the creation and management of large library databases, which serve the students and teachers who cooperate in this e-learning environment.

Below are some of the business processes of interest to senior management:
  • Variety of Books: Need to better understand which books customers want and are willing to pay for.
  • Fund the Books: Need to manage costs and cash flow so that the book library can continue to operate.
  • Make Library Reliable: It has to be a library that gets its customers the books they want on time.
  • Book Borrowing
A crucial part of a library is the human intermediary, the librarian. This intermediary connects the users to the information needed and can assist with advice about using the information retrieval systems and working with information.

Q2) List some questions that would be raised by senior management for improving the business process.
Answer)
There are many questions that can be asked by senior management for improving the above business process.
Some of the questions that will be asked are :
  • When was the item collected?
  • Which librarian registered it?
  • What is the item about?
  • At which branch library was the item registered?
Q3) To address the above-mentioned questions; propose a DW design (schema diagram).
Answer)
In general, a DW design follows four main steps:
Step 1: Identify the Business Process
Step 2: Declare the Grain
Step 3: Identify the Dimensions
Step 4: Identify the Facts

For our Book Library case, the steps are as follows:
  1. Business Process: Book borrowing is the business process.
  2. Declare the Grain: The second step is to declare the grain of the business process. In the book borrowing process, we declare a transaction issued in the library automation system as the grain, which means one item borrowed by one patron.
  3. Identify the Dimensions: The third step is to choose the dimensions. Dimensions represent how people describe and inspect the data from the process. The following dimension tables are used:
    • The Patron-Dimension describes the library patron’s characteristics. The attributes of Patron-Dimension include the name of the patron, gender, occupation, patron type, department, college, and so on.
    • The Item-Dimension describes every item belonging to the library. Its attributes describe the item, including call number, title, author, subject, classification, language, location, MARC, collecting source, and so on.
    • The Location-Dimension describes the branch libraries supervised by the city library. Its attributes include the name of the branch library, the name of the district where it is located, and the name of the region library.
    • The Date-Dimension describes every hour of one day, and its attributes include hour, date, week, month and year.
  4. Identify the Facts: The fourth step is to identify the facts. In the case of book borrowing, the fact measures the number of books borrowed. Since the grain declared in the prior step is a transaction in which one item is borrowed by one patron, the number of books borrowed in each fact row is equal to one.
  • The star schema is perhaps the simplest data warehouse schema.
  • It is called a star schema because the entity-relationship diagram of this schema resembles a star, with points radiating from a central table. 
  • The center of the star consists of a large fact table and the points of the star are the dimension tables.
Star Schema for Library Book Borrowing:
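As a minimal sketch, the fact table and the four dimensions described above could be declared as follows using Python's built-in sqlite3 module. The table names, column names and sample rows are assumptions made for illustration; only a subset of the attributes listed above is included.

```python
import sqlite3

# Hypothetical table and column names that mirror the dimensions described above.
SCHEMA = """
CREATE TABLE patron_dim (
    patron_key   INTEGER PRIMARY KEY,
    name TEXT, gender TEXT, occupation TEXT,
    patron_type TEXT, department TEXT, college TEXT
);
CREATE TABLE item_dim (
    item_key     INTEGER PRIMARY KEY,
    call_number TEXT, title TEXT, author TEXT, subject TEXT, language TEXT
);
CREATE TABLE location_dim (
    location_key INTEGER PRIMARY KEY,
    branch_name TEXT, district TEXT, region_library TEXT
);
CREATE TABLE date_dim (
    date_key     INTEGER PRIMARY KEY,
    hour INTEGER, calendar_date TEXT, week INTEGER, month INTEGER, year INTEGER
);
-- Central fact table: one row per borrowing transaction (the declared grain).
CREATE TABLE borrowing_fact (
    patron_key     INTEGER REFERENCES patron_dim(patron_key),
    item_key       INTEGER REFERENCES item_dim(item_key),
    location_key   INTEGER REFERENCES location_dim(location_key),
    date_key       INTEGER REFERENCES date_dim(date_key),
    books_borrowed INTEGER NOT NULL DEFAULT 1
);
"""


def build_star_schema():
    """Create the hypothetical library star schema in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


conn = build_star_schema()

# Made-up sample data. Only the location dimension is populated here;
# SQLite does not enforce foreign keys unless PRAGMA foreign_keys = ON.
conn.executemany(
    "INSERT INTO location_dim (location_key, branch_name) VALUES (?, ?)",
    [(1, "Central"), (2, "North")],
)
conn.executemany(
    "INSERT INTO borrowing_fact (patron_key, item_key, location_key, date_key) "
    "VALUES (?, ?, ?, ?)",
    [(1, 10, 1, 100), (2, 11, 1, 101), (1, 12, 2, 102)],
)

# Join the fact table to one dimension and sum the measure per branch.
borrowed_per_branch = conn.execute(
    """
    SELECT l.branch_name, SUM(f.books_borrowed)
    FROM borrowing_fact AS f
    JOIN location_dim AS l ON l.location_key = f.location_key
    GROUP BY l.branch_name
    ORDER BY l.branch_name
    """
).fetchall()
print(borrowed_per_branch)
```

Each management question from Q2 maps to a join between the fact table and one dimension: the date dimension answers "when", the location dimension answers "which branch", and the item dimension answers "what is the item about".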


Q4) List aggregations to improve the DW performance. Justify.
Answer)
  • Aggregates provide improvements in performance because of the significantly smaller number of records.
  • Aggregates allow quick access to Book Dimension data during reporting. Similar to database indexes, they serve to improve performance.
  • Aggregates are particularly useful in the following cases:
    • Executing and navigating query data is slow for a group of queries
    • You want to speed up the execution and navigation of a specific query
    • You often use attributes in queries
    • You want to speed up reporting with specific hierarchies by adding a level of a specific hierarchy.
  • If the aggregate contains data that is to be evaluated by a query, the query data is read automatically from the aggregate.
  • Query: Total sales for books during the first week of December 2000 for location Mumbai.
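A query of this kind can be answered from a small pre-computed summary table instead of the detailed fact rows. The following is a sketch only, using Python's built-in sqlite3 module: the table names, the `quantity` measure, the sample rows and the definition of "first week" as days 1 to 7 of the month are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical detailed fact table: one row per transaction.
conn.execute("CREATE TABLE book_fact (day TEXT, location TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO book_fact VALUES (?, ?, ?)",
    [
        ("2000-12-01", "Mumbai", 3),
        ("2000-12-05", "Mumbai", 2),
        ("2000-12-07", "Mumbai", 4),
        ("2000-12-08", "Mumbai", 5),  # falls in the second week
        ("2000-12-02", "Delhi", 7),
    ],
)

# Aggregate table: one row per (month, week of month, location).
# Assumption: week 1 covers days 1-7, week 2 covers days 8-14, and so on.
conn.execute(
    """
    CREATE TABLE book_weekly_agg AS
    SELECT substr(day, 1, 7) AS year_month,
           (CAST(substr(day, 9, 2) AS INTEGER) - 1) / 7 + 1 AS week_of_month,
           location,
           SUM(quantity) AS total_quantity
    FROM book_fact
    GROUP BY year_month, week_of_month, location
    """
)

# The report reads a single row from the aggregate instead of scanning the facts.
total = conn.execute(
    "SELECT total_quantity FROM book_weekly_agg "
    "WHERE year_month = '2000-12' AND week_of_month = 1 AND location = 'Mumbai'"
).fetchone()[0]

fact_rows = conn.execute("SELECT COUNT(*) FROM book_fact").fetchone()[0]
agg_rows = conn.execute("SELECT COUNT(*) FROM book_weekly_agg").fetchone()[0]
print(total, fact_rows, agg_rows)
```

In this sketch the query names the aggregate table explicitly. A warehouse product that redirects queries to a matching aggregate automatically would hide that step from the report author.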

Q5) List and justify any 5 metadata items that will be of interest to various stakeholders.
Answer)
  • Metadata means "data about data". 
  • Metadata is defined as data that provides information about one or more aspects of other data. It is used to summarize basic information about the data, which makes the data easier to track and work with.
  • Below are metadata items of interest to various stakeholders:
    • Purpose of the book
    • Time and date of issuing the book
    • Creator or author of the book
    • Location on a computer network where the book was issued.
    • Book quantity
    • Book quality
Types of Metadata:
  • Descriptive metadata is usually used for search and identification. Examples that help to find an object are title, author, topic, keyword, and publisher.
  • Administrative metadata provides information to help manage the source. Administrative metadata refers to the technical information, including file type, or when and how the file was created.
  • Structural metadata describes how components of an object are organized. An example of structural metadata will be how the pages are ordered to make chapters of a book.
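One way to picture the three types is as separate groups of fields attached to the same book. The sketch below uses Python dataclasses; every class name, field name and sample value is an assumption made for illustration and does not follow any particular metadata standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DescriptiveMetadata:
    """Used for search and identification."""
    title: str
    author: str
    keywords: List[str] = field(default_factory=list)


@dataclass
class AdministrativeMetadata:
    """Technical information that helps manage the source."""
    file_type: str
    created_on: str  # ISO date string


@dataclass
class StructuralMetadata:
    """How the components of the object are organized."""
    chapters: Dict[str, List[int]]  # chapter title -> ordered page numbers


@dataclass
class BookRecord:
    descriptive: DescriptiveMetadata
    administrative: AdministrativeMetadata
    structural: StructuralMetadata


def find_by_keyword(records: List[BookRecord], keyword: str) -> List[str]:
    """Return the titles of records whose descriptive keywords include `keyword`."""
    return [r.descriptive.title for r in records if keyword in r.descriptive.keywords]


# Made-up sample records.
records = [
    BookRecord(
        DescriptiveMetadata("Warehouse Basics", "A. Author", ["data warehouse", "schema"]),
        AdministrativeMetadata("pdf", "2018-08-04"),
        StructuralMetadata({"Introduction": [1, 2, 3], "Star Schema": [4, 5]}),
    ),
    BookRecord(
        DescriptiveMetadata("Library Science", "B. Writer", ["library"]),
        AdministrativeMetadata("epub", "2017-01-15"),
        StructuralMetadata({"Overview": [1, 2]}),
    ),
]

matches = find_by_keyword(records, "schema")
print(matches)
```

Only the descriptive fields take part in the search; the administrative and structural fields travel with the record for management and navigation.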
The following are some key items to include in the metadata:

Definition of data warehouse − It includes the description of the structure of the data warehouse. The description is defined by schema, views, hierarchies, derived data definitions, and data mart locations and contents.

Operational Metadata − It includes currency of data and data lineage. Currency of data means whether the data is active, archived, or purged. Lineage of data means the history of the migrated data and the transformations applied to it.

Business metadata − It includes the data ownership information, business definitions, and changing policies.