25+ [ SURE SHOT ] Data Modeling Interview Questions & Answers
Data Modeling Interview Questions and Answers

25+ [ SURE SHOT ] Data Modeling Interview Questions & Answers

Last updated on 04th Jul 2020, Blog, Interview Questions

About author

Siddharth (Sr Architect - Data Modeling )

He is Possessing 7+ Years Of Experience in Data Modeling. His Passion lies in Developing Entrepreneurs & Activities. Also, Rendered his intelligence to the Enthusiastic JOB Seekers.

(5.0) | 15212 Ratings 4264

Data modeling is an important aspect of designing and structuring databases or information systems to represent and organize data in a meaningful way. In order to give database developers a blueprint, it entails constructing abstract representations of the data and their relationships.This doesn’t worry about implementation specifics; instead, it represents the overall data set from an abstract, high-level perspective. The entities (items or concepts) in the system and their interactions with one another are the main points of emphasis.

1. What is a Data Model?

Ans:

A Data Model is a conceptual representation of a database’s business requirements or physical attributes that help communicate business requirements with clients. Data models are optical displays of an enterprise’s data components and relationships between them.

Data Models

2. How can I define data modelling?

Ans:

The approach that is used to prepare the data model is called Data Modeling. Data modeling is the method of producing the smooth sketch of a software system and the data components it comprises, utilizing text and symbols to show data and workflow.

3. Name possible type of Data Model.

Ans:

Data models are optical displays of an enterprise’s data components and relationships between them. By describing and organising data in terms of the related business methods, data models help to develop proper information systems.

4. What is contained in the physical data model?

Ans:

A physical data model contains the Table, key constraints, columns, unique key, foreign key, default values, indexes, etc. A physical data model refers to the database-particular model that showcases the relational data objects comprising tables, columns, and primary as well as foreign keys along with relations.

5. What is contained in the logical data model?

Ans:

A logical data model refers to the model that does not correspond to a database elaborating things about which the firm wishes to gather data and relation among those objects.

6. Differentiate physical data models from physical data modelling?

Ans:

A physical data model contains the physical attributes of a database. The database performance, physical storage, and indexing strategy are the essential considerations of a physical data model.

7. How will I differentiate logical data models from logical data modelling?

Ans:

A logical data model launches a framework of data components and connections between them. It does not rely on a physical database that delineates the way the information will be implemented.

8. Differentiate physical data model and logical data model?

Ans:

  Aspect Logical Data Model Physical Data Model
Purpose

Objective represents the ideas and requirements of the business.

Exemplifies the management, access, and storage of data.
Abstraction Level High-level abstraction that is centered on business ideas is called abstraction level. Low-level abstraction; addresses specifics of implementation.
Entities and Relationships Describes entities, their characteristics, and their connections Reflects constraints, tables, columns, and indexes.
Normalization

Complies with normalization standards to get rid of duplication.

Denormalization might occur for performance enhancement.
9. What is an entity (Table)?

Ans:

A database consists of multiple rows and columns, called a table. Further, each column has a specific data type, and constraints are set for columns based on conditions. So an entity refers to an object that is present in the actual world.

10. What is the attribute (Column)?

Ans:

A Column is defined as a vertical alignment of data and information stored for a particular column. A column-store database management system refers to the DBMS that indexes every column of table, gathering indexes in line of row data.

11. What is row?

Ans:

A row is the set of tuples or records, or it could be taken as a horizontal arrangement of data. Horizontal arrays are called rows in matrix. Relational databases comprise the tables containing rows and columns.

12. What is a Slowly Changing Dimension?

Ans:

ERD is the logical representation of entities and defines relationships among entities. Entities are given inboxes, and relationships are given by arrows.

13. What is the primary key constraint in the database?

Ans:

The primary key constraint is the set on column to avoid null values or duplicate values. In simple words, columns containing unique items can be defined as the primary key constraint.

14. What is the composite primary key constraint?

Ans:

When more than a single column can be defined as the primary key constraint, it becomes composite. A composite key denotes the various columns for a primary key or foreign key constraint.

15. What is foreign key constraints?

Ans:

The Primary key can be defined for a parent table, and the foreign key is set for the child table. The foreign key constraint always refers to a primary key constraint in the main table.

16. What about surrogate keys?

Ans:

When a numerical attribute is enforced on the primary key in the table, it is called a surrogate key. This could be defined as the substitute for natural keys.

17. Why is composite word added before any key constraint?

Ans:

Keys and Constraints are the protocols that state what data values are permitted in a few data columns.

18. Name a few popular relationships within the data model.

Ans:

These are identifying, non-identifying, and self-recursive relationships in the data model.An identifying relationship is one where an example of a child entity is detected through linkage with the parent entity.

19. What is identifying relationships in a data model?

Ans:

The parent and child table is connected with a thin line. When the referenced column in the child table is a part of the primary key in the parent table, then those relationships are drawn by thick lines and named as identifying relationships in the data model.

20. Is there exist any non-identifying relationship too?

Ans:

In most cases, a parent table and the child table are both connected with a thin line. When a referenced column in a child table is not a part of the primary key in the parent table, then those relationships are drawn by the dotted line, and it is named non-identifying relationships in the data model.

    Subscribe For Free Demo

    [custom_views_post_title]

    21. How will define cardinality in a data model?

    Ans:

    Cardinality is the arithmetic concept. It converts into the count of components present in a set. Cardinalities are used to define the relationships; they could be one-to-one, one-to-many, many-to-many, etc.

    22. What is meant by self-recursive relationships?

    Ans:

    This is a standalone column in the table connected to the same table’s primary key and is named the self-recursive relationship here. Where there is a relation between two given entities of identical type, it is termed the recursive relation.

    23. Explain enterprise data model.

    Ans:

    All -important entities are defined as related to enterprise.. This relationship is defined as the enterprise data modelling. To understand this model in the best areas, should divide the data models into the subject areas.

    24. Define relational data model?

    Ans:

    The relational data model is a visual representation of data objects within a database. The approach used to create the relational data model is called relational data modelling.

    25. What is OLTP data modelling?

    Ans:

    OLTP or Online Transaction Processing is an approach where the data model is specially created for a transaction, and the approach is named OLTP data modelling.

    26. What is conceptual data modelling?

    Ans:

    It will give detailed information related to entities, attributes, or relationships between them. A conceptual data model must be used to state and convey supreme-level connections between the ideas and entities.

    27. What is constraint?

    Ans:

    This is a rule imposed on data. Different types of the constraints could be unique, null values, foreign keys, composite keys etc/.A database requires the constraints to be trustworthy and of supreme quality.

    28. Define unique constraints for the database.

    Ans:

    This constraint is added to avoid the duplicate values within a column. Unique constraints refer to rules that values of key are authentic only when they are distinct.

    29. Define check constraint.

    Ans:

    Check constraint is helpful in defining a range of values within a column. It is a rule that denotes values that are permitted in a single or more column of each row of the base table. A check constraint can be defined to guarantee that values in the column comprising ages are positive numbers.

    30. What is the index in the database?

    Ans:

    An Index is composed of the set of columns or a single column that is needed for a fast retrieval of data. Index is a replica of chosen columns of data, that is created to allow the proper search.

    Course Curriculum

    Learn Data Modeling Training to Advance Your Career with Expert-led Course

    • Instructor-led Sessions
    • Real-life Case Studies
    • Assignments
    Explore Curriculum

    31. What is the sequence?

    Ans:

    A sequence could be defined as a database object needed to create a unique number. Sequences are meant for the task of producing distinct key values.

    32. How will database normalisation be?

    Ans:

    To simplify data based on standard rules, database normalisation is needed. It is a way of arranging data into the tabular form in a way so that outcomes of utilising the database are always clear and as desired.

    33. What do data modellers use normalisation for?

    Ans:

    • Eliminate the unnecessary or redundant data
    • Reduce a data complexity
    • Ensure the relationships between tables in addition to data within a table
    • Ensure data dependencies so that data is stored logically.

    34. What is denormalization?

    Ans:

    Denormalization is a technique of adding redundant data to an already normalised database. This step improves the read performance at the expense of write performance.

    35. What is Metadata?

    Ans:

    Metadata describes data about data. Shows what data is actually stored in a database system. It is data offering information regarding one or more aspects related to data.

    36. What is data mart?

    Ans:

    The data mart is the summary version of a data warehouse designed to be used by specific departments, units, or user groups within an organization.

    37. Mention a few OLTP System examples.

    Ans:

    • Send SMS
    • Add Book to Shopping Cart
    • Online Ticket Booking
    • Online Banking
    • Order Entry

    38. What are types of normalisation ?

    Ans:

    • First normal form
    • Second normal form
    • Third normal forms
    • Boyce-Codd fourth
    • Fifth normal form.

    39. What is Forward Data Engineering?

    Ans:

    Forward engineering is the technical term used to automatically transform the logical model into a physical working device. It is a way of producing a database schema from physical structure.

    40. What is PDAP?

    Ans:

    Data modelling refers to a way to build a visual display of either the entire information system or segments of it to convey interactions between the data points and framework

    41. What is a Snowflake Schema Database Design Description?

    Ans:

    A snowflake schema is an arrangement of dimension and fact tables. Both the tables are typically further divided into separate dimension tables. It is a multi-dimensional data structure that is an augmentation of star schema, where dimension tables are fragmented into the subdimensions.

    42. What is sequence clustering algorithm?

    Ans:

    A sequence clustering algorithm collects the set of data containing paths and events that are similar or related to each other. It is a special kind of algorithm that blends sequence analysis with the clustering.

    43. What are discrete and continuous data?

    Ans:

    Discrete data is a finite or defined data. B. Gender, phone number. Continuous data is data that changes continuously and regularly, for example, Age.

    44. What are Time Series Algorithms?

    Ans:

    Time Series Algorithms are methods of predicting continuous values ​​of data in the table. For example, employee performance can predict the benefits and impact.

    45. What is Business Intelligence?

    Ans:

    BI (Business Intelligence) is a set of processes, architecture, and technologies that transform raw data into the meaningful information that drives meaningful business action.

    46. What is data warehousing?

    Ans:

    • Data warehousing is a process of collecting and managing data from a variety of sources.
    • It provides meaningful insights into enterprise. Data warehousing is typically used to connect and analyze data from the disparate sources.

    47. What is the junk dimension?

    Ans:

    • A junk dimension combines the two or more related cardinalities into a single dimension.
    • These are typically boolean or flag values.
    • It is an easy way of categorising specifically low-cardinality flags as well as the indicators.

    48. Describes frequency of data collection.

    Ans:

    The frequency of data collection phases are:

    • extraction from the various sources
    • transformation
    • cleaning
    • preservation.

    49. What is database cardinality?

    Ans:

    Cardinality is the numeric attribute of the relationship between two entities or an entity set. Database cardinality means a range of components organised in tables and rows.

    50. What are different types of cardinal relationships?

    Ans:

    • One-to-one relationship
    • One-to-many relationship
    • Many-to-one relationship
    • Many-to-many relationship
    Course Curriculum

    Join Data Modeling Course with Global Recognised Certification

    Weekday / Weekend BatchesSee Batch Details

    51. What is granularity?

    Ans:

    Granularity represents a level of information stored in the database table. To make it high or low, you can use a table that contains the transaction-level table and a fact table.

    52. What does data sparsity mean?

    Ans:

    Data sparseness refers to a number of empty cells in a database. It represents the amount of data available in the particular dimension of the data model.

      53. What is hashing?

      Ans:

      hashes are helpful for searching the index values ​​to get data needed. It is used to calculate the direct location of the data using an index. It is a way to convert a specific key or a string of characters into a different value and integer.

      54. What is OLAP?

      Ans:

      • OLAP stands for On-line-Analytical processing and is designed to help the managers, executives, and analysts gain insights faster, more confidently, more consistently, and more interactively.
      • It is the type of technology that makes an OLAP used in intelligent solutions, including the planning, budgeting, analysis, forecasting, simulation models, and more.

      55. What is a recursive relationship?

      Ans:

      A recursive relationship occurs when there is a relationship between the entity and itself. These relationships are complex and require a more sophisticated approach to mapping data to the schemas.

      56. Mention common errors faced in Data Modeling.

      Ans:

      Creating a huge data model: When tables are run greater than 200, the data model gets more intricate, thereby raising the probability of failure.

      Redundant surrogate keys: Surrogate keys should be utilised when the natural key is incapable of performing the role of primary key.

      The objective is absent: Circumstances might come where the user has no idea regarding business targets and objectives.

      57. What are subtype and supertype entities?

      Ans:

      It is possible to break down the entities into various sub-entities or categorise them by a certain attribute. Every sub-entity has valid features and is referred to as a subtype entity.

      58. Why NoSQL databases are more beneficial compared to Relational Databases?

      Ans:

      • They can gather structured, semi-structured, and unstructured data.
      • They possess the dynamic schema, which implies they can transform rapidly as per requirements.
      • They provide the failover and enhanced recovery alternatives.

      59. What about conformed dimensions?

      Ans:

      Conformed dimensions refer to dimensions that carry the same meaning and values over the various fact tables and subject areas. A conformed dimension related to time can possess the same features and stages of granularity for the sales, and inventory.

      60. What is meant by data mining?

      Ans:

      Data mining refers to the multi-disciplinary potential that utilises machine learning, statistics, Artificial Intelligence, and database technology. It refers to inventing previously unrecognised connections existing among data.

      61. Explain predictive modelling analytics.

      Ans:

      The method of authenticating and examining models which would be utilised to forecast testing and authentication results. It can be utilised for machine learning, AI, and also statistics.

      62. What is hierarchical DBMS?

      Ans:

      Hierarchical DBMS refers to the database where model data is arranged in tree-like pattern. Data is gathered in hierarchical form. Data is shown utilising the parent-child connection.

      63. Mention disadvantages of hierarchical data model.

      Ans:

      • It is not adaptable since it consumes time to get accustomed to changing requirements of a business.
      • The model poses problems in inter-departmental interaction, vertical interaction, and inter-agency interaction.
      • Hierarchical data models can produce the issues of disunity.

      64. What is a process-driven approach to data modelling?

      Ans:

      The process-driven approach utilised in the data modelling adheres to stepwise technique on connection between the entity-connection model and organisational method.

      65. Mention benefits of data modelling

      Ans:

      • It allows handling business data by normalising it and stating features.
      • Data modelling unites data of different systems to decrease a data overabundance.
      • It allows us to produce great database design.
      • Data modelling allows the organisation section to work coherently as a team.

      66. What is UML?

      Ans:

      UML stands for Unified modelling language it refers to the usual purpose, database development, and modelling language in the domain of software engineering.

      67. What is a network model?

      Ans:

      The network model refers to the model that is created on a hierarchical model. It facilitates greater than one connection to link records, which shows it contains the various records.

      68. What are compound keys?

      Ans:

      When greater than one field is used to display the key, it is known as compound key. A compound key resembles the composite key where two or more fields are required to produce distinguished value.

      69. What is the benefit of using keys?

      Ans:

      • Keys allow to detect any row of data existing in the table. A table can comprise millions of records in real-life.
      • Keys also guarantee that they can detect a table record overcoming challenges.
      • Keys let find a relation between and detect the same between tables.
      Data Modeling Sample Resumes! Download & Edit, Get Noticed by Top Employers! Download

      70. What is an alternate key?

      Ans:

      An alternate key refers to the column or set of columns existing in the table that simply detects each row present in the table. A table can contain the various options for the primary key, however, only a single one can be made the primary key.

      71. What are features of DBMS?

      Ans:

      • Offers protection and eliminates overabundance.
      • Self-explaining a character the r of database system.
      • Detachment between the data abstraction and programs.
      • Assists with the various visualisations of data.

      72. What is RDBMS?

      Ans:

      RDBMS stands for Relational Database Management System. It refers to the software that is meant to gather data in a pattern of tables. Here, data is handled and gathered in the rows and columns, which are called tuples and attributes.

      73. Mention cons of Data Model.

      Ans:

      • To create a Data model, you must know physical data gathered attributes.
      • This is a navigational way that creates intricate application development, and management.
      • Even tiny modifications made in a model need an alteration in the whole application.
      • There is no range of data manipulation language present in the DBMS.

      74. What is chained data replication?

      Ans:

      The circumstance where a secondary node chooses the goal utilising ping time or when the nearest node is secondary one, is termed as chained data replication.

      75. What is Virtual Data Warehousing?

      Ans:

      A virtual data warehouse offers the accumulated visualisation of entire data. A virtual data warehouse does not include the previous data and it is held as a logical data model containing a metadata.

      76. What is a snapshot in a data warehouse?

      Ans:

      Snapshot refers to the entire visualisation of data at a moment when the data extraction method starts. A snapshot can be utilised to record the activities.

      77. What is bi-directional extract?

      Ans:

      The power of a system to retrieve, clean, and store data in two ways is known as bi-directional extract. Bi-directional extract offers the upgrades that allow the process of a data loading to speed up.

      78. What are data model repositories?

      Ans:

      Data models and related data such as entity definition, feature definition, columns, and data types, are gathered in a repository, which can be easily accessed by the data modellers and the whole team.

      79. What is the star flake schema?

      Ans:

      In star flake schema you need to enter the required facts and primary keys of dimensional tables in the Fact table. Fact tables are the unification of the entire dimension table key.

      80. What are disadvantages of recursive relationships and how will they solve them?

      Ans:

      A Recursive relationship happens when there is connection between the entity and itself. For instance, a single-to-many recursive relation happens when a worker is manager of another worker.

      81. Mention kinds of dimensions in data modelling.

      Ans:

      Conformed dimension: These are utilised in the various tables in the database by letting measures and facts be classified and elaborated over data marts.

      Junk dimension: The features do not correspond to a fact table or any present dimension tables and they are generally text or flags showing Y or N.

      Role-playing dimension: There are various connections valid between itself and different tables.

      Degenerated dimension: These do not contain dimension tables but are segments of the primary key.

      82. What are factless fact tables in data modelling?

      Ans:

      A factless fact table refers to the table without facts. They just contain dimensional keys, and they grab events that happen at the information stage and not at the computation stage.

      83. Describe Critical Success Factor.

      Ans:

      A critical success factor (CSF) refers to a specific aspect or component that a group, department, or firm must properly use and intensify to suffice strategic goals.

      84. What is dimensional modelling?

      Ans:

      Dimensional Modelling refers to the data model style that is especially helpful for data storage in a warehouse. The target of dimensional modelling is to create the database in a more intense way to fetch data.

      85. What is the CAP theorem?

      Ans:

      The CAP theorem displays that no distributed system can guarantee C, A, and P simultaneously. In other words, it denotes that a distributed system cannot offer more than two assurances.

      86. Explain Amazon’s RDBMS service.

      Ans:

      The Amazon RDBMS service refers to a web service that is meant for arranging, tackling, and scaling relational databases in cloud systems. Standard relational database engines are assisted by the Amazon RDS, which is handled, stretched, and accessible on a demand.

      87. What is Amazon Aurora?

      Ans:

      Amazon Aurora is a high-availability, computed failover relations; database engine that assists the MySQL and PostgreSQL. Amazon Aurora is the mixture of MySQL and Postgres.

      88. What is Data Skew?

      Ans:

      While cooperating with the huge businesses with greater than 10,000 records, and can come over the problem of a data skew. Professionals mention a condition called ownership data skew when a one person possesses the various records.

      89. What is Snowflake’s data retention period?

      Ans:

      The data retention period is a prime component of Snowflake Time Travel. Snowflake preserves the earlier state of data in a tabular form when it is altered, such as when data is removed or objects comprising data are eradicated.

      90. What is the Power Pivot data model?

      Ans:

      The connections launched between the various data tables to properly organise data are called data models. There are certain data types, columns, associations, and tables available in the data model.

      Are you looking training with Right Jobs?

      Contact Us
      Get Training Quote for Free