What is a data modelling schema | Types of Schemas
A data modelling schema is similar to a relational schema, i.e., a formal database specification. It is the database’s representation of data in a database; similarly, a relational schema is a standard specification of a relational database.
For example, an address book may be modelled with the database tables “people”, “phones”, “addresses”, and “name_value” columns for each, along with appropriate foreign key references between them. This is a data modelling schema of the database, although it is often referred to as “the schema”.
This is often in the form of a graphic diagram, but other means of communication are also desirable; people who are not in the computer field may or may not understand diagrams and graphics.
The data modelling schema contains information about the types of data you are using, the variables that are used, and any additional metadata you want to attach to the dataset. It is possible to create a data modelling schema without using data, such as when you create a model, but it is useful to develop a data modelling schema for your data sources because it allows you to use the metadata available to you in the data source.
Types of Schema
There are two major types of schema; logical and physical.
Similar to a DDL script, a logical schema defines a logical representation of an analytical data model. It does not have to contain all of the schema’s physical specifics. Typically, it is specified using a logical data modelling language such as ERD (Entity Relationship Diagram), MSL (Modified Structured Query Language), or UML (Unified Modeling Language) (Unified Modeling Language). A logical schema is the data model that you construct as part of your conceptual or logical database architecture.
A physical schema contains all of the schema’s physical details. It typically contains all of the DDL necessary to build a physical implementation of the logical schema, including the database engine used, the database table structure, primary keys, indexes, foreign keys, primary key constraints, relationships between tables, constraints, tablespace information, and storage format.
In a project, both a logical and physical schema might be used. If you are aware of all the physical details of the schema and do not require logical representation, a physical schema can be developed. If you do not know all of the physical specifics of the schema but are working with a logical design, you can develop a logical schema.
While both logical and physical schemas can be established using any database system, a logical schema is more effective when your data model is still in its infancy.
A schema can be built at any point during the project’s lifecycle. You can develop a physical schema before to creating a data model, or you can develop a logical schema after developing a data model. The benefit of a physical schema is that it contains all of the schema’s physical information. The physical data can be used to develop a script that can be invoked to construct the physical implementation.
Manually creating a data modelling schema
When you create a model manually, you create a data modelling schema for your data before creating a model. If you create a data modelling schema before creating a model, you can attach metadata from your data sources, and that metadata will be available in your model when you create a model with that data source.
ER Diagram as Data Modelling Tool
The ER diagram is a graphic tool that facilitates data modelling. ER diagrams are a subset of “semantic models” in database parlance. Semantic models refer to models that intend to elicit meaning from data. ER diagrams are not the only semantic modelling tools but are common and popular.
When we begin to discuss the contents of a database, the data model helps us to decide which piece of data goes with which other piece of data on a conceptual level. An early concept concerning databases is to recognise that there are levels of abstraction we can use in discussing databases.
For example, if we were to discuss the filing of “names,” we could discuss (a) abstractly, that is, “We will file names of people we know”; or (b) concretely, that is, “We will file first, middle, and last names (20 characters each) of people we know, so that we can retrieve the names in alphabetical order by last name, and we will put this data in a spreadsheet format in package x.” If a person is designing a database, the first step is to abstract, and then refine the abstraction.
The longer one stays away from the concrete details of logical models (relational, hierarchical, network) and physical realisations (fields [how many characters, the data type…. and files (relative, spread-sheet, … the easier it is to change the model and to decide how the data will eventually be physically realised (stored). When we use the terms “field” or file, we will refer to physical data instead of conceptual data.
Mapping is the process of choosing a logical model and then moving to a physical database file system from a conceptual model (the ER diagram). A physical file loaded with data is necessary actually to obtain data from a database. Mapping is the bridge between the design concept and physical reality. This book concentrates on the relational database model due to its ubiquitousness in contemporary database models.
So, What Is an Entity Relationship Diagram?
The ER diagram is a semantic data modelling tool used to accomplish the goal of abstractly describing or portraying data. Abstractly described data is called a conceptual model. Our conceptual model will lead us to a “schema.”
A schema implies a permanently fixed description of the structure of the data. Therefore, we can call it a schema when we agree that we have captured the correct depiction of reality within our conceptual model, our ER diagram.
An ER diagram could also be used to document an existing database by reverse engineering it. Still, in introducing the subject, we focus on the idea of using an ER diagram to model a to-be-created database, and we deal with reverse engineering in further discussion.
Conclusion
The schema of a database is merely a method of describing the data; it is not the data itself. It could be represented in an in-memory object such as an array or hash, but this would be inefficient for performance reasons.