ACL Digital

Home / Blogs / Decoding Databases
March 20, 2024
9 Minutes read

Decoding Databases

In today’s digital world, data reigns supreme. It’s the lifeblood of the information age, a vast treasure trove of information and facts. Data is the key to smoothly functioning everything from government to local organizations. A company’s success vastly depends on how well they handle and utilize its data. Here is where the database comes into the picture because you need the database to store data in any form, such as video, file, or plain text.

The database can be considered as a container filled with data or information that is electronically stored in a computer system. Data in any form can be stored in the database and can be easily accessed, modified, protected, and analyzed. Thus, the database is defined as the collection of related data or information that is stored and organized so that it can be easily accessed.

Evolution of DB

The first type of database was the flat-file database, and the data was stored in simple files like CSV files or fixed-length files, etc. Later, the hierarchical database emerged, and then the network database. Both databases stored data through parent-child relationships but could not store complex data relationships. Hence, it will soon be replaced by relational databases and non-relational databases. 

DBMS

A Database Management System (DBMS) is software that stores, organizes, and retrieves data in a database. It has built-in programs to process data retrieval and post requests. Thus, database and DBMS are not the same.

MySQL, SQL Server, PostgreSQL, Ne04j, Cassandra, MongoDB, and Oracle are some examples of DBMS.

Types of Databases

Data organization is crucial in today’s digital world. Databases provide a structured approach to store and manage information. This section explores two main database categories: Relational and Non-Relational databases.

Relational Database

A Relational database is the collection of data with a predefined relationship between the collections. These items are stored as sets of tables with rows and columns. Each column in a table defines the data type of fields for the record, whereas each row represents the record, which is called an entity. Each row in a table is marked with a unique identifier, which is called a primary key, and this primary key is used by other table record columns as the foreign key.

A unique programming query language, Structured Query Language (SQL), retrieves data in relational DBs. The most used DBMSs (R-DBMSs) are MySQL, MS SQL Server, Postgres, MariaDB, Oracle, etc.

SQL databases also comply with ACID (Atomicity Consistency Isolation and Durability). Atomicity ensures that the transaction is successfully executed, or if a part fails, the entire transaction is invalidated. When writing data to a database, consistency is crucial to ensure that the data adheres to all defined rules, restrictions, constraints, cascades, and triggers that have been established. Maintaining isolation is also critical to achieving concurrency control and guaranteeing that each transaction remains independent. Finally, durability is essential, as all changes made to the database must be permanent once a transaction has been completed. In simple terms, data validity is guaranteed whenever there is a transaction in the database, even if there are network or hardware failures, which are essential things like banks and financial institutions. 

The key point in SQL databases is that it organizes data in their most diminutive standard form. However, a potential drawback here is that it requires a schema at the front. Working with SQL can be more challenging if one needs to learn the correct way to shape up front. Hence, this type of database could be better for unstructured data. And as the schema is pre-defined, it becomes difficult to scale the table horizontally, it will take some effort and change in the schema which needs to be carefully planned. Databases like Cockroach DB help us to scale the DB horizontally.

Non-Relational Database

Unlike a relational database, which stores data in a table and rows, a non-relational database uses a storage model optimized depending on the type of data to be stored. These databases are often called NoSQL databases. There are several categories of databases under non-relational databases. It includes:

  • Key-Value Database
  • Wide Column Database
  • Document Database
  • Graph Database
  • Search Engine Database
  • Multi-Modal Database and so on.

Each database stores data differently and helps manage a specific data type.

Key-Value Database

The key-value database, also called a key-value store, is the most straightforward non-relational database; the name suggests, every data stored in this database will be assigned to a key. To store data, you provided the key and the blob of data such as an image, text file, JSON object, etc. Once saved, give the key to retrieve the data.

Key-value databases are handy for storing certain types of data, such as configuration, state information, or any data that a dictionary, JSON, or a hash in a programming language might represent. Key-value stores are commonly used for machines in-memory data caching to speed up applications by minimizing reads and writes to slower disk-based systems, but this limits the amount of data you can store.

One cannot perform queries with the key-value store, so the data modeling options are minimal. So, the key-value pair makes the DB significantly faster by trading off space and querying. This database reduces data latency but cannot be used as a main database for the application. However, it can still be used in the application for caching, pub/sub, and leaderboard kind of stuff. The commonly used key-value DBs are Redis and Memcached.

In real-time GitHub, Twitter and Snapchat use this database to enhance the performance and speed of their services. 

Wide Column Database

The commonly used Wide Column databases are Cassandra and Apache HBase. It is sometimes called a Column family database. This database overcomes the limitation of the key-value store by adding a second dimension to the array. 

This DB also stores data in rows and columns but not in a table like a relational database; instead of tables, they use structures called column families. Column families contain rows of data, each with its structure or schema. Each row comprises a unique row identifier and sets of column names and values. Each row can have a different number of columns and other data types. Since it is a relational database without a defined schema, the data can be accessed by a querying language called CQL. Unlike a relational database, the Wide Column Database can be scaled horizontally. 

This database is an ideal time-series, historical record, and application where data read is low and write is high like records from IoT devices, and sensors. In real-time, this database is used on Netflix to store the different shows users watch.

Document Database

If frequent reads are essential for the application, then the Document Database is preferred over the Wide Column Database. Document databases, otherwise called document stores, store data as key-value pairs and use a unique key to identify data stored in the database. Unlike key-value DB, document databases store data in a structured format called documents, which often use JSON, VSAN, or XML format.

Though each document within this database has structured data, this DB does not require a schema, which means there is no specific format for all documents. Each document can have its structure, which the database understands, and can be grouped into collections. These collections can be indexed and can be organized into a logical hierarchy. 

These key-value documents can be queried and analyzed, but these  document DB queries don’t support Join. So, instead of normalizing data into a small part, this DB stores every required data in a single document. It creates a trade-off, where reading from the front-end application is much faster; however, writing or updating data tends to be more complex.

Document DBs are very easy to use from a developer’s perspective, and they are suitable for any applications where reading is more often used, and data structure is unknown. The most used Document DBs are MongoDB, Firebase Firestore, Dynamo DB, and Couch DB.

Graph Database

Graph Database follows different approaches to form relationships between data. Unlike a relational database that uses tables and foreign keys to map the relations between data, a Graph database forms relations using nodes and properties. Data is represented through nodes, and each node can have multiple properties. Between these nodes, edges or relationships represent different types of connections. In simple terms, the relationship itself is data. 

Unlike SQL DB, the foreign key and join are not required to define the relationship; they represent an edge and connect it with other records. Thus, the middleman join table can be removed entirely in graph DB, and querying data with a statement is much more concise and readable.

A graph database is ideal for applications with data where relationships or connections between data are most important. A commonly used DBMS is Neo4j DB.

Conclusion

Both relational and non-relational databases have their own merits and demerits. We need to choose the correct database(s) for our application based on the need. Otherwise, it will be like going to a gunfight with a sword. 

What’s in Store

Not all NoSQL databases are covered in this blog. Other essential database models and DBMS to be noted are Time-series Database, Multimodal Database with Fauna DB and GraphQL, Data Warehouse, Data mining, and centralized and distributed DB; this list does not end here.

References

Turn Disruption into Opportunity. Catalyze Your Potential and Drive Excellence with
ACL Digital.

Scroll to Top