Database - Marcus Noll

## Relational database Also often referred to as SQL database. It stores structured data in tables. Tables can be connected with each other. One must define the schema of those tables before actually storing data. SQL is also the name of the language for querying the data. Relational databases are widely adopted. SQL as a query language is easy to learn and a lot of engineers know it well. A drawback of relational databases is that they can become expensive with massive amounts of data. Usually compliant with the [ACID properties](https://en.wikipedia.org/wiki/ACID). ## Non-relational database Also often referred to as NoSQL ("not-only SQL") database. It stores unstructured data (which does not fit into rows and columns as in a [[#Relational]]) database. No need to define the schema beforehand. Querying the data happens through custom query languages, sometimes also through SQL. Non-relational databases are flexible and scalable, but can become hard to maintain with a growing amount of data. Also, having to learn different query languages can become a burden. They follow the [BASE principle](https://stackoverflow.com/questions/3342497/explanation-of-base-terminology). Only some offer compliance with the [ACID properties](https://en.wikipedia.org/wiki/ACID). ### Types of non-relational databases #### Key-value database Examples: [[AWS products#Amazon DynamoDB]] and [Redis](https://redis.com/) Stores data stored as key-value pairs. Each key must be unique. The value can be anything from a number or string to a complex object. #### Document database Examples: [[MongoDB]] Stores data as hierarchical structure in documents, e.g. in JSON, XML or YAML. #### Graph database Examples: [Neo4j](https://neo4j.com) Stores data as nodes and relationships between them. ## Terminology ### Normalization > Database normalization is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. > -- https://en.wikipedia.org/wiki/Database_normalization ### Denormalization > Denormalization is a strategy used on a previously-normalized database to increase performance. > -- https://en.wikipedia.org/wiki/Denormalization ### Upsert The term "upsert" is a blend of **update** and **insert**. It refers to a database operation that changes/inserts data as follows: - If the specified data already exists, it gets updated with the specified value - If the specified data does not yet exist, the operation inserts the new data with the specified value ### Index An index could speed up data retrieval, especially when the values of the indexed field have low variety. Keep in mind that having an index in place can negatively impact the performance of create, update and delete operations, since each of these has to update the index. An index also needs storage.