But one of the most important considerations is to let the data model guide the choice. What does your data look like? Is it highly structured, or voluminous? What kinds of operations will you need to perform on it (full-text search, for example)?
Relational databases have been around for decades and have influenced every database in use today. They are good at storing highly structured data and can scale up well. Relational databases such as PostgreSQL and MySQL are mature and offer an expressive query language, SQL (Structured Query Language), for accessing data. They are considered rather inflexible, although their normalized schemas avoid duplicating data. Advocates of the other database types argue that storage has become so cheap that this tradeoff (flexibility vs. storage) no longer makes sense.
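To make the "structured, no duplication" point concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for a relational database. The `authors` and `posts` tables and their columns are made up for illustration; the point is only that the author's name is stored once and SQL reassembles it on demand.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id),
        title TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO posts (author_id, title) VALUES (?, ?)",
    [(1, "Choosing a database"), (1, "Indexing basics")],
)

# The author's name lives in one row; a JOIN attaches it to each post.
rows = conn.execute(
    "SELECT p.title, a.name FROM posts p "
    "JOIN authors a ON a.id = p.author_id ORDER BY p.id"
).fetchall()
```

Renaming the author here is a single-row update, which is the benefit the fixed schema buys in exchange for its rigidity.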
Document-based (or object-based) NoSQL databases store each record as a single, loosely structured blob in a popular data format such as JSON, with no fixed schema required, which gives developers extreme flexibility. Typically used for semi-structured data, this storage model is simple and allows massive read and write scalability. Access methods vary widely, from a RESTful API in CouchDB to MapReduce in MongoDB. The flexibility comes at a price: because records are stored as blobs, queries can be difficult, and the lack of an enforced schema has to be offset by developer discipline and vigilance, including designing the schema locally for each instance and documenting it.
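The flexibility and its cost can be sketched with plain JSON documents in Python. This is not the API of CouchDB or MongoDB, just an in-memory illustration; the field names and the `find_by_city` helper are hypothetical.

```python
import json

# Hypothetical documents: field names are illustrative only.
# The two records deliberately do not share the same fields.
raw_docs = [
    '{"_id": 1, "name": "Ada", "tags": ["db", "sql"]}',
    '{"_id": 2, "name": "Lin", "address": {"city": "Pune"}}',
]
docs = [json.loads(d) for d in raw_docs]


def find_by_city(documents, city):
    """Scan every document; any field may be missing, so use .get()."""
    return [
        d["_id"]
        for d in documents
        if d.get("address", {}).get("city") == city
    ]
```

Nothing stops a third document from spelling the field `adress`, which is why the schema has to live in the team's documentation instead of in the database.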
Graph databases offer a good compromise between structured tables and loose entities: nodes represent entities, and edges represent the relationships that connect them. Most graph databases, such as Azure Cosmos DB, come with a feature-rich set of tools for querying, evaluating and traversing complex networks. Usually used for highly interconnected data, they provide flexibility as well as structure. Graph databases are not recommended for simple use cases, where they add unnecessary overhead. There is also the additional challenge of learning to think in graph terms.
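"Thinking in graph terms" mostly means asking traversal questions, such as "who is within two hops of this node?". A rough sketch with an adjacency dictionary and a breadth-first search follows; the `follows` data and the `within_hops` function are invented for illustration and do not reflect any particular graph database's query language.

```python
from collections import deque

# Hypothetical "follows" graph: node -> list of nodes it points to.
follows = {
    "ana": ["ben", "cy"],
    "ben": ["dee"],
    "cy": ["dee"],
    "dee": [],
}


def within_hops(graph, start, max_hops):
    """Return the nodes reachable from start in at most max_hops edges."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return {n for n in hops if n != start}
```

The same question against relational tables would need one self-join per hop, which is where a graph model starts to pay for its overhead.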
Wide-column databases like Cassandra and HBase look relational, with tables, rows and columns, but let each row hold any number of columns, in a layout that is highly optimized for data retrieval even for massive amounts of data. They use a keyspace instead of a schema, so they offer the best of both worlds: part key-value store, part relational. They scale horizontally with ease, are simple to explore, are easier to update and are good at aggregating queries. However, they are usually slower than relational databases, writes can be expensive, and bulk updates are usually easier than individual ones.
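One way to picture the wide-column layout is as a key-value store whose values are themselves column maps. The sketch below is only an in-memory model, not how Cassandra or HBase are actually accessed; the row keys, column names and helper functions are all hypothetical.

```python
# Hypothetical table: row key -> {column name: value}.
# Rows do not have to share the same set of columns.
table = {
    "user:1": {"name": "Ada", "email": "ada@example.com"},
    "user:2": {"name": "Lin", "last_login": "2024-01-05", "plan": "pro"},
}


def get_column(rows, row_key, column, default=None):
    """Key-value style lookup: find the row by key, then one column."""
    return rows.get(row_key, {}).get(column, default)


def column_values(rows, column):
    """Collect one column across all rows, skipping rows that lack it."""
    return {key: cols[column] for key, cols in rows.items() if column in cols}
```

Adding a new column to one row requires no change to any other row, which is the flexibility the paragraph above describes.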
Time-series databases do one thing and they do it well. They store two-dimensional data: usually a timestamp and one more value. Applications in which only part of the data is time-series might be served well by a relational database.
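The typical time-series operation is "give me the values in this time window". As a small sketch, assuming the readings are kept sorted by timestamp, a binary search finds the window without scanning everything. The `readings` data and both functions are illustrative only.

```python
from bisect import bisect_left, bisect_right

# Hypothetical readings: (unix_timestamp, value), sorted by timestamp.
readings = [(100, 20.5), (160, 21.0), (220, 21.4), (280, 20.9)]
timestamps = [t for t, _ in readings]


def values_between(start, end):
    """Return values whose timestamp falls in [start, end], inclusive."""
    lo = bisect_left(timestamps, start)
    hi = bisect_right(timestamps, end)
    return [value for _, value in readings[lo:hi]]


def average_between(start, end):
    """Average over a time window, or None if the window is empty."""
    values = values_between(start, end)
    return sum(values) / len(values) if values else None
```

A relational table with an index on the timestamp column supports the same window query, which is why mixed workloads can often stay in a relational database.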