Day 1 – Database Types and When to Use Them

Today's Focus

Understand the landscape of database technologies. Not every problem needs the same database — the choice depends on the shape of the data, the access patterns, and the consistency requirements.

What is a Database?

A database is a structured system for storing, querying, and updating data that persists beyond a single program run. Without a database, data lives only in memory and is lost when the process exits.

A web API that stores its data in a list variable will lose all data every time the server restarts. A database solves this by writing data to disk (or keeping it in managed memory with persistence) so it survives restarts, crashes, and deployments.

The Main Types

Relational Databases (SQL)

Data is organised into tables with rows and columns. Every row conforms to a fixed schema. Relationships between tables are expressed with foreign keys. SQL (Structured Query Language) is used to query and manipulate data.

Property	Value
Data shape	Tables with fixed schemas
Query language	SQL
Consistency	ACID transactions
Best for	Structured data with clear relationships: users, orders, products, transactions
Examples	PostgreSQL, MySQL, SQLite, SQL Server

Document Databases

Data is stored as documents (JSON-like objects). There is no fixed schema — different documents in the same collection can have different fields. Good for data that naturally varies in shape.

Property	Value
Data shape	JSON documents in collections
Query language	Language-specific query API
Consistency	Varies; typically eventual
Best for	Content, product catalogues, user profiles, event logs
Examples	MongoDB, CouchDB, Firestore

Key-Value Stores

The simplest model: a key maps to a value. Extremely fast, usually in-memory. No complex querying — you look things up by key.

Property	Value
Data shape	Key → value
Query language	GET / SET commands
Consistency	Usually eventual
Best for	Caching, sessions, rate limiting, leaderboards, pub/sub
Examples	Redis, Memcached, DynamoDB (can be used this way)

Time-Series Databases

Optimised for data recorded over time: metrics, sensor readings, logs. The data model revolves around timestamps, and queries aggregate over time ranges. Examples: InfluxDB, TimescaleDB, Prometheus. Not covered in depth this week, but worth knowing they exist.

Graph Databases

Optimised for data where relationships are as important as the data itself. Nodes represent entities; edges represent relationships. Used for social networks, recommendation engines, and fraud detection. Examples: Neo4j, Amazon Neptune. Not covered this week.

The CAP Theorem

Distributed databases can guarantee at most two of three properties:

Consistency — every read gets the latest write
Availability — every request receives a response (not necessarily the latest data)
Partition tolerance — the system keeps working despite network splits between nodes

Partition tolerance is not optional in practice — networks do fail. So the real trade-off is between consistency and availability when a partition occurs.

Relational databases typically prioritise CP (consistency and partition tolerance). Many NoSQL databases choose AP (availability and partition tolerance), accepting that reads may return slightly stale data.

Choosing a Database

A practical decision guide:

Need to query across multiple relationships? → Relational
Data shape varies document to document? → Document
Need sub-millisecond reads, caching, or ephemeral data? → Key-value
Storing time-series metrics? → Time-series
Complex relationships between entities matter more than the entities themselves? → Graph
Default choice for a new project with unknown access patterns: PostgreSQL

Most production systems use more than one type. PostgreSQL for transactional data, Redis as a cache in front of it, and MongoDB for flexible content are a common combination.

Key Concepts

Term	Definition
RDBMS	Relational Database Management System — the software that manages a relational database (e.g. PostgreSQL)
SQL	Structured Query Language — the standard language for querying relational databases
Schema	The defined structure of a table or database: column names, types, and constraints
Collection	The MongoDB equivalent of a table — a group of documents
Document	A JSON-like record stored in a document database
Key-value	The simplest database model: a unique key maps to a single value
ACID	Atomicity, Consistency, Isolation, Durability — the guarantees of a relational transaction
CAP theorem	A distributed systems theorem: a database can guarantee at most two of Consistency, Availability, and Partition tolerance
ORM	Object-Relational Mapper — a library that maps database rows to objects in code (e.g. SQLAlchemy, GORM, Entity Framework)

Tasks

Research one real-world product that uses each database type and explain why it was chosen. For example: Instagram uses PostgreSQL; Redis is used for caching at almost every large web company; MongoDB is used by many content-heavy platforms. Look for engineering blog posts that explain the decision.
Visit the documentation home pages for SQLite, PostgreSQL, MongoDB, and Redis. Find the "getting started" guide for each. Note how each one describes its data model differently. What does PostgreSQL call a database? What does MongoDB call the equivalent?
Sketch on paper (or in a text file) a data model for a simple blog with posts, authors, comments, and tags. Think through:
- How would you represent this in relational tables? What are the tables, primary keys, and foreign keys?
- How would you represent a single post (with its author, comments, and tags) as a MongoDB document?
- What data would you cache in Redis, and what TTL would be appropriate?

Reading / Reference

PostgreSQL: About — how PostgreSQL describes itself and its strengths
MongoDB: What is a Document Database?
Redis: Data Types — the range of data structures Redis supports
Martin Fowler: NoSQL Distilled — a concise summary of the NoSQL landscape

Academy for Developers