How to Design a Database Schema for a Web Application: A Practical Walkthrough

If you’re building a web application, the database schema is the foundation everything else rests on. Get it right, and your app scales gracefully. Get it wrong, and you’ll be paying technical debt for years. In this hands-on guide, we’ll walk through how to design a database schema for a real SaaS web application using PostgreSQL, covering entities, relationships, indexes, normalization, and the pitfalls we see most often at Coding4.

Unlike generic tutorials, this article uses a concrete example: a multi-tenant SaaS project management tool. You’ll get actual SQL you can adapt, not just theory.

What Is a Database Schema?

A database schema is the blueprint that defines how your data is organized: the tables, columns, data types, constraints, and relationships between entities. A good schema enforces data integrity at the database level rather than relying solely on application code, which is the single most reliable way to keep your data clean over time.

database schema diagram

The 7 Steps to Design a Database Schema

Before writing a single CREATE TABLE, follow this process:

  1. Understand the business requirements. What does the application actually need to do?
  2. Identify entities. Nouns in your requirements usually become tables (User, Project, Task).
  3. Define attributes. What properties does each entity have?
  4. Map relationships. One-to-one, one-to-many, many-to-many.
  5. Apply normalization. Eliminate redundancy (usually up to 3NF).
  6. Add constraints and indexes. Primary keys, foreign keys, unique constraints, and performance indexes.
  7. Review and iterate. Validate the schema against real query patterns.

Step 1: Defining the Scope of Our Sample SaaS App

Our example is a project management SaaS where:

  • Organizations sign up and have multiple users (multi-tenancy).
  • Each organization has projects.
  • Projects contain tasks assigned to users.
  • Tasks have comments and tags.
  • We track audit information (created_at, updated_at) on everything.

Step 2: Identifying Entities and Relationships

From the requirements above, we can extract these core entities:

Entity Purpose Key Relationships
organizations Tenant root Has many users, projects
users Authenticated accounts Belongs to organization
projects Work containers Belongs to organization, has many tasks
tasks Units of work Belongs to project, assigned to user
comments Discussion on tasks Belongs to task and user
tags Labels for tasks Many-to-many with tasks
database schema diagram

Step 3: Normalization in Practice

Normalization is the process of structuring tables so that each piece of data lives in exactly one place. For most web apps, aiming for Third Normal Form (3NF) is the sweet spot.

The three rules in plain English:

  • 1NF: Every column holds one value (no arrays-as-strings, no comma-separated lists).
  • 2NF: Every non-key column depends on the whole primary key.
  • 3NF: No column depends on another non-key column.

When to denormalize: Only when you have measured a real performance problem. Premature denormalization is one of the most common schema mistakes.

Step 4: The Complete Sample Schema in PostgreSQL

Here is the full schema with proper types, constraints, and timestamps:

-- Extension for UUID generation
CREATE EXTENSION IF NOT EXISTS "pgcrypto";

-- Organizations (tenants)
CREATE TABLE organizations (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name        VARCHAR(150) NOT NULL,
    slug        VARCHAR(80) NOT NULL UNIQUE,
    plan        VARCHAR(30) NOT NULL DEFAULT 'free',
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Users
CREATE TABLE users (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    organization_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
    email           CITEXT NOT NULL,
    password_hash   TEXT NOT NULL,
    full_name       VARCHAR(150),
    role            VARCHAR(20) NOT NULL DEFAULT 'member',
    is_active       BOOLEAN NOT NULL DEFAULT TRUE,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE (organization_id, email)
);

-- Projects
CREATE TABLE projects (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    organization_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
    name            VARCHAR(200) NOT NULL,
    description     TEXT,
    status          VARCHAR(20) NOT NULL DEFAULT 'active',
    created_by      UUID REFERENCES users(id) ON DELETE SET NULL,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Tasks
CREATE TABLE tasks (
    id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id   UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    assignee_id  UUID REFERENCES users(id) ON DELETE SET NULL,
    title        VARCHAR(255) NOT NULL,
    description  TEXT,
    status       VARCHAR(20) NOT NULL DEFAULT 'todo',
    priority     SMALLINT NOT NULL DEFAULT 3 CHECK (priority BETWEEN 1 AND 5),
    due_date     DATE,
    created_at   TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at   TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Comments
CREATE TABLE comments (
    id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    task_id    UUID NOT NULL REFERENCES tasks(id) ON DELETE CASCADE,
    user_id    UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    body       TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Tags
CREATE TABLE tags (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    organization_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
    name            VARCHAR(50) NOT NULL,
    color           VARCHAR(7),
    UNIQUE (organization_id, name)
);

-- Junction table for tasks/tags (many-to-many)
CREATE TABLE task_tags (
    task_id UUID NOT NULL REFERENCES tasks(id) ON DELETE CASCADE,
    tag_id  UUID NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
    PRIMARY KEY (task_id, tag_id)
);

Step 5: Indexing Strategy

Primary keys and unique constraints get indexes automatically. For everything else, index based on how the application actually queries the data.

-- Foreign key lookups (PostgreSQL does NOT auto-index FKs)
CREATE INDEX idx_users_organization_id   ON users(organization_id);
CREATE INDEX idx_projects_organization_id ON projects(organization_id);
CREATE INDEX idx_tasks_project_id        ON tasks(project_id);
CREATE INDEX idx_tasks_assignee_id       ON tasks(assignee_id);
CREATE INDEX idx_comments_task_id        ON comments(task_id);

-- Composite index for typical dashboard queries
CREATE INDEX idx_tasks_project_status ON tasks(project_id, status);

-- Partial index: only active tasks with a due date
CREATE INDEX idx_tasks_due_active ON tasks(due_date)
    WHERE status <> 'done' AND due_date IS NOT NULL;

Indexing rules of thumb:

  • Always index foreign key columns in PostgreSQL.
  • Add composite indexes for the most frequent query filters, in the order of selectivity.
  • Use partial indexes when only a subset of rows is ever queried.
  • Don’t over-index. Every index slows down writes.
database schema diagram

Common Pitfalls to Avoid

  1. Using integer auto-increment IDs in a distributed system. UUIDs avoid collisions and prevent enumeration attacks.
  2. Storing dates as strings. Use TIMESTAMPTZ for all timestamps to handle time zones correctly.
  3. Soft-delete columns everywhere. Only add deleted_at where you actually need to recover data.
  4. Putting JSON everywhere. JSONB is great for truly schemaless data, but if you query a field often, give it its own column.
  5. Forgetting cascade rules. Decide explicitly what happens to children when a parent is deleted.
  6. No tenant scoping. In multi-tenant apps, every query must filter by tenant. Consider row-level security in PostgreSQL.
  7. Skipping migrations from day one. Use a migration tool (Flyway, Liquibase, EF Core Migrations) from the very first table.

Bonus: Automatic updated_at with a Trigger

Keep your updated_at columns honest without relying on application code:

CREATE OR REPLACE FUNCTION set_updated_at()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_tasks_updated_at
BEFORE UPDATE ON tasks
FOR EACH ROW EXECUTE FUNCTION set_updated_at();

Tools That Help

  • dbdiagram.io or drawSQL for visual modeling.
  • pgAdmin or DBeaver for inspecting the live database.
  • EXPLAIN ANALYZE in PostgreSQL to validate your index strategy under real load.

FAQ

What is the difference between a database schema and a database?

A database is the container of all your data. A schema is the structural definition: the tables, columns, types, and relationships. In PostgreSQL, “schema” also refers to a namespace inside a database that groups tables together.

Should I always normalize to 3NF?

For most web applications, yes. 3NF gives you data integrity and flexibility. Only denormalize after you have measured a real performance bottleneck that can’t be solved with proper indexing or caching.

Should I use UUIDs or integer IDs as primary keys?

UUIDs are safer for distributed systems, public APIs, and multi-tenant apps. Integers are slightly faster and use less storage. For modern SaaS products, UUIDs (especially UUIDv7 with built-in ordering) are the better default.

How do I handle schema changes in production?

Always use a migration tool with versioned, reviewable migration scripts. Run migrations as part of your deployment pipeline, and test rollback scenarios. Never edit a production schema by hand.

Is PostgreSQL a good choice for a SaaS application?

Absolutely. PostgreSQL offers strong ACID compliance, advanced features (JSONB, full-text search, row-level security, partitioning), excellent performance, and a permissive open-source license. It’s our default choice at Coding4 for new SaaS projects.

Wrapping Up

Knowing how to design a database schema is one of the highest-leverage skills in software engineering. The schema you ship in week one will shape every feature, query, and migration for years to come. Start with clear entities, normalize sensibly, enforce integrity at the database level, and index based on real query patterns rather than guesses.

Need help designing or auditing a database schema for your web application? Get in touch with our team at Coding4. We’ve designed schemas for SaaS products handling millions of records and we’d be glad to help you avoid the pitfalls.

Leave a Comment

Your email address will not be published. Required fields are marked *