Structure Data in Firestore | Tutorial

TL;DR

To structure data in Firestore effectively, design your schema around how your app queries data rather than how the data relates logically. Use root-level collections for data that is queried independently, subcollections for data scoped to a parent document, and denormalization (duplicating data) to avoid the extra reads that a join would replace in a relational database. Firestore has no server-side joins, so embed frequently accessed related data directly in documents and keep rarely accessed details in separate collections.

Designing Data Structures for Cloud Firestore

Firestore is a NoSQL document database where data modeling follows different rules than relational databases. There are no JOINs, no foreign keys, and no normalization requirements. Instead, you design your schema around your app's queries: what screens need what data, how often each piece is read, and which documents are fetched together. This tutorial covers the core data modeling patterns with practical examples for common app scenarios.

Prerequisites

A Firebase project with a Firestore database created
Firebase SDK installed (npm install firebase)
Familiarity with Firestore documents and collections (setDoc, getDoc, getDocs)
Understanding of your app's main screens and data access patterns

Step-by-step guide

Understand the document-collection model

Firestore organizes data in documents and collections. A document is a set of key-value pairs (up to 1 MiB). A collection is a group of documents. Documents can contain subcollections, which are collections nested inside a document. The path to a document alternates between collections and documents: collection/document/subcollection/subdocument. Design your hierarchy based on query patterns, not data relationships.

typescript

// Root collection: users
// Path: users/{userId}
// Subcollection: users/{userId}/orders
// Path: users/{userId}/orders/{orderId}

import { collection, doc, setDoc, serverTimestamp } from 'firebase/firestore';
import { db } from './firebase';

// Root-level document
await setDoc(doc(db, 'users', 'user-1'), {
  displayName: 'Alice',
  email: 'alice@example.com',
  role: 'member',
  createdAt: serverTimestamp()
});

// Subcollection document
await setDoc(doc(db, 'users', 'user-1', 'orders', 'order-1'), {
  total: 49.99,
  status: 'shipped',
  createdAt: serverTimestamp()
});

Expected result: A user document exists at users/user-1 and an order document exists at users/user-1/orders/order-1 as separate entities.
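
The alternating collection/document rule can be checked by counting path segments. The sketch below is a minimal illustration, assuming slash-separated paths like the ones above; classifyPath is a hypothetical helper, not part of the Firebase SDK.

```typescript
// Hypothetical helper (not a Firebase API): classifies a slash-separated
// Firestore path by counting its segments.
type PathKind = 'collection' | 'document';

function classifyPath(path: string): PathKind {
  const segments = path.split('/').filter((s) => s.length > 0);
  if (segments.length === 0) {
    throw new Error('Path must contain at least one segment');
  }
  // An odd number of segments ends on a collection,
  // an even number ends on a document.
  return segments.length % 2 === 1 ? 'collection' : 'document';
}

console.log(classifyPath('users/user-1/orders'));         // collection
console.log(classifyPath('users/user-1/orders/order-1')); // document
```

A check like this can catch a mistyped path before it reaches doc() or collection(), which expect document and collection paths respectively.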

Choose root collections for independently queried data

Use root-level collections when you need to query data across all users or when the data has its own lifecycle. For example, if your app has a global feed of posts from all users, posts should be a root collection — not a subcollection under each user. Root collections make cross-user queries straightforward with simple where() clauses.

typescript

import { doc, setDoc, serverTimestamp } from 'firebase/firestore';
import { db } from './firebase';

// Root collection for posts (queryable across all users)
await setDoc(doc(db, 'posts', 'post-1'), {
  title: 'My first post',
  content: 'Hello world',
  authorId: 'user-1',
  authorName: 'Alice',  // Denormalized for display
  tags: ['firebase', 'tutorial'],
  likes: 42,
  createdAt: serverTimestamp()
});

// Now you can query all posts, filter by tag, sort by likes, etc.
// without knowing which user wrote them

Expected result: Posts are stored in a root collection where they can be queried globally with where, orderBy, and compound filters.

Use subcollections for parent-scoped data

Use subcollections when the data is always accessed in the context of a parent document and does not need cross-parent queries. Examples include a user's private settings, an order's line items, or a chat room's messages. Documents in a subcollection are stored and indexed separately from the parent, so they do not count toward the parent document's 1 MiB size limit.

typescript

// Chat messages as a subcollection of chatRooms
// Path: chatRooms/{roomId}/messages/{messageId}

import {
  addDoc, collection, query, orderBy, limit, getDocs, serverTimestamp
} from 'firebase/firestore';
import { db } from './firebase';

// Add a message to a room
await addDoc(collection(db, 'chatRooms', 'room-1', 'messages'), {
  text: 'Hello everyone!',
  senderId: 'user-1',
  senderName: 'Alice',
  createdAt: serverTimestamp()
});

// Query the 50 most recent messages for a specific room
const q = query(
  collection(db, 'chatRooms', 'room-1', 'messages'),
  orderBy('createdAt', 'desc'),
  limit(50)
);
const snapshot = await getDocs(q);

Expected result: Messages are organized under their parent chat room and queried efficiently without loading messages from other rooms.

Denormalize data to avoid multiple reads

Firestore has no server-side JOINs. If displaying a post requires data from both the posts and users collections, you would need two reads per post — which quickly becomes expensive and slow. Instead, denormalize by copying frequently displayed fields (like authorName and authorPhotoURL) directly onto the post document. The trade-off is that updates to the source data must be propagated to all copies, typically via a Cloud Function.

typescript

import { doc, setDoc, serverTimestamp } from 'firebase/firestore';
import { db } from './firebase';

// Denormalized post document: contains author info directly
await setDoc(doc(db, 'posts', 'post-1'), {
  title: 'Building with Firebase',
  content: 'A comprehensive guide...',
  authorId: 'user-1',
  authorName: 'Alice Johnson',       // Denormalized from users collection
  authorPhotoURL: 'https://...',     // Denormalized from users collection
  categoryId: 'tutorials',
  categoryName: 'Tutorials',          // Denormalized from categories
  commentCount: 12,                   // Denormalized count
  lastCommentAt: serverTimestamp(),   // Denormalized timestamp
  createdAt: serverTimestamp()
});

// Now displaying a list of posts requires only ONE query
// No need to fetch author data separately

Expected result: Post documents contain all the data needed to render a post card without additional reads to other collections.
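
To make the cost difference concrete, the following sketch counts document reads for a list screen under both layouts. It is plain arithmetic over an in-memory list, assuming author documents are fetched once per distinct author in the normalized case; PostRef, readsNormalized, and readsDenormalized are illustrative names, not Firebase APIs.

```typescript
// Illustrative only: models read counts, does not talk to Firestore.
interface PostRef {
  id: string;
  authorId: string;
}

// Normalized layout: one read per post, plus one read per distinct author
// to look up the display name in users/{authorId}.
function readsNormalized(posts: PostRef[]): number {
  const distinctAuthors = new Set(posts.map((p) => p.authorId));
  return posts.length + distinctAuthors.size;
}

// Denormalized layout: authorName is already on each post document.
function readsDenormalized(posts: PostRef[]): number {
  return posts.length;
}

const feed: PostRef[] = [
  { id: 'post-1', authorId: 'user-1' },
  { id: 'post-2', authorId: 'user-2' },
  { id: 'post-3', authorId: 'user-1' },
  { id: 'post-4', authorId: 'user-3' }
];

console.log(readsNormalized(feed));   // 7 (4 posts + 3 distinct authors)
console.log(readsDenormalized(feed)); // 4
```

The gap widens as the feed includes more distinct authors, which is why list screens benefit most from denormalization.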

Use collection group queries for cross-parent lookups

Collection group queries let you query all subcollections with the same name across all parent documents. This is useful when you need to find data across parents, such as all orders with status 'pending' regardless of which user placed them. The queried field needs an index with collection group scope; if it is missing, the query fails with an error that includes a link for creating the index in the Firebase console.

typescript

import { collectionGroup, query, where, getDocs } from 'firebase/firestore';
import { db } from './firebase';

// Find all pending orders across ALL users
const pendingOrders = query(
  collectionGroup(db, 'orders'),
  where('status', '==', 'pending')
);

const snapshot = await getDocs(pendingOrders);
snapshot.forEach((orderDoc) => {
  console.log(orderDoc.ref.path); // e.g. users/user-1/orders/order-3
  console.log(orderDoc.data());
});

Expected result: The query returns all order documents with status 'pending' from every user's orders subcollection.
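
Because a collection group result can come from any parent, you often need to recover which user an order belongs to. One way is to parse the path printed above. The sketch below assumes the users/{userId}/orders/{orderId} layout from this tutorial; ownerIdFromOrderPath is a hypothetical helper.

```typescript
// Hypothetical helper: extracts the userId from a path shaped like
// users/{userId}/orders/{orderId}. Throws on any other shape.
function ownerIdFromOrderPath(path: string): string {
  const segments = path.split('/');
  const userId = segments[1];
  if (
    segments.length !== 4 ||
    segments[0] !== 'users' ||
    segments[2] !== 'orders' ||
    !userId
  ) {
    throw new Error(`Not an order path: ${path}`);
  }
  return userId;
}

console.log(ownerIdFromOrderPath('users/user-1/orders/order-3')); // user-1
```

With a live snapshot, the same ID is also available without string parsing through the document reference's parent chain (ref.parent is the orders collection, and its parent is the user document).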

Choose the right pattern for common scenarios

Different app scenarios call for different data structures. Here is a decision guide for common patterns: use root collections for global feeds and searchable data, subcollections for user-scoped or parent-scoped data, maps (embedded objects) for small fixed structures that are always read with the parent, and arrays for small lists of primitive values like tags. Avoid deeply nested subcollections (more than 2 levels) as they complicate queries and security rules.

typescript

// Pattern: E-commerce product with reviews
// Products = root collection (global catalog)
// Reviews = subcollection under each product
// User profile = root collection

import { addDoc, collection, doc, setDoc, serverTimestamp } from 'firebase/firestore';
import { db } from './firebase';

// Product document with denormalized review summary
await setDoc(doc(db, 'products', 'prod-1'), {
  name: 'Wireless Headphones',
  price: 79.99,
  category: 'electronics',
  // Denormalized review summary (updated via Cloud Function)
  reviewCount: 234,
  averageRating: 4.6,
  // Embedded map for small fixed data
  dimensions: {
    weight: '250g',
    width: '18cm',
    height: '20cm'
  },
  // Array for small list of primitives
  tags: ['wireless', 'bluetooth', 'noise-canceling'],
  createdAt: serverTimestamp()
});

// Individual reviews in subcollection
await addDoc(collection(db, 'products', 'prod-1', 'reviews'), {
  userId: 'user-1',
  userName: 'Alice',
  rating: 5,
  text: 'Great sound quality!',
  createdAt: serverTimestamp()
});

Expected result: Products are queryable globally with embedded summary data, while reviews are organized per product in subcollections.
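
The denormalized review summary has to be recomputed whenever a review is added. A minimal sketch of that arithmetic is shown below, assuming the summary stores only reviewCount and averageRating as in the product document above; ReviewSummary and addRating are illustrative names. The new average is (oldAverage × oldCount + rating) / (oldCount + 1), so no existing reviews need to be re-read.

```typescript
// Illustrative only: the running-average update a Cloud Function
// might apply to the product's denormalized summary fields.
interface ReviewSummary {
  reviewCount: number;
  averageRating: number;
}

function addRating(summary: ReviewSummary, rating: number): ReviewSummary {
  const reviewCount = summary.reviewCount + 1;
  // Rebuild the old total from average * count, then add the new rating.
  const total = summary.averageRating * summary.reviewCount + rating;
  return { reviewCount, averageRating: total / reviewCount };
}

const before: ReviewSummary = { reviewCount: 4, averageRating: 4.5 };
const after = addRating(before, 5);
console.log(after.reviewCount); // 5
```

In a real deployment this read-modify-write would likely need to run inside a transaction so that two simultaneous reviews do not overwrite each other's update.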

Complete working example

firestore-schema.ts

// Firestore data structure patterns
// Demonstrates root collections, subcollections, denormalization,
// and collection group queries

import { initializeApp } from 'firebase/app';
import {
  getFirestore,
  doc,
  setDoc,
  addDoc,
  collection,
  collectionGroup,
  query,
  where,
  orderBy,
  limit,
  getDocs,
  serverTimestamp
} from 'firebase/firestore';

const app = initializeApp({
  apiKey: 'YOUR_API_KEY',
  authDomain: 'YOUR_PROJECT.firebaseapp.com',
  projectId: 'YOUR_PROJECT_ID'
});
const db = getFirestore(app);

// 1. Root collection for users (globally queryable)
export async function createUser(uid: string, data: {
  displayName: string;
  email: string;
}) {
  await setDoc(doc(db, 'users', uid), {
    ...data,
    role: 'member',
    createdAt: serverTimestamp()
  });
}

// 2. Root collection for posts (queryable feed)
export async function createPost(authorId: string, data: {
  title: string;
  content: string;
  authorName: string; // denormalized
  tags: string[];
}) {
  return addDoc(collection(db, 'posts'), {
    ...data,
    authorId,
    likes: 0,
    commentCount: 0,
    createdAt: serverTimestamp()
  });
}

// 3. Subcollection for comments (scoped to parent post)
export async function addComment(postId: string, data: {
  userId: string;
  userName: string;
  text: string;
}) {
  return addDoc(collection(db, 'posts', postId, 'comments'), {
    ...data,
    createdAt: serverTimestamp()
  });
}

// 4. Collection group query (across all parents)
// Requires a composite index on userId + createdAt with collection group scope
export async function findRecentCommentsByUser(userId: string) {
  const q = query(
    collectionGroup(db, 'comments'),
    where('userId', '==', userId),
    orderBy('createdAt', 'desc'),
    limit(20)
  );
  const snap = await getDocs(q);
  return snap.docs.map((d) => ({ id: d.id, path: d.ref.path, ...d.data() }));
}

// 5. Query root collection with denormalized data
// Filtering by tag requires a composite index on tags + createdAt
export async function getFeedPosts(tag?: string) {
  let q = query(
    collection(db, 'posts'),
    orderBy('createdAt', 'desc'),
    limit(25)
  );
  if (tag) {
    q = query(q, where('tags', 'array-contains', tag));
  }
  const snap = await getDocs(q);
  return snap.docs.map((d) => ({ id: d.id, ...d.data() }));
}

Common mistakes when structuring data in Firestore

Mistake: Designing a Firestore schema like a relational database, with normalized tables, and expecting JOINs to work

How to avoid: Firestore has no server-side JOINs. Denormalize data by embedding frequently accessed fields directly on the document that displays them.

Mistake: Storing large unbounded lists as array fields on a document, causing the document to grow beyond the 1 MiB limit

How to avoid: Use subcollections for lists that can grow indefinitely (comments, messages, orders). Reserve array fields for small bounded lists like tags or categories.
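
A rough size check can flag documents that are drifting toward the limit. The sketch below uses the UTF-8 length of the JSON serialization as a proxy. This is an assumption for illustration only: Firestore's own size accounting differs from JSON, so treat the result as an estimate. approxJsonBytes and exceedsLimit are hypothetical helpers.

```typescript
// Rough estimate only: JSON byte length is a proxy, not Firestore's
// actual storage-size calculation.
const ONE_MIB = 1024 * 1024;

// UTF-8 byte length of a string, computed without platform-specific APIs:
// each percent-escaped byte collapses to a single placeholder character.
function utf8Length(text: string): number {
  return encodeURIComponent(text).replace(/%[0-9A-F]{2}/g, 'x').length;
}

function approxJsonBytes(data: Record<string, unknown>): number {
  return utf8Length(JSON.stringify(data));
}

function exceedsLimit(data: Record<string, unknown>, limitBytes: number = ONE_MIB): boolean {
  return approxJsonBytes(data) > limitBytes;
}

// A small bounded array of tags is far below the limit.
console.log(exceedsLimit({ tags: ['wireless', 'bluetooth'] })); // false

// An unbounded array of comments eventually crosses it.
const manyComments = new Array(200000).fill('comment');
console.log(exceedsLimit({ comments: manyComments })); // true
```

Passing a lower limitBytes lets the same check enforce a stricter budget of your own choosing.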

Mistake: Nesting subcollections more than 2 levels deep, making paths unwieldy and security rules complex

How to avoid: Flatten your hierarchy by using root collections with reference IDs. For example, use a root comments collection with a postId field instead of posts/{postId}/comments/{commentId}/replies/{replyId}.

Mistake: Not denormalizing data that is displayed together, causing N+1 read problems on list screens

How to avoid: For every list screen in your app, ensure the document being listed contains all the data needed to render one list item without additional reads.

Best practices

Design your schema around your app's queries, not around data relationships
Use root collections for data that is queried independently across users or contexts
Use subcollections for data that is always accessed in the context of a parent document
Denormalize frequently displayed data to avoid multiple reads per list item
Use Cloud Functions to keep denormalized data in sync when the source changes
Use collection group queries when you need to query subcollections across all parents
Keep documents under 100 KB for optimal read performance even though the limit is 1 MiB
Prefer flat data structures over deeply nested subcollections for simpler security rules

Still stuck?

Copy one of these prompts to get a personalized, step-by-step explanation.

ChatGPT Prompt

I am building an app with Firebase Firestore and need help structuring the data. Show me how to decide between root collections, subcollections, and embedded data. Include examples of denormalization and collection group queries for a social media app with users, posts, and comments.

Firebase Prompt

Design a Firestore data structure for a social media app using the Firebase modular SDK v9+. Include root collections for users and posts, subcollections for comments, denormalized author info on posts, collection group queries for cross-post comment lookups, and TypeScript interfaces for each document type.

Frequently asked questions

Should I use subcollections or root collections for my data?

Use subcollections when data is always accessed through a parent (e.g., messages in a chat room). Use root collections when data needs cross-parent queries (e.g., a global feed of posts from all users).

How do I handle data that needs to be queried both ways?

Use a root collection with a parentId field. For example, store comments in a root comments collection with a postId field. This allows both per-post queries (where postId == X) and cross-post queries (where userId == Y).

Is denormalization really necessary in Firestore?

Yes, for most production apps. Without denormalization, displaying a list of posts with author names requires 1 read per post plus 1 read per unique author. With denormalization, it requires only 1 read per post.

How do I keep denormalized data in sync?

Use Cloud Functions triggered by document updates. When a user changes their displayName, a Firestore trigger updates all posts and comments that contain the old name. Batch writes handle up to 500 documents per batch.
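
When a name change touches more documents than fit in one batch, the update has to be split. The sketch below shows only the chunking step, using the 500-per-batch figure quoted above as a default; chunk is an illustrative helper, and the document IDs are made up for the example.

```typescript
// Illustrative helper: splits a list of document IDs into groups no
// larger than `size`, so each group can be committed as one batch.
function chunk<T>(items: T[], size: number = 500): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error('size must be a positive integer');
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Hypothetical IDs of posts that carry the author's old displayName.
const postIds = Array.from({ length: 1201 }, (_, i) => `post-${i + 1}`);
const batches = chunk(postIds);
console.log(batches.map((b) => b.length)); // [500, 500, 201]
```

Inside the Cloud Function, each group would then be written with its own batched write and committed before moving on to the next group.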

What is the maximum nesting depth for subcollections?

Firestore supports up to 100 levels of subcollection nesting. However, practical applications rarely go beyond 2 levels. Deep nesting makes paths long, security rules complex, and backups difficult.

Can I move data between collections after launch?

Firestore has no built-in migration tool. You need to write a script that reads from the old location, writes to the new location, and optionally deletes the old data. Plan your schema carefully before launch.
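
The core of such a script is a mapping from each old path to its new location. As a sketch, assume you are moving orders from users/{userId}/orders/{orderId} to a root orders collection and want to keep the owner as a userId field. Both the target layout and the planOrderMove helper are hypothetical choices made for this example.

```typescript
// Hypothetical migration step: computes where a document should go and
// what it should contain. It performs no reads or writes itself.
interface PlannedWrite {
  newPath: string;
  data: Record<string, unknown>;
}

function planOrderMove(oldPath: string, data: Record<string, unknown>): PlannedWrite {
  const segments = oldPath.split('/');
  const userId = segments[1];
  const orderId = segments[3];
  if (
    segments.length !== 4 ||
    segments[0] !== 'users' ||
    segments[2] !== 'orders' ||
    !userId ||
    !orderId
  ) {
    throw new Error(`Unexpected path: ${oldPath}`);
  }
  // The parent ID used to be implied by the path; after flattening it
  // must be stored explicitly so per-user queries still work.
  return { newPath: `orders/${orderId}`, data: { ...data, userId } };
}

const planned = planOrderMove('users/user-1/orders/order-1', { total: 49.99 });
console.log(planned.newPath); // orders/order-1
```

Keeping the planning step free of side effects makes it easy to dry-run the migration and inspect the planned writes first. Note that reusing the old order IDs is only safe if they are unique across all users.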

Can RapidDev help design an efficient Firestore schema for my app?

Yes. RapidDev can analyze your app requirements and design a Firestore data model optimized for query performance, cost efficiency, and scalability using proven denormalization and structuring patterns.
