Skip to main content

Indexes & Performance

A database starts fast. A few hundred documents in, every query takes a millisecond. A few hundred thousand documents in, the same query starts to feel sticky. A few million documents in, without care, it grinds. The line between "fast forever" and "slow at scale" is indexes.

This page explains, in plain English, what an index is, why Dashify uses them everywhere, and how the platform keeps an eye on query performance.

What an index is

Imagine a phone book. If you want to find every person whose last name is "Khan", you can do it two ways.

The slow way: open the book on page one and read every entry, top to bottom, all the way to the end. Even if the book has 10,000 entries, you check all 10,000.

The fast way: the book is alphabetised. Flip to "K", scan a few pages, find the Khans, done. You looked at maybe 30 entries.

The alphabetisation is the index. Without it, the database does the equivalent of reading the whole phone book on every query.

In MongoDB, an index is a special data structure, a tree, internally, that lets the database jump straight to the documents matching a particular field's value, without scanning the rest of the collection.

Indexes in Dashify

Every collection in Dashify has at least one index, and most have several. The patterns are predictable.

Tenant id is always indexed. Every collection that is tenant scoped has tenantId indexed. Because every query automatically filters by tenantId (thanks to the plugin), this index is touched by literally every read. It is the single most important index in the platform.

Foreign keys are indexed. A query like "show me all work items in this project" filters by projectId. An index on projectId (compound with tenantId) makes that query O(log n) instead of O(n).

Soft-delete fields are indexed. Most queries also filter out soft deleted documents (deletedAt: null). Soft deletes plus tenant id usually live in a compound index, one index that covers both fields together for maximum speed.

Search fields are indexed. The knowledge base has a weighted text index across title, summary, and body, so full-text search across articles is sub-second.

Audit logs are indexed by actor and by tenant. "Show me everything user X did in the last 24 hours" must be fast even when the audit log has tens of millions of rows.

Compound indexes

A compound index covers multiple fields in a specific order. Because Dashify queries almost always filter by tenant id first and then by something else, compound indexes typically look like { tenantId: 1, otherField: 1 }.

Order matters. An index on (tenantId, projectId) answers a query for "this tenant's items in project X" lightning-fast. An index on (projectId, tenantId) does not answer the same query as efficiently. Mongoose makes the order explicit and code review enforces "tenant first."

Where the indexes live

Mongoose schemas declare indexes alongside their fields. When the application starts up, Mongoose ensures every declared index exists on the corresponding collection. New indexes are built in the background, MongoDB does not block writes while indexing.

In production a deployment never blocks on index creation. If a new index is added in a release, MongoDB builds it in the background; queries are slightly slower until the build finishes, but writes never wait.

Watching for slow queries

Indexes are only useful if you actually use them. Dashify watches for queries that are not using indexes through three mechanisms.

Query profiler middleware. A small piece of middleware on the API server measures how long each MongoDB query takes. Anything over a configurable threshold (50 ms in production) is logged with the full query and which index it used (or didn't).

Prometheus metrics. A counter increments every time a slow query is logged. Grafana shows it on a dashboard. If slow queries spike, oncall sees it.

MongoDB's own profiler. In production, MongoDB itself can be configured to log every query above a duration threshold. Atlas and most managed providers expose this in their dashboard.

When a slow query is found, the fix is almost always: add an index that covers it.

The other half, caching

Not every fast read comes from MongoDB. Some come from Redis (the next page covers it in detail). For lookups that happen on every request, package definitions, tenant settings, role permission maps, the API caches the result in Redis with a short TTL. The first request hits MongoDB; the next thousand hit Redis.

The caches are invalidated when the underlying document changes. Caches that get out of sync with the source of truth are worse than no cache, so Dashify is conservative about what it caches.

Pagination is mandatory

Every list endpoint paginates. There is no API in Dashify that returns "all of something." A request for "show me all work items" returns the first 50 with a cursor for the next page. This is for two reasons: it keeps response sizes bounded so the network is fast, and it keeps memory usage on the server bounded so a malicious request cannot exhaust RAM.

Bulk operations

When a write touches many documents, say, soft deleting every work item in a deleted project, Dashify uses bulk operations rather than one query per document. A single bulk update of 10,000 documents takes a fraction of the time of 10,000 individual updates.

What you might add

If you adopt Dashify and your tenants grow large, here are the things to keep an eye on:

  • The audit log grows fastest of any collection. Consider a TTL index that automatically expires audit rows older than the retention window.
  • Real-time chat messages also grow quickly. The model already includes time-to-live policy hooks; flip them on once you decide on a retention rule.
  • The queue (BullMQ in Redis) cleans up completed jobs automatically, but failed jobs hang around for inspection. Periodically prune them.

Key takeaways

  • An index is what makes a database query fast at scale, without one, every query reads the whole collection.
  • Every Dashify collection has tenantId indexed, almost always as part of a compound index.
  • A query profiler middleware logs slow queries; Prometheus + Grafana make them visible.
  • Hot lookups are also cached in Redis.
  • Every list endpoint paginates by design, no endpoint returns "all of something."