Analyzing a Slow Query for Index Recommendations
The Index Advisor analyzes SQL queries and recommends optimal indexes. Here is a typical analysis:
/* Slow query: takes 3.2 seconds on a 10M-row table */
SELECT u.name, u.email, o.total, o.created_at
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.status = 'pending'
AND o.created_at > '2024-01-01'
ORDER BY o.created_at DESC
LIMIT 50;
/* Query execution plan (without indexes) */
EXPLAIN SELECT ...
→ Full table scan on orders (10,234,567 rows examined)
→ Full table scan on users (2,456,789 rows examined)
→ Execution time: 3.2 seconds
/* Index Advisor recommendations */
Recommendation 1 (HIGH PRIORITY):
CREATE INDEX idx_orders_status_created
ON orders (status, created_at DESC);
Reason: Covers WHERE status = 'pending' AND created_at > ...
AND ORDER BY created_at DESC
Estimated improvement: 3.2s → 0.008s (400× faster)
Recommendation 2 (MEDIUM PRIORITY):
CREATE INDEX idx_orders_user_id
ON orders (user_id);
Reason: Speeds up JOIN with users table
Estimated improvement: Reduces join cost by 95%
/* After adding indexes */
EXPLAIN SELECT ...
→ Index range scan on idx_orders_status_created (42 rows examined)
→ Index lookup on users via primary key (42 lookups)
→ Execution time: 0.008 seconds
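The before-and-after comparison above can be reproduced on a small scale. The following is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the production database. The schema is a guess at the columns the query needs, and the plan text SQLite prints will not match the Index Advisor output shown above.

```python
import sqlite3

# Hypothetical schema: only the columns the query touches (an assumption).
SCHEMA = """
CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER,
                     status TEXT, total REAL, created_at TEXT);
"""

QUERY = """
SELECT u.name, u.email, o.total, o.created_at
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.status = 'pending'
  AND o.created_at > '2024-01-01'
ORDER BY o.created_at DESC
LIMIT 50
"""


def query_plan(conn, sql):
    """Return the detail column of EXPLAIN QUERY PLAN, one string per step."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Plan with no secondary indexes on orders.
plan_before = query_plan(conn, QUERY)

# Recommendation 1 from the analysis above.
conn.execute(
    "CREATE INDEX idx_orders_status_created ON orders (status, created_at DESC)"
)
plan_after = query_plan(conn, QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

Comparing the two printed plans shows whether the planner picked up the new index. Timings from an empty in-memory database say nothing about a 10M-row table, so this sketch only checks plan shape.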
Composite Index Column Ordering
The order of columns in a composite index matters significantly for performance:
/* Query pattern */
SELECT * FROM orders
WHERE user_id = 123
AND status = 'shipped'
AND created_at BETWEEN '2024-01-01' AND '2024-12-31';
/* Index option A: (user_id, status, created_at) */
CREATE INDEX idx_a ON orders (user_id, status, created_at);
-- ✓ Optimal: equality columns first, range column last
-- Uses all 3 columns for filtering
/* Index option B: (created_at, user_id, status) */
CREATE INDEX idx_b ON orders (created_at, user_id, status);
-- ✗ Suboptimal: range column first
-- Only created_at narrows the index scan; user_id and status are filtered afterward
/* Index option C: (status, user_id, created_at) */
CREATE INDEX idx_c ON orders (status, user_id, created_at);
-- ✓ Also good: both equality columns before range column
/* Rule: equality conditions first, range conditions last */
/* Among the equality columns, put the most selective (highest cardinality) first */
/* Cardinality check */
SELECT
COUNT(DISTINCT user_id) AS user_cardinality, -- 2,456,789
COUNT(DISTINCT status) AS status_cardinality, -- 5
COUNT(*) AS total_rows -- 10,234,567
FROM orders;
/* user_id has highest cardinality → put first */
/* Best index: (user_id, status, created_at) */
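The ordering rule can be written down as a small helper. This is a sketch only: `order_index_columns` is a hypothetical function, not part of the Index Advisor, and it assumes the caller already knows which columns appear in equality and range predicates.

```python
def order_index_columns(equality_cols, range_cols, cardinality):
    """Order composite-index columns: equality columns first, sorted by
    descending cardinality, followed by the range columns.

    Hypothetical helper for illustration; `cardinality` maps each equality
    column to its COUNT(DISTINCT ...) value.
    """
    eq_sorted = sorted(equality_cols, key=lambda col: cardinality[col], reverse=True)
    return eq_sorted + list(range_cols)


# Cardinalities taken from the cardinality check above.
cardinality = {"user_id": 2_456_789, "status": 5}

columns = order_index_columns(
    equality_cols=["status", "user_id"],
    range_cols=["created_at"],
    cardinality=cardinality,
)
print(columns)  # ['user_id', 'status', 'created_at']
```

The result matches index option A. If a query has more than one range column, only the first of them can narrow the scan, so this helper is most useful when there is a single range predicate.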
Covering Index Recommendations
A covering index includes all columns needed by a query, eliminating table lookups:
/* Query that benefits from covering index */
SELECT user_id, status, total, created_at
FROM orders
WHERE status = 'pending'
ORDER BY created_at DESC;
/* Regular index — still needs table lookup */
CREATE INDEX idx_status ON orders (status);
-- Index finds matching rows, then fetches user_id, total, created_at from table
-- Two operations: index scan + table row fetch
/* Covering index — satisfies query entirely from index */
CREATE INDEX idx_status_covering
ON orders (status, created_at DESC, user_id, total);
-- All needed columns are IN the index
-- No table lookup needed — much faster
/* EXPLAIN output comparison */
-- Without covering index:
-- type: ref, Extra: Using index condition; Using filesort
-- With covering index:
-- type: ref, Extra: Using index ← "Using index" = covering index used
/* When to use covering indexes */
✓ Frequently executed queries on large tables
✓ Queries that select only a few columns
✓ Reporting queries that run on read replicas
✗ Avoid for tables with heavy INSERT/UPDATE (index maintenance cost)
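Whether an index actually covers a query can be checked from the plan rather than assumed. The sketch below uses SQLite through Python's sqlite3 module as a stand-in (an assumption; the EXPLAIN output above is MySQL-style). SQLite marks the equivalent of "Using index" with the words COVERING INDEX in its plan text. The table schema and the `plan_with_index` helper are hypothetical.

```python
import sqlite3

QUERY = """
SELECT user_id, status, total, created_at
FROM orders
WHERE status = 'pending'
ORDER BY created_at DESC
"""


def plan_with_index(create_index_sql):
    """Build a fresh in-memory orders table with one index and return
    the EXPLAIN QUERY PLAN detail strings for QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, "
        "status TEXT, total REAL, created_at TEXT)"
    )
    conn.execute(create_index_sql)
    steps = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]
    conn.close()
    return steps


def uses_covering_index(steps):
    """True if any plan step reports a covering index."""
    return any("COVERING INDEX" in step for step in steps)


regular_plan = plan_with_index("CREATE INDEX idx_status ON orders (status)")
covering_plan = plan_with_index(
    "CREATE INDEX idx_status_covering "
    "ON orders (status, created_at DESC, user_id, total)"
)

print("regular: ", regular_plan, uses_covering_index(regular_plan))
print("covering:", covering_plan, uses_covering_index(covering_plan))
```

Each index is tested in its own database so the planner has no other choice to fall back on, which keeps the comparison simple.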
Identifying and Removing Redundant Indexes
Redundant indexes waste storage and slow down writes without improving reads:
/* Redundant index detection */
Existing indexes on orders table:
idx_1: (user_id)
idx_2: (user_id, status) ← makes idx_1 redundant
idx_3: (user_id, status, created_at) ← makes idx_2 redundant
idx_4: (status)
idx_5: (status, created_at) ← makes idx_4 redundant
/* Analysis */
idx_1 (user_id) is redundant:
idx_2 (user_id, status) can serve all queries that use idx_1
Queries filtering only on user_id can use idx_2 (leftmost prefix)
→ DROP INDEX idx_1
idx_2 (user_id, status) is redundant:
idx_3 (user_id, status, created_at) covers all idx_2 use cases
→ DROP INDEX idx_2
idx_4 (status) is redundant:
idx_5 (status, created_at) covers all idx_4 use cases
→ DROP INDEX idx_4
/* After cleanup */
Remaining indexes:
idx_3: (user_id, status, created_at)
idx_5: (status, created_at)
/* Impact of removing redundant indexes */
Write performance: +15% (fewer indexes to maintain on INSERT/UPDATE)
Storage savings: ~2.4 GB (3 indexes removed from 10M row table)
Read performance: Unchanged (remaining indexes cover all queries)
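The leftmost-prefix check used in the analysis above is mechanical enough to sketch in a few lines. `find_redundant_indexes` is a hypothetical helper, not the Index Advisor's implementation, and it assumes plain non-unique indexes with the same sort direction. Unique indexes, which also enforce a constraint, and exact duplicates are not handled.

```python
def find_redundant_indexes(indexes):
    """Return the sorted names of indexes whose column list is a strict
    leftmost prefix of another index's column list.

    `indexes` maps index name -> sequence of column names.
    """
    redundant = []
    for name, cols in indexes.items():
        cols = tuple(cols)
        for other_name, other_cols in indexes.items():
            other_cols = tuple(other_cols)
            is_longer = len(other_cols) > len(cols)
            if other_name != name and is_longer and other_cols[: len(cols)] == cols:
                redundant.append(name)
                break  # one covering index is enough to mark it redundant
    return sorted(redundant)


# The existing indexes on the orders table listed above.
orders_indexes = {
    "idx_1": ("user_id",),
    "idx_2": ("user_id", "status"),
    "idx_3": ("user_id", "status", "created_at"),
    "idx_4": ("status",),
    "idx_5": ("status", "created_at"),
}

redundant = find_redundant_indexes(orders_indexes)
print(redundant)  # ['idx_1', 'idx_2', 'idx_4']
```

The output lists the same three indexes that the analysis drops, leaving idx_3 and idx_5.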
Partial Index for Filtered Queries
Partial indexes index only the rows that match a condition, which makes them smaller and faster to scan:
/* Scenario: 95% of orders are 'completed', only 5% are 'pending' */
/* Queries almost always filter on status = 'pending' */
SELECT * FROM orders WHERE status = 'pending' ORDER BY created_at;
/* Full index — indexes all 10M rows */
CREATE INDEX idx_full ON orders (status, created_at);
-- Index size: ~800 MB
/* Partial index — only indexes 'pending' rows (500K rows) */
CREATE INDEX idx_pending ON orders (created_at)
WHERE status = 'pending';
-- Index size: ~40 MB (95% smaller!)
-- Faster: smaller index fits in memory
/* PostgreSQL partial index syntax */
CREATE INDEX idx_active_users ON users (email)
WHERE is_active = true;
-- Only indexes active users (avoids indexing deleted/inactive accounts)
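/* MySQL doesn't support partial indexes directly */
/* Approximation: index a generated flag column */
ALTER TABLE orders ADD COLUMN is_pending TINYINT
GENERATED ALWAYS AS (IF(status = 'pending', 1, NULL)) STORED;
CREATE INDEX idx_pending ON orders (is_pending, created_at);
-- Queries must filter on the flag: WHERE is_pending = 1 ORDER BY created_at
-- Caveat: InnoDB indexes NULL values too, so this index still holds an entry for every row;
-- it groups the pending rows together but is not smaller than a full index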
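A partial index only helps queries whose WHERE clause implies the index condition. The sketch below illustrates that with SQLite, which accepts the same `CREATE INDEX ... WHERE` syntax as PostgreSQL. The sample rows are made up (1 in 20 is 'pending', mirroring the 5% figure), and the schema is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "status TEXT, total REAL, created_at TEXT)"
)

# Made-up sample data: 200 rows, every 20th one pending.
rows = [
    (i, "pending" if i % 20 == 0 else "completed", f"2024-01-{(i % 28) + 1:02d}")
    for i in range(1, 201)
]
conn.executemany(
    "INSERT INTO orders (user_id, status, created_at) VALUES (?, ?, ?)", rows
)

# Partial index: only rows with status = 'pending' get an index entry.
conn.execute(
    "CREATE INDEX idx_pending ON orders (created_at) WHERE status = 'pending'"
)


def query_plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN, one string per step."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


PENDING_SQL = "SELECT * FROM orders WHERE status = 'pending' ORDER BY created_at"
COMPLETED_SQL = "SELECT * FROM orders WHERE status = 'completed' ORDER BY created_at"

matching_plan = query_plan(PENDING_SQL)   # WHERE clause matches the index condition
other_plan = query_plan(COMPLETED_SQL)    # index condition not implied, so unusable

pending_rows = conn.execute(
    "SELECT status, created_at FROM orders "
    "WHERE status = 'pending' ORDER BY created_at"
).fetchall()

print("pending query plan:  ", matching_plan)
print("completed query plan:", other_plan)
print("pending rows:", len(pending_rows))
```

The second query shows the trade-off: the smaller index is invisible to any query that does not filter on status = 'pending', so a partial index suits workloads where that filter is nearly always present.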