Skip to content

Analytics Service

Architecture Overview

Our analytics service is designed to handle large-scale payloads with high performance and reliability. The system implements the Materialization Pattern to optimize for both write throughput and read performance and Eventual Consistency to prioritize system availability over strong consistency guarantees.

Database Architecture

Core Tables

  • pages: Stores page metadata with unique page names
  • page_views: Stores raw analytics data with timestamps and view counts
  • page_views_report: Materialized view that joins pages with analytics data

Materialization Pattern Implementation

1. Write-Optimized Tables

The base tables (pages and page_views) are designed for high-frequency writes:

  • pages: Simple lookup table for page metadata
  • page_views: Optimized for bulk inserts with composite index on (name, date, hour)
  • No complex joins or aggregations during writes

2. Materialized View for Read Optimization

The page_views_report materialized view implements the materialization pattern:

  • Pre-computed Results: Joins and aggregations are pre-calculated
  • Optimized Indexes: Composite indexes on common query patterns
  • Fast Read Access: Eliminates runtime joins for analytics queries
  • Consistent Structure: Predictable data format for reporting

Service Architecture

1. Increments Service

Purpose: Handles real-time analytics data ingestion

Responsibilities:

  • Accepts page view increment requests
  • Updates pages table (upsert page metadata)
  • Updates page_views table (increment view counts)
  • Optimized for high-frequency, low-latency writes

Key Features:

  • Write-only access to base tables
  • Minimal processing overhead
  • Designed for horizontal scaling

2. The Wolf Service

Purpose: Maintains the materialized view for optimal read performance

Responsibilities:

  • Periodically refreshes page_views_report materialized view
  • Ensures data consistency between tables and view
  • Handles view optimization and maintenance

Key Features:

  • Scheduled refresh every X minutes
  • Background processing to avoid blocking writes
  • Maintains read performance for analytics queries

3. Analytics Service

Purpose: Provides analytics data through predefined filters

Responsibilities:

  • Queries only the page_views_report materialized view
  • Provides filtered analytics data
  • Handles complex reporting queries

Key Features:

  • Read-only access to optimized materialized view
  • Fast response times for analytics queries

Materialization Pattern Benefits

Write Performance

  • Separation of Concerns: Write operations only touch optimized base tables
  • Minimal Locking: No complex joins during writes

Read Performance

  • Pre-computed Joins: Materialized view eliminates runtime joins
  • Optimized Indexes: Composite indexes on common query patterns
  • Consistent Structure: Predictable data format for analytics
  • Cached Results: Materialized view acts as a read cache

Scalability

  • Service Isolation: Each service can scale independently
  • Database Optimization: Write-optimized tables + read-optimized views
  • Background Processing: View updates don't block user operations
  • Horizontal Distribution: Services can be deployed across multiple instances

Trade-off Decision

  • Availability First: System remains available even during materialized view updates
  • Stale Data Tolerance: Analytics queries may return slightly outdated data (up to X minutes old)
  • No Blocking Operations: Write operations never wait for view synchronization
  • Background Synchronization: The Wolf service updates the materialized view asynchronously

Benefits of Eventual Consistency

  • High Availability: System never goes down due to view refresh operations
  • Better Performance: No blocking operations during writes or reads
  • Horizontal Scaling: Services can operate independently without coordination
  • Fault Tolerance: Partial failures don't cascade across the entire system