【Database】

Don’t add indexes on low-cardinality fields (like status or enum columns with fewer than ~100 distinct values)

  • Even when the index is used, it’s not efficient:
    • For a field like status: ['pending', 'completed'], each value appears in ~50% of rows
    • The DB still needs to scan half the table, making the index practically useless
    • You’re paying the cost of index maintenance for no real benefit
  • When querying with status plus other fields (like id), the query planner might wrongly choose the status index over more selective indexes; see the composite-index sketch below
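
If a query filters by status together with a selective column, a composite index led by the selective column is usually the better shape. A minimal sketch, assuming a hypothetical orders table:

  • class AddUserStatusIndexToOrders < ActiveRecord::Migration[7.0]
      def change
        # Avoid: add_index :orders, :status (low cardinality, rarely helps)
        # An index led by the selective user_id column still serves
        # WHERE user_id = ? AND status = ?
        add_index :orders, [:user_id, :status]
      end
    end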

Always review migrations before deployment; handle large tables manually

  • Tables with millions of records can lock for significant periods during migrations
    • Lock duration
      • 1M rows: seconds of lock
      • 10M rows: minutes of lock
      • 100M+ rows: potential hours of lock
  • Use zero-downtime migration strategies (sketched after this list)
    • 1: Add column without default
    • 2: Backfill data in batches
    • 3: Add index concurrently (PostgreSQL)
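
A sketch of the three steps, assuming a hypothetical orders table gaining an external_ref column:

  • # Step 1: add the column without a default (metadata-only change, no table rewrite)
    class AddExternalRefToOrders < ActiveRecord::Migration[7.0]
      def change
        add_column :orders, :external_ref, :string
      end
    end

    # Step 2: backfill in batches so each transaction (and each lock) stays short
    Order.in_batches(of: 10_000) do |batch|
      batch.update_all(external_ref: 'legacy')
    end

    # Step 3: add the index concurrently (PostgreSQL); CONCURRENTLY cannot run
    # inside a transaction, hence disable_ddl_transaction!
    class IndexOrdersOnExternalRef < ActiveRecord::Migration[7.0]
      disable_ddl_transaction!

      def change
        add_index :orders, :external_ref, algorithm: :concurrently
      end
    end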

In Rails, beware of before_action callbacks at the top of the controller: the transaction starts at the first query and can stay open through the remaining callbacks.

  • Problem:

      class OrdersController < ApplicationController
        before_action :authenticate_user!   # starts the transaction here with the user query
        before_action :fetch_external_data  # long HTTP call while the transaction is open

        def create
          # order_params: strong-parameters helper (not shown)
          @order = Order.create!(order_params) # DB operation inside the transaction
        end
      end
    
  • Better approach:

      class OrdersController < ApplicationController
        skip_before_action :authenticate_user! # declared in ApplicationController

        def create
          # Do HTTP calls first, before any DB query
          external_data = fetch_external_data

          # Then authenticate (first query, so the transaction starts here)
          authenticate_user!

          # DB operations last, keeping the transaction short
          @order = Order.create!(order_params.merge(external_data: external_data))
        end
      end
    

A missing unique constraint in the database will cause data integrity issues

  • Add a database-level unique constraint
    • add_index :users, :email, unique: true
      
  • A model-level uniqueness validation is not enough
    • validates :email, presence: true, uniqueness: true # This is not enough
      
    • model validation flow (simplified):
      • def validate_uniqueness
          # Two processes can run this check at the same time:
          # does the email already exist in the database?
          email_taken = User.where(email: email).exists?

          # Both processes see false (no such email yet)...
          unless email_taken
            # ...so both create a user with the same email
            User.create!(email: email)
          end
        end
        
  • conclusion:
    • Race conditions under high concurrency
    • Duplicated data
    • Fixing the duplicated data afterwards is complex and risky (see the sketch below)
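
With the unique index in place, the application can still handle the race gracefully instead of crashing. A minimal sketch (create_user is a hypothetical helper):

  • def create_user(email)
      User.create!(email: email)
    rescue ActiveRecord::RecordNotUnique
      # The DB constraint rejected a concurrent duplicate insert;
      # return the row that won the race instead
      User.find_by!(email: email)
    end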

【Async Jobs】

Sidekiq

  • Idempotency! At least ensure database operations are idempotent (see the job sketch after this list)
    • external API calls are not idempotent; place them at the end of the process
    • when a pod is OOMKilled, k8s gives Sidekiq 10 seconds to finish the job; make sure Sidekiq can finish within those 10 seconds
  • Make sure retries don’t send too many error emails
  • Use defensive code to avoid errors
    • data can be corrupted in the middle of the process
    • the data may already be corrupted (e.g. the user was already soft-deleted by the paranoid gem)
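
A minimal sketch combining these rules (SettleOrderJob, PaymentGateway, and the order columns are hypothetical):

  • class SettleOrderJob
      include Sidekiq::Job
      sidekiq_options retry: 5

      def perform(order_id)
        order = Order.find_by(id: order_id)
        return if order.nil? # defensive: the record may be gone or soft-deleted

        # Idempotent DB operation: safe to repeat on retry
        order.update!(status: 'settled') unless order.status == 'settled'

        # Non-idempotent external call goes last, guarded so a retry does not resend
        unless order.receipt_sent?
          PaymentGateway.send_receipt(order)
          order.update!(receipt_sent: true)
        end
      end
    end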

【Business】

Never trust IDs from the frontend, always verify

  • ownership/authorization
    • never use Order.find(params[:id]); instead use current_user.orders.find(params[:id]) (sketch after this list)
  • security
    • Use UUIDs for IDs so attackers can’t guess the next ID
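
A minimal controller sketch of the scoped lookup:

  • class OrdersController < ApplicationController
      def show
        # Scoping through the association raises ActiveRecord::RecordNotFound
        # (rendered as 404) for any order the current user does not own
        @order = current_user.orders.find(params[:id])
      end
    end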

【Security】

Never hardcode secrets in the codebase (from SMS template_ids to JWT secrets)

  • prefer environment variables or the database
  • use Rails.application.credentials
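
A minimal sketch; jwt_secret and SMS_TEMPLATE_ID are example names:

  • # Edit encrypted credentials with: bin/rails credentials:edit
    # config/credentials.yml.enc then contains e.g. jwt_secret: "..."
    jwt_secret = Rails.application.credentials.jwt_secret

    # Deploy-specific values can come from the environment instead
    sms_template_id = ENV.fetch('SMS_TEMPLATE_ID')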

【Cache】

  • Don’t abuse Rails.cache methods. For Redis values over 1KB, implement two-level caching:
  • class DashboardCache
      def self.fetch_for_user(user)
        # Level 1: store only small keys/metadata in Redis
        cache_keys = Rails.cache.fetch("dashboard_keys_#{user.id}", expires_in: 1.hour) do
          timestamp = Time.current.to_i # evaluated once per hour; busts the level-2 keys
          {
            stats_key: "stats_#{user.id}_#{timestamp}",
            reports_key: "reports_#{user.id}_#{timestamp}"
          }
        end

        # Level 2: store the actual (large) payloads under separate keys
        stats = Rails.cache.fetch(cache_keys[:stats_key], expires_in: 1.hour) do
          DashboardService.generate_stats(user)
        end

        reports = Rails.cache.fetch(cache_keys[:reports_key], expires_in: 1.hour) do
          DashboardService.generate_reports(user)
        end

        { stats: stats, reports: reports }
      end
    end

【Code】

Before development: create branch, write design doc, get review, then code

【Performance】

  • Most performance issues come from database or external dependencies
    • check N+1 queries first (see the sketch at the end of this section)
    • implement monitoring
  • Memory
    • QRCode, Image generation, Excel, PDF generation
    • heavy dependencies in Dockerfile, ImageMagick, wkhtmltopdf etc
      • Docker image size
      • memory consumption at runtime
        • PDF >= 300MB
        • Image >= 100MB
        • Excel >= 100MB
    • use serverless (AWS Lambda) to handle them, e.g. over HTTP:
      • # http.rb gem; PDF_GENERATOR_URL points at the Lambda endpoint
        response = HTTP.post(
          ENV['PDF_GENERATOR_URL'],
          json: { template: 'reports/show', data: @report.data }
        )
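
A minimal N+1 sketch (assuming orders belong to users), referenced above:

  • # N+1: one query for orders, then one query per order for its user
    Order.all.each { |order| puts order.user.name }

    # Fixed: eager-load users with one extra query
    Order.includes(:user).each { |order| puts order.user.name }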