Database Transaction Monitor
Overview
Monitor active database transactions in real time to detect long-running queries, lock contention, uncommitted transactions, and transaction throughput anomalies across PostgreSQL, MySQL, and MongoDB.
Prerequisites
-
Database credentials with access to system catalogs (pg_stat_activity , information_schema.PROCESSLIST , or MongoDB currentOp )
-
psql , mysql , or mongosh CLI installed
-
Permissions to view other sessions' transactions (PostgreSQL: pg_monitor role; MySQL: PROCESS privilege)
-
Baseline metrics for normal transaction duration and throughput
-
Alerting infrastructure (email, Slack webhook, or PagerDuty) for notifications
Instructions
Query the active transaction view to establish a baseline. For PostgreSQL: SELECT pid, state, query_start, now() - query_start AS duration, query FROM pg_stat_activity WHERE state != 'idle' ORDER BY duration DESC . For MySQL: SELECT id, user, host, db, command, time, state, info FROM information_schema.PROCESSLIST WHERE command != 'Sleep' .
Identify long-running transactions by filtering for duration exceeding the application's expected transaction time. Set initial thresholds at 30 seconds for OLTP workloads or 5 minutes for batch/reporting workloads.
Detect idle-in-transaction sessions that hold locks without executing queries. For PostgreSQL: SELECT pid, state, query_start, now() - state_change AS idle_duration FROM pg_stat_activity WHERE state = 'idle in transaction' AND now() - state_change > interval '5 minutes' .
Monitor lock contention by querying the lock manager. For PostgreSQL: SELECT blocked_locks.pid AS blocked_pid, blocking_locks.pid AS blocking_pid, blocked_activity.query AS blocked_query FROM pg_catalog.pg_locks blocked_locks JOIN pg_catalog.pg_locks blocking_locks ON blocking_locks.locktype = blocked_locks.locktype . For MySQL: SELECT * FROM information_schema.INNODB_LOCK_WAITS .
Track transaction throughput by sampling pg_stat_database (xact_commit, xact_rollback) or MySQL Com_commit / Com_rollback status variables at regular intervals. Calculate commits/second and rollback ratio.
Create monitoring scripts that run on a cron schedule (every 30-60 seconds) to capture transaction metrics and write to a time-series store or log file.
Configure alerting thresholds: transactions exceeding 60 seconds, idle-in-transaction sessions exceeding 5 minutes, lock wait queues exceeding 10 waiters, and rollback ratio exceeding 5%.
Build a transaction summary dashboard query that shows: active transaction count, average duration, longest running transaction, lock wait count, and commits-per-second over the last hour.
Implement automatic remediation for known-safe scenarios: terminate idle-in-transaction sessions older than 30 minutes using SELECT pg_terminate_backend(pid) (PostgreSQL) or KILL connection_id (MySQL), with logging of terminated sessions.
Generate weekly transaction health reports summarizing peak transaction counts, P95/P99 duration percentiles, deadlock occurrences, and long-running transaction incidents.
Output
-
Transaction monitoring queries tailored to the specific database engine in use
-
Monitoring scripts (shell or Python) for scheduled transaction health checks
-
Alert configuration with threshold definitions and notification channel setup
-
Dashboard queries showing transaction throughput, duration distribution, and lock metrics
-
Weekly health report template with transaction performance trends and anomaly highlights
Error Handling
Error Cause Solution
pg_stat_activity returns no rows for other sessions Missing pg_monitor role or track_activities disabled Grant pg_monitor role; set track_activities = on in postgresql.conf
Lock monitoring query times out Massive lock table during contention storm Query pg_locks with a statement_timeout; reduce monitoring frequency during incidents
False positive alerts for long-running transactions Batch jobs or maintenance operations trigger duration alerts Create an exclusion list for known batch job PIDs or application users; use separate thresholds for batch vs OLTP
Transaction throughput drops to zero Connection pool exhaustion or database crash Check max_connections usage; verify database process is running; check for full disk or OOM conditions
Monitoring queries add overhead High-frequency polling of system catalogs Reduce polling interval to every 60 seconds; use pg_stat_statements for aggregated stats instead of per-query monitoring
Examples
Detecting a connection leak in a web application: Transaction count steadily increases over hours while commit rate remains flat. Monitoring reveals hundreds of idle in transaction sessions from the application server. Root cause: missing connection.close() in error handling paths. Resolution: terminate stale sessions and fix application connection management.
Identifying lock contention during peak hours: Dashboard shows lock wait count spiking from 0 to 50+ between 2-4 PM daily. Lock analysis reveals a nightly reporting query overlapping with high-volume order processing. Resolution: reschedule reporting queries to off-peak hours and add NOWAIT hints to critical transaction paths.
Tracking transaction rollback ratio spike: Rollback ratio jumps from 1% to 15% after a deployment. Transaction monitor logs show serialization failures on a frequently updated inventory table. Resolution: reduce transaction isolation level from SERIALIZABLE to READ COMMITTED for non-critical paths and add retry logic for serialization failures.
Resources
-
PostgreSQL monitoring views: https://www.postgresql.org/docs/current/monitoring-stats.html
-
MySQL performance schema: https://dev.mysql.com/doc/refman/8.0/en/performance-schema.html
-
MongoDB currentOp: https://www.mongodb.com/docs/manual/reference/method/db.currentOp/
-
pg_stat_statements extension: https://www.postgresql.org/docs/current/pgstatstatements.html
-
Lock monitoring best practices: https://wiki.postgresql.org/wiki/Lock_Monitoring