Skip to the content.

Home

Previous: Chapter 15 - The Index


📝 As an app user, I want to toggle the archived status of all users, so that I can run a ghost round.

Sometimes you want to update records instead of fetching them. Of course, this too might become an issue at scale. Our current request is a simple but costly update.

Open up a console.

Terminal

bin/rails c

Reset the archived status of all users and benchmark the update.

Rails Console

> puts Benchmark.measure { User.update_all("archived = NOT archived") }
  User Update All (19994.1ms)  UPDATE "users" SET archived = NOT archived
  0.017568   0.006330   0.023898 ( 20.049190)

On my machine, this run took about 20 seconds. Not so bad, right? Common Rails wisdom says to perform large updates like this in batches because of locks and such, but we don’t have to take anybody’s word for it. Let’s see for ourselves by running some updates and monitoring locks. Please note that for this exercise you’ll be opening three different database consoles.

Start the first database console.

Terminal

bin/rails dbconsole

Inspect the existing locks.

PostgreSQL Console (1)

=# SELECT relation::regclass, * FROM pg_locks;
 relation |  locktype  | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |      mode       | granted | fastpath | waitstart
----------+------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------+----------+-----------
 pg_locks | relation   |  1771898 |    12290 |      |       |            |               |         |       |          | 3/6104             | 19841 | AccessShareLock | t       | t        |
          | virtualxid |          |          |      |       | 3/6104     |               |         |       |          | 3/6104             | 19841 | ExclusiveLock   | t       | t        |
(2 rows)

You can see a couple of existing locks, but everything looks normal. For the rest of the chapter we’ll focus only on locks that have not been granted, but you’re encouraged to keep inspecting all locks at every step of the way to see what’s happening.

Look for locks that have yet to be granted.

PostgreSQL Console (1)

=# SELECT relation::regclass, * FROM pg_locks WHERE NOT granted;
 relation | locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath | waitstart
----------+----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-----+------+---------+----------+-----------
(0 rows)

Nothing here as expected.

Start a second console.

Terminal

bin/rails dbconsole

Begin a large update, promoting everyone to captain.

PostgreSQL Console (2)

=# begin;
=# UPDATE "users" SET name = 'Captain ' || name;
UPDATE 1000000

What we’ve done is began a transaction and performed an update of every row in the table. To perform the update, the transaction has acquired some number of locks. These locks will not be released until the transaction ends.

Go back to the first console and look for any locks that have yet to be granted.

PostgreSQL Console (1)

=# SELECT relation::regclass, * FROM pg_locks WHERE NOT granted;
 relation | locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath | waitstart
----------+----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-----+------+---------+----------+-----------
(0 rows)

Empty again. So far so good.

Start a third console.

Terminal

bin/rails dbconsole

Confirm that the locks do not affect data reads.

PostgreSQL Console (3)

=# SELECT name FROM users WHERE id = 1;
       name
------------------
 y4L2zNAefXtVDYGI
(1 row)

Reading data works fine, but notice we do get the “old” name. The update transaction has not committed any data.

Try a quick update of a single row.

PostgreSQL Console (3)

=# UPDATE "users" SET name = 'Major ' || name WHERE id = 1;

What’s happening here? No matter how long we wait this command does not return.

Go back to the first console and look for any locks that have yet to be granted.

PostgreSQL Console (1)

=# SELECT relation::regclass, * FROM pg_locks WHERE NOT granted;
 relation |   locktype    | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |   mode    | granted | fastpath |           waitstart
----------+---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------+---------+----------+-------------------------------
          | transactionid |          |          |      |       |            |       1187017 |         |       |          | 5/181              | 19947 | ShareLock | f       | f        | 2022-07-29 16:01:27.551946-04
(1 row)

Aha! We do have a lock that’s waiting to be granted.

Go back to the second console and finally commit the transaction.

PostgreSQL Console (2)

=# commit;

Go back to the first console and look for any locks that have yet to be granted.

PostgreSQL Console (1)

=# SELECT relation::regclass, * FROM pg_locks WHERE NOT granted;
 relation | locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath | waitstart
----------+----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-----+------+---------+----------+-----------
(0 rows)

Back in the clear!

Checking the third console, you’ll see that your last command was able to return successfully.

PostgreSQL Console (3)

UPDATE 1

Check the user’s final name.

PostgreSQL Console (3)

=# SELECT name FROM users WHERE id = 1;
              name
--------------------------------
 Major Captain y4L2zNAefXtVDYGI
(1 row)

What I personally take from this exercise is that the problem isn’t really with locking a large number of rows or even the entire table. The problem is the amount of time that we’re holding locks and blocking other operations. Think of a drive-through or any other queue. If a car comes through the drive-through and makes a single order with 100 items, then the cars behind them will be blocked until that order is complete. Since the system is designed to process a large amount of orders, the same car could split their order into batches of smaller orders. Spacing those orders out throughout the day would cause little-to-no disruption to the system. The same principle applies here. What we should do is split our request into a set of smaller requests that the system can easily handle. Our goal is to obtain the lock, do the work, and release the lock in a reasonable amount of time.

Toggle archived status again, but use in_batches.

Rails Console

> puts Benchmark.measure { User.in_batches.update_all("archived = NOT archived") }

This might run in about the same amount of time as before, but this time our updates were broken up into smaller chunks that didn’t clog up the drive-through.

âś… As an app user, I want to toggle the archived status of all users, so that I can run a ghost round.


Next: Thank You