
slower to build or slower to run?


Slower to run, but when you keep the Postgres connection open you know the job is still running, while with FOR UPDATE SKIP LOCKED you would basically need a status column and a job_timeout.

So pg_try_advisory_lock/pg_advisory_unlock can hold a lock across transactions while FOR UPDATE SKIP LOCKED can't; you would either need to keep a transaction open or use status + job_timeout (and in Postgres you should not keep transactions open for long).
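
To make the difference concrete, here is a rough sketch of both approaches (the jobs table, its columns, and the literal job id are illustrative, not from the original comment):

    -- advisory lock: tied to the session, not to a transaction
    SELECT pg_try_advisory_lock(42);   -- 42 = the claimed job's id
    -- run the job on this connection, across as many transactions as needed;
    -- as long as the connection stays open, the job is visibly "held"
    SELECT pg_advisory_unlock(42);

    -- FOR UPDATE SKIP LOCKED: the row lock only lives inside the transaction
    BEGIN;
    SELECT id
      FROM jobs
     WHERE state = 'pending'
     ORDER BY id
     LIMIT 1
       FOR UPDATE SKIP LOCKED;
    -- either keep this transaction open for the whole job,
    -- or set state = 'running' plus a job_timeout and commit right away
    COMMIT;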

Basically we use C#, but we looked into https://github.com/que-rb/que which uses advisory locks. Since our jobs take anywhere from 1 minute to 2 hours, using advisory locks was a no-brainer. It's just not the best fit if you have thousands of fast jobs per second, but for a more moderate queue, say 10,000 jobs per minute/10 minutes/30 minutes, each taking 1 minute to 2 hours, it's fine.

We also do not delete jobs; we do not care about storage since the job table does not take up much space anyway. And we have plenty of time to catch up at night since we only operate in Europe.


Here's my current favorite recipe for building complex job systems on PostgreSQL. I'm not thinking about "send an email"-type jobs, but bigger jobs that do complex tasks.

The usual trick I use is to have a `jobs.state` field containing "pending", "running", "done", or "error" (or whatever that job system needs). I only hold SELECT ... FOR UPDATE SKIP LOCKED (sketched after the list) long enough to:

1. Transition from "pending" to "running". Or a second time, to transition from "running" to either "done" or "error".

2. Store the current worker ID (often the Kubernetes pod name).
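
A minimal sketch of that claim step, assuming a jobs table with id, state, worker_id, and created_at columns (all names are illustrative):

    -- claim one pending job and record which worker took it;
    -- the row lock is held only for the duration of this statement
    WITH claimed AS (
        SELECT id
          FROM jobs
         WHERE state = 'pending'
         ORDER BY created_at
         LIMIT 1
           FOR UPDATE SKIP LOCKED
    )
    UPDATE jobs
       SET state = 'running',
           worker_id = 'my-pod-abc123'   -- e.g. the Kubernetes pod name
      FROM claimed
     WHERE jobs.id = claimed.id
    RETURNING jobs.id;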

Then, I can build a watcher that wakes up every 5 minutes, looks for "running" jobs with no corresponding Kubernetes pod, and marks them as "error". I try never to hold a lock for more than a second or two, and never to lock more than one job at once. This gets me 80% of the way there.
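
The watcher itself can be a plain UPDATE, assuming the list of live pod names is fetched from the Kubernetes API by the watcher process:

    -- mark jobs as failed when their worker pod no longer exists
    UPDATE jobs
       SET state = 'error'
     WHERE state = 'running'
       AND worker_id NOT IN ('pod-a', 'pod-b');   -- live pod names, illustrative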

The reason I don't hold a transaction open for the entire job is that every open transaction requires a PostgreSQL connection, and connections are surprisingly expensive. In fact, you may want to run connections through pgbouncer or a stateless REST API to avoid holding open hundreds or thousands of them. Everything except PostgreSQL itself should ideally be stateless and restartable.

You might also have a retry system, or jobs that recursively queue up child jobs, or jobs that depend on other jobs, or more elaborate state machines. You might have timeouts. Most of these things are solvable with some mix of transactions, some SQL, CREATE INDEX, or CREATE VIEW. A database gives you so many things for free, if you're just a little careful about it.
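
For example, a partial index for claimable work and a view for stuck jobs might look like this (the schema and the timeout are assumptions, not part of the original recipe):

    -- cheap lookup of claimable work
    CREATE INDEX jobs_pending_idx ON jobs (created_at) WHERE state = 'pending';

    -- a view that timeout/retry logic (and humans) can query directly
    CREATE VIEW stuck_jobs AS
        SELECT id, worker_id, started_at
          FROM jobs
         WHERE state = 'running'
           AND started_at < now() - interval '2 hours';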

And since Grafana can query PostgreSQL directly, you can easily build really nice dashboards for support and ops by just querying the tables used by the job system.
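
For instance, a panel query as simple as this (same assumed schema) gives a live view of queue depth by state:

    -- jobs per state, for a Grafana table or bar gauge panel
    SELECT state, count(*) AS jobs
      FROM jobs
     GROUP BY state
     ORDER BY jobs DESC;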

There are other ways to do it! But I'm fond of this general strategy for coarse-granularity jobs.


Isn’t this just https://news.ycombinator.com/item?id=29599132 with the addition of a custom worker ID (pod name)?


IMO the main issue with it: advisory locks in Postgres require the connection to be held open the entire time the lock is taken. Combine that with a thread-per-connection model...



