Background Jobs
No one likes waiting in line. This is especially true of your website: users don't want to wait for things to load that don't directly impact the task they're trying to accomplish. Take sending a "welcome" email when a new user signs up: sending that email could take as long as, or longer than, everything else that happens during the request combined. Why make the user wait for it? As long as they eventually get the email, everything is good.
Concepts
A typical create-user flow could look something like this:
If we want the email to be sent asynchronously, we can shuttle that process off into a background job:
The user's response is returned much more quickly, and the email is sent by another process, literally running in the background. All of the logic around sending the email is packaged up as a job, and a job worker is responsible for executing it.
Each job is completely self-contained and has everything it needs to perform its own task.
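To make the split between the request and the worker concrete, here is a minimal in-memory sketch. It is not Redwood's implementation: the names `queue`, `enqueue`, `handlers`, `createUser`, and `workOff` are hypothetical stand-ins, and a real system would persist jobs in storage rather than in an array.

```javascript
// Hypothetical in-memory stand-ins; a real system would persist jobs in storage.
const queue = [] // jobs waiting to be executed
const sent = [] // emails "sent" by the worker

// Each job is self-contained: a name plus every argument it needs.
const enqueue = (name, args) => queue.push({ name, args })

// Maps a job name to the function that does the work.
const handlers = {
  sendWelcomeEmail: (email) => sent.push(email),
}

// The request path: create the user, enqueue the email, return right away.
const createUser = (email) => {
  const user = { id: 1, email }
  enqueue('sendWelcomeEmail', [user.email])
  return user
}

// The worker path: runs separately from the request and drains the queue.
const workOff = () => {
  while (queue.length > 0) {
    const job = queue.shift()
    handlers[job.name](...job.args)
  }
}
```

Calling `createUser()` returns immediately with nothing sent yet; the email only goes out once `workOff()` runs, which in a real deployment happens in a separate process.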
Overview
There are three components to the Background Job system in Redwood:
- Scheduling
- Storage
- Execution
Scheduling is the main interface to background jobs from within your application code. This is where you tell the system to run a job at some point in the future, whether that's:
- as soon as possible
- after a delay of some amount of time
- at a specific datetime in the future
Storage is necessary so that your jobs are decoupled from your running application. The job system interfaces with storage via an adapter. With the included PrismaAdapter, jobs are stored in your database. This allows you to scale everything independently: the api server (which is scheduling jobs), the database (which is storing the jobs ready to be run), and the job workers (which are executing the jobs).
Execution is handled by a job worker, which takes a job from storage, executes it, and then does something with the result, whether it was a success or failure.
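The execution step can be pictured with a small sketch. Everything here is hypothetical (the `storage` array, the `status` values, and `workOne` are illustrative only); the real worker reads and updates jobs through the storage adapter.

```javascript
// Hypothetical in-memory storage; the real system talks to storage through an adapter.
const storage = [
  { id: 1, status: 'pending', perform: () => 'ok' },
  { id: 2, status: 'pending', perform: () => { throw new Error('boom') } },
]

// Take one pending job, execute it, and record the outcome either way.
const workOne = (store) => {
  const job = store.find((j) => j.status === 'pending')
  if (!job) return null // nothing to do right now

  try {
    job.result = job.perform()
    job.status = 'succeeded'
  } catch (error) {
    job.error = error.message
    job.status = 'failed'
  }
  return job
}
```

The point of the sketch is that a failure is recorded on the job instead of crashing the worker, so the worker can move on to the next job.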
When scheduling a job, you're really saying "this is the earliest possible time I want this job to run": based on what other jobs are in the queue, and how busy the workers are, they may not get a chance to execute this one particular job for an indeterminate amount of time.
The only thing that's guaranteed is that a job won't run any earlier than the time you specify.
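The "earliest possible time" rule can be sketched as a small calculation. This is an illustration under assumptions, not the library's code: `computeRunAt` and `isEligible` are hypothetical helpers, `wait` is the number of seconds described in the Quick Start, and `waitUntil` is an assumed option name for the specific-datetime case.

```javascript
// Hypothetical helper: turn scheduling options into the earliest run time.
// `wait` is in seconds; `waitUntil` (assumed name) is a Date.
// With neither option, the job may run as soon as possible.
const computeRunAt = (options = {}, now = new Date()) => {
  if (options.wait !== undefined) {
    return new Date(now.getTime() + options.wait * 1000)
  }
  if (options.waitUntil !== undefined) {
    return options.waitUntil
  }
  return now
}

// A job becomes eligible once its run time has passed.
// Eligible does not mean running: it may still wait for a free worker.
const isEligible = (job, now = new Date()) =>
  job.runAt.getTime() <= now.getTime()
```

A job scheduled with `{ wait: 60 }` is never eligible before the 60 seconds are up, but nothing in this check says how long after that point a worker will actually pick it up.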
Queues
Jobs are organized into named queues. A queue name is simply a string and has no special significance, other than letting you group jobs. Why group them? So that you can potentially have workers with different configurations working on them. Let's say you send a lot of emails, and you find that among all your other jobs, emails are starting to be noticeably delayed when sending. You can start assigning those jobs to the "email" queue and create a new worker group that only focuses on jobs in that queue so that they're sent in a more timely manner.
Jobs are sorted by priority before being selected to be worked on. Lower numbers mean higher priority:
You can also increase the number of workers in a group. If we bumped the group working on the "default" queue to 2 and started our new "email" group with 1 worker, once those workers started we would see them working on the following jobs:
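As a rough illustration of how queue filtering and priority ordering interact, here is a minimal sketch. The job records, the priority values, and the `nextJob` helper are all hypothetical; the tie-break on `id` is an assumption made only to keep the example deterministic.

```javascript
// Hypothetical job records: `queue` is a plain string, lower `priority` runs first.
const pendingJobs = [
  { id: 1, queue: 'default', priority: 50 },
  { id: 2, queue: 'email', priority: 50 },
  { id: 3, queue: 'default', priority: 10 },
  { id: 4, queue: 'email', priority: 1 },
]

// Pick the next job for a worker that only watches the given queues.
// filter() returns a new array, so sorting does not mutate `pending`.
const nextJob = (pending, queues) =>
  pending
    .filter((job) => queues.includes(job.queue))
    .sort((a, b) => a.priority - b.priority || a.id - b.id)[0]

nextJob(pendingJobs, ['default']) // job 3: lowest priority number in "default"
nextJob(pendingJobs, ['email']) // job 4: lowest priority number in "email"
```

A worker dedicated to the "email" queue never sees jobs 1 and 3, which is why a separate worker group keeps emails from being delayed by everything else.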
Quick Start
Start here if you want to get up and running with jobs as quickly as possible and worry about the details later.
Setup
Run the setup command to get the jobs configuration file created and migrate the database with a new BackgroundJob table:
yarn rw setup jobs
yarn rw prisma migrate dev
This created api/src/lib/jobs.js (or .ts) with a sensible default config. You can leave this as is for now.
Create a Job
yarn rw g job SampleJob
This created api/src/jobs/SampleJob/SampleJob.js and a test and scenario file. For now the job just outputs a message to the logs, but you'll fill out the perform() function to take any arguments you want and perform any work you want to do. Let's update the job to take a user's id and then just print that to the logs:
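import { jobs } from 'src/lib/jobs'

export const SampleJob = jobs.createJob({
  // Name of the queue this job is placed in
  queue: 'default',
  // The work to do; receives the arguments passed when the job was scheduled
  perform: async (userId) => {
    jobs.logger.info(`Received user id ${userId}`)
  },
})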
Schedule a Job
You'll most likely be scheduling work as the result of one of your service functions being executed. Let's say we want to schedule our SampleJob whenever a new user is created:
import { db } from 'src/lib/db'
import { later } from 'src/lib/jobs'
import { SampleJob } from 'src/jobs/SampleJob'

export const createUser = async ({ input }) => {
  const user = await db.user.create({ data: input })
  // Schedule SampleJob with the new user's id, to run no earlier than 60 seconds from now
  await later(SampleJob, [user.id], { wait: 60 })
  return user
}
The first argument is the job itself, and the second is an array of all the arguments your job should receive. The job itself defines them as normal, named arguments (like userId), but when you schedule you wrap them in an array (like [user.id]). The third argument is an optional object that provides a couple of options. In this case, it sets the number of seconds to wait before this job will be run (60 seconds).
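The relationship between the scheduled array and the job's named arguments can be shown with a plain JavaScript sketch. `WelcomeJob` and `runJob` are hypothetical and exist only for this illustration; the assumption is that the stored array is spread back into positional arguments when the job executes.

```javascript
// Hypothetical job with two named arguments.
const WelcomeJob = {
  perform: (userId, template) => `user ${userId} gets ${template}`,
}

// Hypothetical stand-in for the worker: spread the stored array
// back into the job's positional arguments.
const runJob = (job, args) => job.perform(...args)

runJob(WelcomeJob, [123, 'welcome']) // → 'user 123 gets welcome'
```

The order of the array is what matters: the first element becomes the first argument of perform(), the second element becomes the second, and so on.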
Executing a Job
Start the worker process to find jobs in the DB and execute them:
yarn rw jobs work
This process will stay attached to the terminal and show you debug log output as it looks for jobs to run. Note that since we scheduled our job to wait 60 seconds before running, the worker will not find a job to work on right away (unless it's already been a minute since you scheduled it!).
That's the basics of jobs! Keep reading to get a more detailed walkthrough, followed by the API docs listing all the various options. We'll wrap up with a discussion of using jobs in a production environment.