Scheduling Jupyter notebooks
Scheduling Jupyter notebooks means running them automatically at specified times. This allows you or your team to rely on the results of a notebook without having to manually run it. For example, you could keep a dataset up to date by periodically pulling from a data source. Or you could run a notebook that generates a report and send it to your team over email or Slack.
Jupyter itself has no built-in mechanism for scheduling notebooks. You can either add tooling around it to handle the scheduling, or use a Jupyter-compatible tool that has scheduling built in.
Scheduling Jupyter notebooks locally
To schedule a Jupyter notebook yourself, you’ll need a script that executes the notebook and a scheduler that runs that script at the times you want. The scheduler is usually cron, and the script usually executes the notebook with a tool like nbconvert or papermill.
For example, this blog post shows how to use papermill and cron to schedule running a notebook.
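As a rough illustration of the script-plus-cron approach, here is a minimal sketch that uses nbconvert (rather than papermill) to execute a notebook and save a timestamped copy of the result. The function names, the "runs" output directory, and the paths in the cron comment are all hypothetical, and the sketch assumes the jupyter command is available on the PATH that cron uses.

```python
import subprocess
import sys
from datetime import datetime
from pathlib import Path
from typing import List


def build_command(notebook: Path, output_dir: Path, now: datetime) -> List[str]:
    """Build the nbconvert command that executes `notebook`.

    The executed copy gets a timestamped name so earlier runs are kept.
    """
    stamp = now.strftime("%Y%m%d-%H%M%S")
    output_name = f"{notebook.stem}-{stamp}"  # nbconvert appends .ipynb
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",      # write the result as a notebook
        "--execute",             # run every cell before writing
        "--output", output_name,
        "--output-dir", str(output_dir),
        str(notebook),
    ]


def run_notebook(notebook: Path, output_dir: Path) -> int:
    """Execute the notebook once and return nbconvert's exit code."""
    output_dir.mkdir(parents=True, exist_ok=True)
    cmd = build_command(notebook, output_dir, datetime.now())
    return subprocess.run(cmd, check=False).returncode


# Hypothetical crontab entry (added with `crontab -e`) that runs this
# script every day at 07:00; all paths are placeholders:
#   0 7 * * * /usr/bin/python3 /path/to/run_notebook.py /path/to/report.ipynb
if __name__ == "__main__" and len(sys.argv) > 1:
    notebook_path = Path(sys.argv[1])
    # Assumption: executed copies go in a "runs" folder next to the notebook.
    sys.exit(run_notebook(notebook_path, notebook_path.parent / "runs"))
```

A non-zero exit code means nbconvert reported a failure, so a wrapper script or the cron configuration could use it to send an alert. papermill could be swapped in for the nbconvert call if the notebook needs parameters.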
The risk with this approach is that the computer running the schedule must always be on. If you run this script on your local computer, for example, scheduled runs are skipped whenever the computer is turned off or asleep. To run the schedule reliably, you’ll need to run the script on a machine that is always on, such as a server in the cloud.
Use a Jupyter-compatible notebook with scheduling built-in
It’s much easier, and much more reliable, to use a tool with scheduling built in. These tools let you write Jupyter notebooks the way you’re used to, and provide an easy-to-use UI for scheduling notebook runs. Most of them also notify you if something goes wrong.
Below are some notebook tools that are Jupyter-compatible and have first-class support for scheduling.