Data Science Notebooks

Data science gets done in notebooks. This website exists to compare the features in different data science notebook tools.

Robert Lacok
About the author

My name is Robert Lacok, and I’m a data notebook enthusiast. I keep on top of the latest developments in the space, and I wanted to share what I learn with the world.

I’m also a product manager at Deepnote. I try to be unbiased — if you believe any tools are missing or misrepresented, please email me or open a pull request on GitHub.

Need help?

If you need help picking a data notebook for your next project, feel free to reach out to me at my personal email address. I’d be happy to chat about the pros and cons of each solution.

Scheduling Jupyter notebooks

Scheduling Jupyter notebooks means running them automatically at specified times. This allows you or your team to rely on the results of a notebook without having to manually run it. For example, you could keep a dataset up to date by periodically pulling from a data source. Or you could run a notebook that generates a report and send it to your team over email or Slack.

Jupyter itself has no built-in mechanism for scheduling notebooks. You can add separate tooling to do it for you, or use a Jupyter-compatible tool that has scheduling built in.

Scheduling Jupyter notebooks locally

To schedule a Jupyter notebook, you’ll need to write a script that runs on the schedule you want. This is usually done with cron for the scheduling and a tool such as nbconvert or papermill to execute the notebook.
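As a rough illustration of the nbconvert route, the sketch below builds and runs the `jupyter nbconvert --to notebook --execute` command from Python. The notebook name `report.ipynb`, the `runs` directory, and the helper names are assumptions made up for this example, and it presumes `jupyter` and `nbconvert` are installed and on the `PATH`.

```python
import subprocess
from pathlib import Path
from typing import List


def build_nbconvert_command(notebook: Path, output_dir: Path) -> List[str]:
    """Build the command that executes a notebook and saves the executed copy."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",               # write the result as a notebook again
        "--execute",                      # run every cell before saving
        "--output-dir", str(output_dir),  # keep executed copies out of the source folder
        str(notebook),
    ]


def execute_notebook(notebook: Path, output_dir: Path) -> int:
    """Run the notebook once and return the exit code (non-zero means failure)."""
    output_dir.mkdir(parents=True, exist_ok=True)
    completed = subprocess.run(build_nbconvert_command(notebook, output_dir))
    return completed.returncode
```

Calling `execute_notebook(Path("report.ipynb"), Path("runs"))` from a script gives the scheduler something to invoke. Cron usually runs with a minimal `PATH`, so the script may need the full path to the `jupyter` executable.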

For example, this blog post shows how to use papermill and cron to schedule running a notebook.
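A minimal sketch of the papermill-plus-cron approach might look like the following. It is not taken from the blog post; the file names, the `runs` directory, the helper names, and the crontab entry in the comments are all hypothetical, and papermill has to be installed separately (`pip install papermill`).

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def timestamped_output_path(notebook: Path, output_dir: Path, now: datetime) -> Path:
    """Give each run its own output file so earlier results are not overwritten."""
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return output_dir / f"{notebook.stem}-{stamp}.ipynb"


def run_notebook(notebook: Path, output_dir: Path,
                 parameters: Optional[dict] = None) -> Path:
    """Execute the notebook with papermill and return the path of the executed copy."""
    import papermill as pm  # third-party dependency, imported only when a run starts

    output_dir.mkdir(parents=True, exist_ok=True)
    output = timestamped_output_path(notebook, output_dir, datetime.now())
    # Raises an exception if a cell fails, so the script exits with a non-zero status.
    pm.execute_notebook(str(notebook), str(output), parameters=parameters or {})
    return output


# Hypothetical crontab entry (edit with `crontab -e`) that runs a script
# calling run_notebook(...) every day at 07:00; both paths are placeholders:
#   0 7 * * * /usr/bin/python3 /home/you/run_report.py
```

A script scheduled this way would end with a call such as `run_notebook(Path("report.ipynb"), Path("runs"))`. Keeping one timestamped output per run makes it easier to see which scheduled execution produced which result.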

The drawback is that the computer needs to be running at all times. If you run this script on your local computer, for example, scheduled runs are skipped whenever the computer is turned off or asleep. To get it running reliably, you’ll need to run the script on an always-on machine, such as a server in the cloud.

Use a Jupyter-compatible notebook with scheduling built in

It’s much easier, and much more reliable, to use a tool with scheduling built in. These tools let you write Jupyter notebooks the way you’re used to, and they provide an easy-to-use UI for creating the schedules your notebooks run on. Most of them also notify you if something goes wrong.

Below are some notebook tools that are Jupyter-compatible and have first-class support for scheduling.

Deepnote

Deepnote is a new kind of data notebook that’s built for collaboration — Jupyter compatible, works magically in the cloud, and sharing is as easy as sending a link.

Hex

The Data Workspace for Teams. Work with data in collaborative SQL and Python notebooks. Share as interactive data apps that anyone can use.

Databricks Notebooks

Collaborate across engineering, data science, and machine learning teams with support for multiple languages, built-in data visualizations, automatic versioning, and operationalization with jobs.

JetBrains Datalore

A powerful online environment for Jupyter notebooks. Use smart coding assistance for Python in online Jupyter notebooks, run code on powerful CPUs and GPUs, collaborate in real-time, and easily share the results.

Nextjournal

Runs anything you can put into a Docker container. Improve your workflow with polyglot notebooks, automatic versioning and real-time collaboration. Save time and money with on-demand provisioning, including GPU support.

Noteable

Noteable is a collaborative notebook platform that enables teams to use and visualize data, together.
