Data Science Notebooks

Data science gets done in notebooks. This website compares the features of different data science notebook tools.

Robert Lacok
About the author

My name is Robert Lacok, and I’m a data notebook enthusiast. I keep on top of the latest developments in the space, and I wanted to share what I learn with the world.

I’m also a product manager at Deepnote. I try to be unbiased — if you believe any tools are missing or misrepresented, please email me or open a pull request on GitHub.

Need help?

If you need help picking a data notebook for your next project, feel free to reach out to me at my personal email address. I’d be happy to chat about the pros and cons of each solution.


Version control in Jupyter notebooks

When you’re writing a Jupyter notebook, it’s useful to track changes. This means that you can go back to a previous version of the notebook, or compare different versions. As with any versioned document, you can make changes without worrying about losing your previous work.

Git works for version-controlling Jupyter notebooks, but other Jupyter-compatible tools offer more ergonomic options.


Using Git to version control Jupyter notebooks

Jupyter notebooks are just files (JSON documents with an .ipynb extension), so the default option is to version control them with Git. This works especially well for production systems or libraries where there is already code being tracked with Git. It fits right in with your existing workflow.

However, this comes with several drawbacks. Git is a tool made for software engineers working primarily in text files. It’s not designed for the specific needs of data scientists or the specific needs of Jupyter notebooks.

  • You have to remember to make commits; otherwise, your changes won’t be tracked. When you’re working in a notebook, you’re often making small changes and running code. You might not want to commit every time you run a cell.
  • You have to remember to sync with a remote repository; otherwise, your changes won’t be backed up, or they will conflict with other people’s changes.
  • By default, diffs show the raw JSON of the notebook file, including cell outputs and execution counts, which is hard to read.
  • Resolving conflicts is almost impossible without specialized tooling.
A diff of a Jupyter notebook

This is the reality of comparing notebooks in Git without additional tooling.
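To make the problem concrete, here is a minimal sketch that builds two versions of a tiny one-cell notebook and compares them as text, which is roughly what a plain Git diff does. The notebook contents and the helper names `make_notebook` and `text_diff` are made up for illustration, and the serialization settings only approximate how Jupyter writes files.

```python
import difflib
import json


def make_notebook(source, output_text, execution_count):
    """Build a minimal nbformat-4-style notebook dict with one code cell."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {
                        "name": "stdout",
                        "output_type": "stream",
                        "text": [output_text],
                    }
                ],
                "source": [source],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def text_diff(old, new):
    """Diff two notebooks line by line after serializing them to JSON text."""
    old_lines = json.dumps(old, indent=1, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=1, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(
            old_lines, new_lines, "before.ipynb", "after.ipynb", lineterm=""
        )
    )


# Hypothetical edit: one character of code changes, then the cell is re-run.
before = make_notebook("print(1 + 1)", "2\n", execution_count=1)
after = make_notebook("print(1 + 2)", "3\n", execution_count=5)

diff_lines = text_diff(before, after)
print("\n".join(diff_lines))
```

Only one of the changed lines in this diff is the actual code edit. The others are the execution count and the stored output, which change every time a cell is re-run.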

Extra tooling will make your life easier if you want to go down this route. For example, nbdev has impressive capabilities for resolving conflicts and JupyterLab has a Git extension.
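A common way to make Git diffs quieter, independent of any specific tool, is to strip outputs and execution counts before committing so that only source changes show up. The sketch below does this with the standard library only. The function name `strip_outputs` and the sample notebook are hypothetical, and this is not the API of nbdev or the JupyterLab Git extension.

```python
import copy
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs and counts cleared."""
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Hypothetical notebook with one markdown cell and one executed code cell.
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Analysis"]},
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {},
            "outputs": [
                {"name": "stdout", "output_type": "stream", "text": ["42\n"]}
            ],
            "source": ["print(6 * 7)"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

cleaned = strip_outputs(notebook)
print(json.dumps(cleaned, indent=1, sort_keys=True))
```

The trade-off is that the committed file no longer contains results, so anyone who checks it out has to re-run the notebook to see them.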

Use a Jupyter-compatible notebook with version control built-in

If you don’t need the software engineering discipline that Git offers, there are other options. Fully managed notebook tools have version control built in. You make a change to the notebook, and it’s automatically saved. You can go back to previous versions, and you can see a diff of the changes.

This is a great option for data scientists who want to focus on the data science, not the software engineering.

The best tools are the ones that offer real-time collaboration as well as versioning. This way, you never have to worry about conflicting with other people or overwriting each other’s work.

Below are some notebook tools that are Jupyter-compatible, have version control, and have real-time collaboration.


Deepnote

Deepnote is a new kind of data notebook that’s built for collaboration — Jupyter compatible, works magically in the cloud, and sharing is as easy as sending a link.


Hex

The Data Workspace for Teams. Work with data in collaborative SQL and Python notebooks. Share as interactive data apps that anyone can use.


Databricks Notebooks

Collaborate across engineering, data science, and machine learning teams with support for multiple languages, built-in data visualizations, automatic versioning, and operationalization with jobs.


DataCamp Workspace

DataCamp Workspace is an AI-powered data notebook to help you get from data to insights, faster.


CoCalc

Your best choice for teaching remote scientific courses.


JetBrains Datalore

A powerful online environment for Jupyter notebooks. Use smart coding assistance for Python in online Jupyter notebooks, run code on powerful CPUs and GPUs, collaborate in real-time, and easily share the results.


Nextjournal

Runs anything you can put into a Docker container. Improve your workflow with polyglot notebooks, automatic versioning and real-time collaboration. Save time and money with on-demand provisioning, including GPU support.


Noteable

Noteable is a collaborative notebook platform that enables teams to use and visualize data, together.
