Showing posts from February, 2023

How we lost all user data on our Jupyter Notebook service

On Tuesday the 21st of February we did some maintenance on the Kubernetes cluster that hosts our Jupyter notebook service. This maintenance resulted in the deletion of all user data that wasn't actively in use. At the time of the maintenance, that meant the data of all of our users.

In preparation for moving our Kubernetes cluster to different hardware, we needed to shrink our Kubernetes workers to free up space on the underlying hypervisors. So, one by one, we drained and deleted each worker, then created a new, smaller worker and joined it back into the cluster. This all went smoothly, and the JupyterHub service was unaffected except when we needed to move the hub process. Unfortunately you can only run one hub process, so the web interface was down for about a minute.

Once the rolling rebuild of our workers was complete, we made sure the service was working as expected. All looked fine, except we noticed that the volumes attached to each user's pod were empty and fresh