Early Exploration with Docker & Kubernetes

12/22/2018

Today I began to structure and fill in the basic information for the Open Science/Docker Documentation.

12/24/2018 - 01/01/2019

Today I started by watching a video to get ahead of the problems I might face going forward with JupyterHub. I watched someone from Berkeley describe how they’ve scaled to thousands of users across many professors’ courses without growing their IT team.

I will start by using this space here to take notes on the following video(s):

Deploying Jupyter Notebooks for Students and Researchers (PyData 2016)


Managing a 1000+ Student JupyterHub without Losing your Sanity (JupyterCon 2018)

Here are the stated goals in the video:

  • [7:30] Infrastructure Shouldn’t Bottleneck
    • Instructors should be able to install packages as needed
  • [8:00] Anyone can Deploy
    • Treat Admins as Equals (grad students, teachers, etc.)
  • [9:00] Automate Workflows
    • Avoid manual processes. No one-off scripts to bootstrap solutions
  • [9:30] Reduce Human Maintenance
    • Build on top of tools that already exist
    • Let academics edit the Dockerfile directly on GitHub and issue a pull request.
    • A Travis process builds the image and, if successful, requests to merge.
    • Rather than using a CLI, hit a couple of buttons to set off other Travis processes that update Helm.
  • [13:55]: Reproducibility
    • Tag versions of everything.
    • The release cycles of the core infrastructure components do not align with each other.
    • In the pip requirements file, in the Dockerfile, all of it.
    • Get git hashes for the latest commits.
    • Tag Docker images with the hashes of the repositories that generated them.
    • This makes it much easier to re-deploy! (See the tagging sketch after this list.)
  • [15:30] Horror Story of Hub failing during Finals week because a Cloud bill was unpaid (grant ran out)
    • All the versioning helped save them
    • Despite coming up on different nodes, the same hub came out of it.
  • [17:00] Observability
    • Monitoring (container provider may have analytics)
    • This may be less of a problem for us if we deploy on our own servers.
    • Figure out students’ actual needs in order to decide how many nodes you need.
    • After observing, they noticed most students used under 1GB, allowing them to double capacity per node. Students that went above were writing runaway processes anyway.
  • [19:30] Incident Reporting
    • Any time something goes down, write a report.
    • [Incident Reports at Berkeley](https://github.com/data-8/infrastructure)
    • Improves deployment.
    • Event summary, timeline, conclusion, and action items.
    • Reports are “blameless” to encourage transparency with relevant details
  • [21:30] Generalization
    • Be a good open-source citizen
    • This means digging into problems and reporting bugs
    • Not every problem is your fault. Sometimes an upstream thing breaks it, and if you report it, you may save other people time/effort.
    • Don’t rely on forks and custom patches. Contribute. Think about reproducible deployment scenarios.
  • [24:30] Contact Information (encouraged)
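
A minimal sketch of the hash-tagging idea above (the image and repository names here are placeholders of my own, not anything from the talk):

# grab the short git hash of the repo that generates the image
GIT_SHA=$(git rev-parse --short HEAD)

# build and push the image tagged with that hash
docker build -t myorg/classhub:${GIT_SHA} .
docker push myorg/classhub:${GIT_SHA}

Re-deploying a known-good state is then just a matter of pointing back at the old tag.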

Scaling JupyterHub to many users (PyData Tel-Aviv 2018)


“JupyterHub and JupyterLab: Perfect Together (PyData 2018)”


JupyterHub with Kubernetes:

Log

A lot of solutions exist for using Docker to handle file storage and user authentication.

Here is the solution that I think will work best for workshops:

Open DreamKit


[11:37 AM] Pilosov, Michael

Okay, can’t confirm until I try it, but I am reasonably certain that while this set-up is useful in many contexts, it doesn’t address the teacher/student/group collab management that we wanted (and that I focused on getting working first and have already done).

[11:39 AM] Pilosov, Michael

The way the build works here with data management is actually much “slimmer” in fact, and I think there’s a use for it for us for sure… but I suspect it doesn’t allow two users to share data, or a super-user to poke around in files.

[11:41 AM] Pilosov, Michael

But it would allow for secure authentication for, like, a workshop, just using people’s GitHub accounts, and data-persistence and management would be more ideal in that sense, since you can turn on permanence for each user and destroy it when the workshop is over as superuser (but without ever being able to poke around in their files, which is a nice security feature for the user I suppose…).

[11:52 AM] Pilosov, Michael

In short, what we want instead is for Docker to launch up a virtual LINUX machine, where all memory persists within the container. It’s as if you set up a “new computer” for each class/workshop. Each class gets its own port. Importantly, this makes “management” very familiar, identical to what you already know how to do, rather than having to “learn docker” to manage students’ work. Also, updating things can be done this way without re-building images through Docker, which is really nice for on-the-fly changes.

What’s the downside, you ask? It’s definitely less memory efficient, but it’s not a problem at all. We’re talking, like… cheap-laptop levels of storage, nothing major.


[12:20 PM] Finished watching the first video and went through the tutorial above for the most part. I still think the Linux-within-each-container approach is the simplest option for someone to manage on their own. It is definitely the most familiar environment.

To start, let me ssh into my workstation and see if I can revive the docker container from the summer workshop. I will log my commands here!

# start the server
docker run -td -p 80:8000 --name=labhub mathematicalmichael/labhub-test

# check that it is up
docker ps

# find this machine's IP (assuming ifconfig is available); look for
# inet addr: <XXX.XXX.XXX.XXX>
ifconfig

Then we connect to auraria-anywhere via Cisco AnyConnect and attempt to connect to <workstation IP>:8000 via our browser (note that with -p 80:8000, Docker actually publishes the hub on host port 80, since -p maps host:container).

This site can’t be reached

Debug mode:

docker logs labhub

showed me that

Creating 3 new users...
Created user0 with password Breckenridge0_g2s3
Created user1 with password Breckenridge1_g2s3
Created user2 with password Breckenridge2_g2s3
*** Running /etc/rc.local...
*** Booting runit daemon...
*** Runit started as PID 66
*** Running jupyterhub --Spawner.cmd='jupyter-labhub' --no-ssl...
*** jupyterhub --Spawner.cmd='jupyter-labhub' --no-ssl exited with status 127.
*** Shutting down runit daemon (PID 66)...
*** Killing all processes...

So that tells me I messed up the Dockerfile. It might be instructive to just learn from what they did. I don’t really need their libraries (which I believe is what was causing the build to fail before).

**It will definitely be better to start with a fresh Linux image and go from there, since we want to build on top of the newest release anyway, make sure we understand how it’s done, and keep images/layers light.**

From MUQ-Hippylib (see muq for more info)

FROM quay.io/fenicsproject/stable:2017.2.0.r3
MAINTAINER U. Villa

USER root

RUN apt-get update && \
    apt-get install -yy pwgen npm nodejs-legacy python3-pip libgeos-dev && \
    npm install -g configurable-http-proxy && \
    pip3 install jupyterhub==0.8.1 && \
    pip3 install ipython[notebook]==6.2.1 h5py pandas && \
    pip install --user https://github.com/matplotlib/basemap/archive/master.zip

RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

#RUN mkdir /etc/certs
#RUN touch /etc/certs/ssl.key
#RUN touch /etc/certs/ssl.crt
#RUN openssl req -x509 -nodes -days 730 -newkey rsa:2048 \
#                 -subj "/C=XY/ST=XYZ/L=XYZ/O=XYZ/CN=example.com" \
#                 -keyout /etc/certs/ssl.key -out /etc/certs/ssl.crt

USER fenics

# Install MUQ
RUN cd /home/fenics && \
    mkdir Installations; mkdir Installations/MUQ_INSTALL && \
    git clone --depth 1 https://bitbucket.org/mituq/muq2.git && \
    cd muq2/; mkdir build; cd build;  \
    cmake -DCMAKE_INSTALL_PREFIX=/home/fenics/Installations/MUQ_INSTALL -DMUQ_USE_PYTHON=ON ../ && \
    make install

# Install hIPPYlib
RUN cd /home/fenics/Installations && \
    git clone https://github.com/hippylib/hippylib.git && \
    chmod -R o+rx hippylib

# Copy the notebooks
RUN cd /home/fenics/Installations && \
    git clone https://github.com/g2s3-2018/labs.git

COPY python3_config.json /usr/local/share/jupyter/kernels/python3/kernel.json
ENV LD_LIBRARY_PATH /home/fenics/Installations/MUQ_INSTALL/lib:/home/fenics/Installations/MUQ_INSTALL/muq_external/lib
ENV PYTHONPATH /home/fenics/Installations/MUQ_INSTALL/lib:/home/fenics/Installations/hippylib

USER root

COPY jupyterhub_config.py /home/fenics/jupyterhub_config.py
COPY make-users-std-password.sh /etc/my_init.d/make-users-std-password.sh
RUN chmod +x /etc/my_init.d/make-users-std-password.sh
RUN rm /etc/my_init.d/set-home-permissions.sh
COPY update_lab.sh /home/fenics/update_lab.sh
RUN chmod +x /home/fenics/update_lab.sh
RUN mkdir -p /home/fenics/.jupyter
COPY jupyter_notebook_config.py /home/fenics/.jupyter/jupyter_notebook_config.py


ENV NUMBER_OF_USERS 60
WORKDIR /home/fenics/
ENTRYPOINT ["/sbin/my_init","--"]
CMD ["jupyterhub"]

So it appears that we’re starting with a root image from the Fenics Project. Quay.io has a summary of security vulnerabilities for each repository tag.

The one used by Dr. Umberto Villa had 3 High-Risk Vulnerabilities and 151 fixable. More modern ones such as 2018.1.0.r3 have only 57 Medium-Risk ones and 25 fixable.

This is a considerable advantage, and thus I believe it is the correct place to start from. It is also evident that the developers have been working on reducing image sizes, with a clear downward trend in each new release. The latest also seems to be built on top of Ubuntu 18.04, which is exactly what we wanted!

Wonderful.

Okay, but before I go on building my customized version[^builderrors], I want to remember how to get the hippylib-muq one working. I have the Docker Hub image mparno/muq-hippylib and can run it (create an instance) with

docker run -td -p 80:8000 --name=labhub mparno/muq-hippylib

So this creates an instance mapping host port 80 (the default port when you land on an IP; something we should later replace with sub-domains, so we get classhub.math.ucdenver.edu rather than ports to memorize) to the container’s port 8000. This “instance” (container) is now created with the ports mapped, and -d already starts it; docker start is what brings it back after a stop.

docker start labhub

# now can go to base IP address and log in! 

# later on... stop it.
docker stop labhub

Alright, that’s all well and good. We now have a baseline set of instructions to work from and learn from. However, seeing as quite a lot is happening in those instructions, and no SSL security has been established, I would perhaps rather start with Dockerfiles from the Jupyter Project. Moreover, those will be newer versions and should come with Lab enabled already, since it is finally in a stable release.

[^builderrors]: I remember that some updates to MUQ (likely because of a lack of pinning to specific releases) caused build errors when I tried to start up my labhub. Though it might have been the initial commands, based on the readout above.

So we’ll have to do two things:

  • Get Hub/Lab up and running with a single user based on the instructions from Project Jupyter.
  • Use this on top of the Fenics build to get import dolfin working inside of the Hub.

NOTE: The ease of sharing memory for class use is actually a BRAND NEW effort from Jupyter (a week old?). See here, which should remove the need for the solution we’re building and allow for much better, more lightweight scalability.

I just came across dockerspawner which can be enabled with the following in jupyterhub_config.py:

c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'

More appropriate for our use case is SystemUserSpawner:

If you want to spawn notebook servers for users that correspond to system users, you can use the SystemUserSpawner instead. Add the following to your jupyterhub_config.py:

c.JupyterHub.spawner_class = 'dockerspawner.SystemUserSpawner'

The SystemUserSpawner will also need to know where the user home directories are on the host. By default, it expects them to be in /home/<username>, but if you want to change this, you’ll need to further modify the jupyterhub_config.py. For example, the following will look for a user’s home directory on the host system at /volumes/user/<username>:

c.SystemUserSpawner.host_homedir_format_string = '/volumes/user/{username}'

For a full example of how SystemUserSpawner is used, see the compmodels-jupyterhub (warning: old) repository (this additionally runs the JupyterHub server within a docker container, and authenticates users using GitHub OAuth).


I believe the correct entry-point for us will actually be here, at Project Jupyter’s Deploy-Docker repository.

jupyterhub-deploy-docker provides a reference deployment of JupyterHub, a multi-user Jupyter Notebook environment, on a single host using Docker. This deployment is NOT intended for a production environment. It is a reference implementation that does not meet traditional requirements in terms of availability nor scalability.

[schematic of the deployment architecture, from the jupyterhub-deploy-docker README]

Okay, so this is actually a great solution, but it’s more or less the same thing as the “French” solution, with the encryption handled directly by JupyterHub.

Since we want to build Linux images and then run a Hub on each one, we basically just need to paste installation instructions into the Dockerfile.

conda install -c conda-forge jupyterhub
conda install jupyterlab
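
As a minimal sketch of that idea (the base image and layout here are my own assumptions, not a tested recipe):

FROM continuumio/miniconda3

# conda ships with this base image; jupyterhub from conda-forge pulls in
# configurable-http-proxy as a dependency
RUN conda install -c conda-forge -y jupyterhub && \
    conda install -y jupyterlab

EXPOSE 8000
CMD ["jupyterhub"]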

Jupyter Docker Stacks

However, I then came across Jupyter Docker Stacks, which appear to be a number of recipes I can start with.

So let’s go ahead and try this on the server.

Example 2: This command pulls the jupyter/datascience-notebook image tagged 3772fffc4aa4 from Docker Hub if it is not already present on the local host. It then starts an ephemeral container running a Jupyter Notebook server and exposes the server on host port 10000. The command mounts the current working directory on the host as /home/jovyan/work in the container. The server logs appear in the terminal. Visiting http://<hostname>:10000/?token=<token> in a browser loads JupyterLab, where <hostname> is the name of the computer running docker and <token> is the secret token printed in the console. Docker destroys the container after notebook server exit, but any files written to ~/work in the container remain intact on the host.

docker run --rm -p 10000:8888 -e JUPYTER_ENABLE_LAB=yes -v "$PWD":/home/jovyan/work jupyter/datascience-notebook:3772fffc4aa4

My workstation was unable to find the image locally, so it began to pull it from Docker Hub. This took a moment; the image appears to be several GB.

Okay, well I managed to connect to it by visiting <IP>:10000/?token=XXXX...XXXX, where I grabbed the token from the output of the terminal window.

This is great functionality, but it’s not the hub we’re looking for. That said, the included dockerfiles are very instructive.

Here is the dependency chain; each image is the base of the next:

base-notebook > minimal-notebook > scipy-notebook > datascience-notebook
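
In Dockerfile terms, each stack simply starts FROM the previous image (illustrative, not the exact upstream header):

# datascience-notebook layers R and Julia on top of the scipy stack
FROM jupyter/scipy-notebook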

The last one, the [datascience-notebook](https://github.com/jupyter/docker-stacks/blob/master/datascience-notebook/Dockerfile), includes R and Julia on top of Python. The heavy HDF5 dependency for Julia is not included if the build is in test mode (see lines 10 and 73). It also links the kernels together. As tempting as it is to start here, I’d like to start earlier in the chain.

The scipy-notebook features installs of widgets and more, and is the minimum requirement for most of what we need.

Something I noticed is that it includes an old version of pip. If you launch the container with the -v flag above, whatever directory you were in when you executed docker run is mounted to a “fake” one inside the container (so pwd returns /home/jovyan/work). This is actually kind of nice. If students’ volumes can be mounted by the professor to have a look around, that would be great. That would, of course, require learning some Docker. Students can even install additional libraries as needed, but those will disappear the next time they connect (though the files stick around!).
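
DockerSpawner’s volumes option is one way to get that per-student mounting later; a sketch (the volume-naming scheme is my assumption):

# one named volume per user, mounted at the notebook working directory;
# DockerSpawner expands {username} for each spawned server
c.DockerSpawner.volumes = {'jupyterhub-user-{username}': '/home/jovyan/work'}

An instructor with Docker access on the host can then peek inside a volume, e.g. docker run --rm -v jupyterhub-user-alice:/data alpine ls /data.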

What we need to do, I think, is merge this Dockerfile with the JupyterHub deployments. I think this just comes down to configuring the authenticator correctly (and referencing Umberto’s file above).

Let’s look at the base-notebook, since it seems that scipy-notebook just adds bells and whistles (the widgets install already calls the JupyterLab extension manager). And yes, it appears Hub is installed!

# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.

# Ubuntu 18.04 (bionic) from 2018-05-26
# https://github.com/docker-library/official-images/commit/aac6a45b9eb2bffb8102353c350d341a410fb169
ARG BASE_CONTAINER=ubuntu:bionic-20180526@sha256:c8c275751219dadad8fa56b3ac41ca6cb22219ff117ca98fe82b42f24e1ba64e
FROM $BASE_CONTAINER

LABEL maintainer="Jupyter Project <jupyter@googlegroups.com>"
ARG NB_USER="jovyan"
ARG NB_UID="1000"
ARG NB_GID="100"

USER root

# Install all OS dependencies for notebook server that starts but lacks all
# features (e.g., download as all possible file formats)
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && apt-get -yq dist-upgrade \
 && apt-get install -yq --no-install-recommends \
    wget \
    bzip2 \
    ca-certificates \
    sudo \
    locales \
    fonts-liberation \
 && rm -rf /var/lib/apt/lists/*

RUN echo "en_US.UTF-8 UTF-8" > /etc/locale.gen && \
    locale-gen

# Configure environment
ENV CONDA_DIR=/opt/conda \
    SHELL=/bin/bash \
    NB_USER=$NB_USER \
    NB_UID=$NB_UID \
    NB_GID=$NB_GID \
    LC_ALL=en_US.UTF-8 \
    LANG=en_US.UTF-8 \
    LANGUAGE=en_US.UTF-8
ENV PATH=$CONDA_DIR/bin:$PATH \
    HOME=/home/$NB_USER

ADD fix-permissions /usr/local/bin/fix-permissions
# Create jovyan user with UID=1000 and in the 'users' group
# and make sure these dirs are writable by the `users` group.
RUN groupadd wheel -g 11 && \
    echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su && \
    useradd -m -s /bin/bash -N -u $NB_UID $NB_USER && \
    mkdir -p $CONDA_DIR && \
    chown $NB_USER:$NB_GID $CONDA_DIR && \
    chmod g+w /etc/passwd && \
    fix-permissions $HOME && \
    fix-permissions $CONDA_DIR

USER $NB_UID

# Setup work directory for backward-compatibility
RUN mkdir /home/$NB_USER/work && \
    fix-permissions /home/$NB_USER

# Install conda as jovyan and check the md5 sum provided on the download site
ENV MINICONDA_VERSION 4.5.11
RUN cd /tmp && \
    wget --quiet https://repo.continuum.io/miniconda/Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh && \
    echo "e1045ee415162f944b6aebfe560b8fee *Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh" | md5sum -c - && \
    /bin/bash Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh -f -b -p $CONDA_DIR && \
    rm Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh && \
    $CONDA_DIR/bin/conda config --system --prepend channels conda-forge && \
    $CONDA_DIR/bin/conda config --system --set auto_update_conda false && \
    $CONDA_DIR/bin/conda config --system --set show_channel_urls true && \
    $CONDA_DIR/bin/conda install --quiet --yes conda="${MINICONDA_VERSION%.*}.*" && \
    $CONDA_DIR/bin/conda update --all --quiet --yes && \
    conda clean -tipsy && \
    rm -rf /home/$NB_USER/.cache/yarn && \
    fix-permissions $CONDA_DIR && \
    fix-permissions /home/$NB_USER

# Install Tini
RUN conda install --quiet --yes 'tini=0.18.0' && \
    conda list tini | grep tini | tr -s ' ' | cut -d ' ' -f 1,2 >> $CONDA_DIR/conda-meta/pinned && \
    conda clean -tipsy && \
    fix-permissions $CONDA_DIR && \
    fix-permissions /home/$NB_USER

# Install Jupyter Notebook, Lab, and Hub
# Generate a notebook server config
# Cleanup temporary files
# Correct permissions
# Do all this in a single RUN command to avoid duplicating all of the
# files across image layers when the permissions change
RUN conda install --quiet --yes \
    'notebook=5.7.2' \
    'jupyterhub=0.9.4' \
    'jupyterlab=0.35.4' && \
    conda clean -tipsy && \
    jupyter labextension install @jupyterlab/hub-extension@^0.12.0 && \
    npm cache clean --force && \
    jupyter notebook --generate-config && \
    rm -rf $CONDA_DIR/share/jupyter/lab/staging && \
    rm -rf /home/$NB_USER/.cache/yarn && \
    fix-permissions $CONDA_DIR && \
    fix-permissions /home/$NB_USER

USER root

EXPOSE 8888
WORKDIR $HOME

# Configure container startup
ENTRYPOINT ["tini", "-g", "--"]
CMD ["start-notebook.sh"]

# Add local files as late as possible to avoid cache busting
COPY start.sh /usr/local/bin/
COPY start-notebook.sh /usr/local/bin/
COPY start-singleuser.sh /usr/local/bin/
COPY jupyter_notebook_config.py /etc/jupyter/
RUN fix-permissions /etc/jupyter/

# Switch back to jovyan to avoid accidental container runs as root
USER $NB_UID

It seems that the implementation at the Deploy-Docker repository actually handles multiple users, and you can choose ANY of the aforementioned builds from Jupyter’s page on Docker Cloud.

I think I will try this approach. The key difference here is that users are isolated through where docker mounts their directories, not through permissions.

So the idea is that you have a Linux machine running Docker. All the students’ files live in some directory there. The Hub spawns a Docker container on demand for each student, mounting their directory appropriately. A teacher can access the files since they are simply a user on the machine running Docker (anyone with sudo permissions on the UNIX machine? or just members of the correct user group? Joe can help there). When teachers log in, their accounts get mounted in a place where they can see the whole class, but not other instructors’ classes!

Virtualization can be used to segment classes, or they can run together; it doesn’t matter. Port-forwarding with sub-addresses can be used to direct users to the correct hub for login. One hub runs per class; one Docker instance manages all the hubs.
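
In Docker terms, the per-class layout might look like this (image names and ports are hypothetical; a reverse proxy would then map each class sub-domain to its port):

# one hub per class, each published on its own host port
docker run -d --name hub-class1 -p 8001:8000 myorg/classhub:class1
docker run -d --name hub-class2 -p 8002:8000 myorg/classhub:class2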

I’m going to try it this way first, rather than the isolated UNIX-machine way.


The “Correct Way”

We will be using dockerspawner, and the directions in their README.

Actually… it seems this is the whole point of Deploy-Docker: slight modifications are made so that dockerspawner launches our chosen base image!

Choose a pinned image on Docker Cloud based on the builds from the stacks on GitHub.

Since we really need all the stuff included in the scipy stack, and these images are no smaller than 2 GB, we might as well spring for the datascience stack, since it is about the same size and includes R and Julia, which our colleagues may appreciate.

As of this writing, here is a recent tag: 7254cdcfa22b

When running, be aware that no sudo privileges will be granted by default; -e GRANT_SUDO=yes needs to be included with docker run.
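
For example (per the docker-stacks docs, GRANT_SUDO only takes effect when the container is started as root):

docker run -d -p 8888:8888 --user root -e GRANT_SUDO=yes jupyter/datascience-notebook:7254cdcfa22b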

So, following the Deploy-Docker Dockerfile, I can see that it is running relatively recent versions of jupyterhub and dockerspawner, despite many files being unchanged for years. That’s actually kind of promising for maintenance purposes.

Starting from zero… SSH into my main machine and move into a directory to perform the cloning. I needed to first run

sudo apt install docker-compose

git clone https://github.com/jupyterhub/jupyterhub-deploy-docker.git
cd jupyterhub-deploy-docker

mkdir -p secrets
cd secrets
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout mykey.key -out mycert.pem

# bad idea?
mv mycert.pem jupyterhub.crt 
mv mykey.key jupyterhub.key

cd ..
# edit the environment file
vim .env

# change line 19 to 
DOCKER_NOTEBOOK_IMAGE=jupyter/datascience-notebook:7254cdcfa22b

In jupyterhub_config.py, add the line

c.DockerSpawner.environment = { 'JUPYTER_ENABLE_LAB': 'yes' }

Now we face the authentication problem. We did generate a self-signed cert (hopefully), but the IP address of my workstation is not public. For extra security, best to keep it this way.

We want to change how users authenticate… The problem is that I don’t want to be setting up GitHub user accounts. Lines 56-57 in jupyterhub_config.py are the problem:

c.JupyterHub.authenticator_class = 'oauthenticator.GitHubOAuthenticator'
c.GitHubOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']

I think that by commenting these out, PAMAuthenticator becomes the default. Should be good enough? But with this change we will have to create user accounts to log in with.
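
A sketch of that change (PAMAuthenticator really is JupyterHub’s default, so commenting the OAuth lines out should suffice; setting it explicitly just makes the intent clear):

# c.JupyterHub.authenticator_class = 'oauthenticator.GitHubOAuthenticator'
# c.GitHubOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']
c.JupyterHub.authenticator_class = 'jupyterhub.auth.PAMAuthenticator'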

Let’s create a userlist file in the root directory (vim userlist):

michael admin
troy admin
varis
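
For reference, the deploy-docker config consumes this file roughly as follows (paraphrased from memory; check the repository’s jupyterhub_config.py for the real thing):

import os

# first column is the username; an optional second column marks admins
c.Authenticator.whitelist = whitelist = set()
c.Authenticator.admin_users = admin = set()

pwd = os.path.dirname(__file__)
with open(os.path.join(pwd, 'userlist')) as f:
    for line in f:
        if not line.strip():
            continue
        parts = line.split()
        whitelist.add(parts[0])
        if len(parts) > 1 and parts[1] == 'admin':
            admin.add(parts[0])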

Now to actually create these users. Let’s see how this was done in hippylib-hub:

#!/bin/bash

echo "Update Lab materials from GitHub"
cd /home/fenics/Installations/labs && git pull

echo "Creating ${NUMBER_OF_USERS} new users..."
for ((i = 0; i < ${NUMBER_OF_USERS}; i++));
do
    password="Breckenridge${i}_g2s3"
    useradd "user${i}" -m -s /bin/bash
    echo "user${i}:${password}" | chpasswd user${i}
    
    cp -rf /home/fenics/.jupyter /home/user${i}/.jupyter
    chown -R user${i} /home/user${i}/.jupyter
    chmod -R u+rX /home/user${i}/.jupyter
    
    cp -r /home/fenics/Installations/labs/Labs /home/user${i}/
    chown -R user${i} /home/user${i}/Labs
    chmod -R u+rX /home/user${i}/Labs
 
    echo "Created user${i} with password ${password}"
done

So we have to translate this into our Dockerfile.jupyterhub file.

# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.
ARG JUPYTERHUB_VERSION
FROM jupyterhub/jupyterhub-onbuild:$JUPYTERHUB_VERSION

# Install dockerspawner, oauth, postgres
RUN /opt/conda/bin/conda install -yq psycopg2=2.7 && \
    /opt/conda/bin/conda clean -tipsy && \
    /opt/conda/bin/pip install --no-cache-dir \
        oauthenticator==0.8.* \
        dockerspawner==0.9.*

# We make additions here.
RUN useradd michael -m -s /bin/bash
RUN echo "michael:test_password" | chpasswd michael
RUN echo "Created user Michael"

# Copy TLS certificate and key
ENV SSL_CERT /srv/jupyterhub/secrets/jupyterhub.crt
ENV SSL_KEY /srv/jupyterhub/secrets/jupyterhub.key
COPY ./secrets/*.crt $SSL_CERT
COPY ./secrets/*.key $SSL_KEY
RUN chmod 700 /srv/jupyterhub/secrets && \
    chmod 600 /srv/jupyterhub/secrets/*

COPY ./userlist /srv/jupyterhub/userlist

Let’s try it?? The instructions say all I need now is to run make build in the root directory.

Failed. Commenting out lines referencing GitHub in Makefile (lines 23-24) and secrets/oauth.env (line 48).

I don’t have any clue what is going on in the secrets files. Doing my best, but unable to make it work.

Okay let’s try this.

docker run -p 8000:8000 -d --name jupyterhub jupyterhub/jupyterhub jupyterhub

My errors seem to be caused by an outdated Docker. Since I was running Ubuntu 16.04, the apt repositories didn’t have newer versions, so I needed to manually add Docker’s repository:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
apt-cache policy docker-ce
sudo apt-get install -y docker-ce

After setting ARG JUPYTERHUB_VERSION=0.9.0 in the header, I was finally able to run make build (but who knows how it will work).

There’s no particular reason the Hub has to run in a Docker container; it can still spawn new user servers with Docker.

Once make build finished, I had to run docker-compose up -d

docker ps revealed two containers. One is a database of some sort (the thing that was causing me trouble with installation earlier, before I faked the files).

The logs are showing me that my problem is indeed with POSTGRES_PASSWORD… and this part is certainly a bit above my pay grade.
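
If I’m reading the deploy-docker setup right, that database is the hub’s PostgreSQL backend, and the fix is probably to define the password in a secrets env file along these lines (the file name follows the repo’s pattern; the value is a placeholder):

# secrets/postgres.env
POSTGRES_PASSWORD=<some-long-random-string>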