Skip to main content

Custom configurations and overrides

Custom settings​

Terraform state​

Terraform manages the state of all the deployed resources via backends. Terraform requires storing the state in order to keep track of the names, ids, and states of deployed resources. See terraform remote state for more information.

If you are doing anything other than testing we highly recommend remote unless you know what you are doing. As any unexpected changes to the state can cause issues with the deployment.

For a existing provider that deploys to a kubernetes cluster, the kubernetes remote backend is also used.

terraform_state:
type: remote

Default Images​

Nebari uses Docker images to provide containers for JupyterHub, JupyterLab interface, and the Dask worker user environments.

Default images are the image run by default if not explicitly specified in a profile (described in the next section).

The jupyterhub key controls the JupyterHub image run.

These control the docker image used to run JupyterHub, the default JupyterLab image, and the default Dask worker image respectively.

### Default images ###
default_images:
jupyterhub: "quay.io/nebari/nebari-jupyterhub:v2022.10.1"
jupyterlab: "quay.io/nebari/nebari-jupyterlab:v2022.10.1"
dask_worker: "quay.io/nebari/nebari-dask-worker:v2022.10.1"

Storage​

The Storage section is used to control the amount of storage allocated to the shared filesystem.

### Storage ###
storage:
conda_store: 200Gi
shared_filesystem: 200Gi
ðŸ”Ĩwarning

For most providers, when the storage size is changed, it will automatically delete the previous storage place.

Changing the storage size on an AWS deployment after the initial deployment can be especially tricky so it might be worthwhile padding these storage sizes.

Overrides​

Overrides allows you to override the default configuration for a given resource on Nebari without having to directly modify the infrastructure components.

Below we show the available resources that can be overridden in the configuration.

JupyterHub​

JupyterHub uses the zero to jupyterhub helm chart. This chart has many options that are not configured in the Nebari default installation. You can override specific values in the values.yaml. jupyterhub.overrides is optional.

jupyterhub:
overrides:
cull:
users: true

JupyterLab​

Nebari supports a number of configuration options for JupyterLab:

  • jupyterlab.idle_culler - This is used to configure the idle culler for JupyterLab. See idle culling for more information.
jupyterlab:
idle_culler:
kernel_cull_idle_timeout: 30
  • jupyterlab.initial_repositories - Auto-deploys specified git repositories by cloning them into user directories upon JupyterLab instance initialization. Accepts a list of name: url pairs, with each name becoming the folder name in the user's home directory. Once the repository is cloned in the user space, it will not be updated. In order to update to the latest repository, users must delete the folder and restart the server.
jupyterlab:
initial_repositories:
- examples/nebari-demo: https://github.com/nebari-dev/nebari-demo.git
â„đïļNote

Path location key should not start or end with trailing slash. You can configure JupyterLab to open in a location within the cloned repository by setting preferred_dir option within the jupyterlab group.

See also jupyterlab.gallery_settings (documented below) which defers the cloning of repositories until user requests it and provides a rich presentation layer.

ðŸ”Ĩwarning

While you could embed an access token in the URL to fetch from a private repository, please beware that this token can be accessed by each user - you should only use tightly scoped personal access tokens which you are comfortable to share with each of your users.

  • jupyterlab.default_settings - Enables overriding the default JupyterLab and JupyterLab extensions settings. Users will still be able to adjust the settings in the JupyterLab Setting Editor. The keys should be names of the Jupyter plugins with values defining mapping between the plugin setting and new default.
jupyterlab:
default_settings:
"@jupyterlab/apputils-extension:themes":
theme: JupyterLab Dark
  • jupyterlab.preferred_dir - Sets the default location in which JupyterLab should open the file browser in.

  • jupyterlab.gallery_settings - Configures jupyterlab-gallery extension which enables user to clone (and later synchronise) pre-specified repositories.

jupyterlab:
gallery_settings:
title: Example repositories
destination: examples
exhibits:
- title: Nebari
git: https://github.com/nebari-dev/nebari.git
homepage: https://github.com/nebari-dev/nebari
description: ðŸŠī Nebari - your open source data science platform
- title: PyTorch Tutorial
git: https://github.com/your-org/your-repository.git
account: name_of_dedicated_account
token: YOUR_PERSONAL_ACCESS_TOKEN
icon: https://your.domain/icon.svg
ðŸ”Ĩwarning

While private repositories can be cloned by passing account and token, the access token can be accessed by each user - you should only use tightly scoped personal access tokens which you are comfortable to share with each of your users.

Terraform​

The Nebari configuration file provides a huge number of configuration options for customizing your Nebari infrastructure. While these options are sufficient for an average user, they aren't exhaustive by any means. There are still a plenty of things you might want to achieve which cannot be configured directly by the above mentioned options. Therefore, we've introduced a new option called terraform overrides (terraform_overrides), which lets you override the values of terraform variables in specific modules/resource. This is a relatively advanced feature and must be used with utmost care, and you should really know what you're doing.

Here we describe the overrides supported via Nebari config file:

Ingress​

You can configure the IP of the load balancer and add annotations for the same via ingress's terraform overrides, one such example for GCP is:

ingress:
terraform_overrides:
load-balancer-annotations:
"networking.gke.io/load-balancer-type": "Internal"
"networking.gke.io/internal-load-balancer-subnet": "pre-existing-subnet"
load-balancer-ip: "1.2.3.4"

This is quite useful for pinning the IP Address of the load balancer.

Deployment inside a Virtual Private Network​

You can deploy your cluster into a Virtual Private Network (VPN) or Virtual Network (VNET).

An example configuration for Azure is given below:

azure:
region: Central US
...
vnet_subnet_id: '/subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.Network/virtualNetworks/<vnet-name>/subnets/<subnet-name>'

If you want the AKS cluster to be private cluster.

For extra security, you can deploy your cluster from an Azure Bastion host (or jump host), making the Kubernetes API only accessible from this one secure machine. You will likely need to also modify the network_profile as follows:

azure:
region: Central US
private_cluster_enabled: true
vnet_subnet_id: '/subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.Network/virtualNetworks/<vnet-name>/subnets/<subnet-name>'
network_profile:
service_cidr: "10.0.2.0/24" # how many IPs would you like to reserve for Nebari
network_plugin: "azure"
network_policy: "azure"
dns_service_ip: "10.0.2.10" # must be within the `service_cidr` range from above
docker_bridge_cidr: "172.17.0.1/16" # no real need to change this

Deployment inside a virtual network is slightly different from deploying inside a public network. As the name suggests, since it's a virtual private network, you need to be inside the network to able to deploy and access Nebari. One way to achieve this is by creating a Virtual Machine (VM) inside the virtual network. Select the virtual network and subnet name under the networking settings of your cloud provider while creating the VM and then follow the usual deployment instructions as you would deploy from your local machine.

Conda store worker​

You can use the following settings to change the defaults settings (shown) used for Conda store workers.

conda_store:
max_workers: 50
worker_resources:
requests:
cpu: 1
memory: 4Gi
â„đïļNote

Current conda_store.worker_resources defaults are set at the minimum recommended resources for conda-store-workers - (conda-store docs)