Published Images and Packages
Within Cal-ITP, we publish several Python packages and Docker images that underpin other work. Changes to these packages and images are deployed via CI/CD processes that run automatically when new code is merged to the relevant Cal-ITP repository. CI/CD processes for Python packages run a test upload to PyPI when a pull request is opened or modified. CI/CD processes for images build the relevant image when a pull request is opened or modified, but do not push a new image tag to GHCR until the changes are merged into main.
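The rules in the paragraph above can be summarized as a small decision function. This is a minimal sketch for illustration only, not the actual workflow configuration: the names `Artifact`, `Event`, and `ci_steps` are hypothetical, and the step labels are paraphrases of the text. It also assumes that an image is rebuilt before its tag is pushed on merge.

```python
from enum import Enum


class Artifact(Enum):
    PACKAGE = "Python Package"
    IMAGE = "Docker Image"


class Event(Enum):
    PULL_REQUEST = "pull request opened or modified"
    MERGE_TO_MAIN = "changes merged into main"


def ci_steps(artifact: Artifact, event: Event) -> list[str]:
    """Return the CI/CD steps described above for an artifact type and event.

    Hypothetical helper: the step labels are descriptive strings, not real job names.
    """
    if artifact is Artifact.PACKAGE:
        if event is Event.PULL_REQUEST:
            return ["run test upload to PyPI"]
        return ["publish package"]
    # Images are built on pull requests but only pushed to GHCR after merge.
    if event is Event.PULL_REQUEST:
        return ["build image"]
    # Assumption: the merge run builds the image again before pushing the tag.
    return ["build image", "push image tag to GHCR"]


if __name__ == "__main__":
    for artifact in Artifact:
        for event in Event:
            print(f"{artifact.value} / {event.value}: {ci_steps(artifact, event)}")
```

The key point the sketch captures is that pull requests never produce a new GHCR image tag; only a merge into main does.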
Images and packages manage dependencies via pyproject.toml files. The jupyter-singleuser image uses uv for dependency management; analysts’ data science dependencies are managed in the data-analyses repo as a uv workspace and installed at runtime via uv sync.
READMEs describing the testing and publication process for each image and package are linked in the table below. A detailed guide for updating the calitp-data-analysis package is available here, written for an analyst audience.
| Name | Function | Source Code | README | Publication URL | Type |
|---|---|---|---|---|---|
| calitp-data-analysis | Shared tools to ease common data analysis tasks within the Cal-ITP ecosystem. Now lives in data-analyses as a uv workspace member. | | | | Python Package |
| calitp-data-infra | Shared imports and tools used by data infrastructure and data pipelines within the Cal-ITP ecosystem | | | | Python Package |
| dask | Parallelization infrastructure used by JupyterHub | | | | Docker Image |
| gtfs-rt-archiver-v3 | Underpins our GTFS-RT archiver service, allowing us to save fast-moving GTFS-RT data | | | | Docker Image |
| gtfs-schedule-validator | Wrapper for the MobilityData GTFS Schedule validator (so we can choose to use the correct version for the age of a given data import) | | | | Docker Image |
| jupyter-singleuser | Shared, consistent tooling for individual local Jupyter notebook users and JupyterHub | | | | Docker Image |