# Published Images and Packages

Within Cal-ITP, we publish several Python packages and Docker images that underpin other work. Changes to these packages and images are deployed via CI/CD processes that run automatically when new code is merged to the relevant Cal-ITP repository. CI/CD processes for Python packages run a test upload to PyPI when a pull request is opened or modified. CI/CD processes for images build the relevant image when a pull request is opened or modified, but do not push a new image tag to GHCR until changes are merged into `main`.
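
As a rough illustration only, the rules above can be restated as a small lookup function. This is a sketch of the behavior described on this page, not the actual workflow configuration; the names `Artifact`, `Event`, and `ci_steps`, and the step labels, are hypothetical.

```python
from enum import Enum


class Artifact(Enum):
    PYTHON_PACKAGE = "python-package"
    DOCKER_IMAGE = "docker-image"


class Event(Enum):
    PULL_REQUEST = "pull-request"    # a pull request was opened or modified
    MERGE_TO_MAIN = "merge-to-main"  # changes were merged into main


def ci_steps(artifact: Artifact, event: Event) -> list[str]:
    """Return the steps this page describes for an artifact type and event.

    The step labels are illustrative; they are not real job names.
    """
    if artifact is Artifact.PYTHON_PACKAGE:
        if event is Event.PULL_REQUEST:
            return ["test upload to PyPI"]
        return ["publish package"]
    # Images are built on every pull request but only pushed after a merge.
    if event is Event.PULL_REQUEST:
        return ["build image"]
    return ["build image", "push new image tag to GHCR"]
```

The point the sketch captures is that a pull request alone never produces a new image tag in GHCR.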

Some images and packages manage dependencies via traditional `requirements.txt` files, and some manage dependencies via [Poetry `pyproject.toml` files](https://python-poetry.org/docs/pyproject/). Please refer to the Poetry documentation when adding, updating, or removing `pyproject.toml` dependencies.

READMEs describing the testing and publication process for each image and package are linked in the table below. A detailed guide for updating the calitp-data-analysis package, written for an analyst audience, is available in the [analytics tools documentation](https://docs.calitp.org/data-infra/analytics_tools/python_libraries.html#updating-calitp-data-analysis).

| Name                    | Function                                                                                                                              | Source Code                                                                                      | README                                                                        | Publication URL                                            | Type           |
| ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------- | ---------------------------------------------------------- | -------------- |
| calitp-data-analysis    | Shared tools to ease common data analysis tasks within the Cal-ITP ecosystem                                                          | https://github.com/cal-itp/data-infra/tree/main/packages/calitp-data-analysis                    | https://github.com/cal-itp/data-infra/tree/main/packages/calitp-data-analysis | https://test.pypi.org/project/calitp-data-analysis         | Python Package |
| calitp-data-infra       | Shared imports and tools used by data infrastructure and data pipelines within the Cal-ITP ecosystem                                  | https://github.com/cal-itp/data-infra/tree/main/packages/calitp-data-infra                       | https://github.com/cal-itp/data-infra/tree/main/packages/calitp-data-infra    | https://test.pypi.org/project/calitp-data-infra            | Python Package |
| dask                    | Parallelization infrastructure used by JupyterHub                                                                                     | https://github.com/cal-itp/data-infra/tree/main/images/dask                                      | https://github.com/cal-itp/data-infra/tree/main/images/dask                   | https://ghcr.io/cal-itp/data-infra/dask                    | Docker Image   |
| gtfs-rt-archiver-v3     | Underpins our GTFS-RT archiver service, allowing us to save fast-moving GTFS-RT data                                                  | https://github.com/cal-itp/data-infra/tree/main/services/gtfs-rt-archiver-v3/gtfs_rt_archiver_v3 | https://github.com/cal-itp/data-infra/tree/main/services/gtfs-rt-archiver-v3  | https://ghcr.io/cal-itp/data-infra/gtfs-rt-archiver-v3     | Docker Image   |
| gtfs-schedule-validator | Wrapper for the MobilityData GTFS Schedule validator (so we can choose to use the correct version for the age of a given data import) | https://github.com/cal-itp/data-infra/tree/main/jobs/gtfs-schedule-validator                     | https://github.com/cal-itp/data-infra/tree/main/jobs                          | https://ghcr.io/cal-itp/data-infra/gtfs-schedule-validator | Docker Image   |
| jupyter-singleuser      | Shared, consistent tooling for individual local Jupyter notebook users and JupyterHub                                                 | https://github.com/cal-itp/data-infra/tree/main/images/jupyter-singleuser                        | https://github.com/cal-itp/data-infra/tree/main/images/jupyter-singleuser     | https://ghcr.io/cal-itp/data-infra/jupyter-singleuser      | Docker Image   |
