Thoughts on ML deployments, containerized workflows, and notebooks.
Published:
This article aims to bring the reader up to date (ca. 2017-2022) on modern cloud-native, scalable solutions for data science and natural science research application stacks. It uses Docker as the container specification standard, although Singularity, Podman, and containerd containers are equally valid alternatives. First, I will briefly describe the goal of Docker containers. Next, I’ll touch on the Kubernetes architecture for distributed data processing and application service management. Finally, I’ll describe how code repositories, container registries, and Markdown/R Markdown/LaTeX documentation pertain to a service’s lifespan, with respect to notebooks and the documentation of custom services and their orchestration.