Linear Runtimes For Quasimapping And Alignment Free

1 minute read

Published: October 07, 2024

I want to discuss linear runtimes and what they mean for alignment-free methods in bioinformatics, including sequence alignment and quasi-alignment. The first step is splitting the sequences, as they are read, into a De Bruijn graph. This graph consists of the k-mers, their neighborhoods, and the walks or paths through the graph that satisfy some optimality criterion (local maxima, for instance) for traversal and contig/walk/path maximization. Typically, a search through the De Bruijn structure may be breadth-first, finding suitable depths for traversing a path through the structure while optimizing for the construction of some sequence. This leads to read collapse along the sequence, in either direction along the one-dimensional sequence space.
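To make the splitting step concrete, here is a minimal sketch in Python. It assumes k-mers are the nodes and that an edge joins two k-mers whenever one directly follows the other in a read; the names `iter_kmers` and `build_de_bruijn` are illustrative only and do not come from any particular tool. Each read is scanned once, so for a fixed k the work grows linearly with the total number of bases.

```python
from collections import defaultdict


def iter_kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_de_bruijn(reads, k):
    """Map each k-mer to the set of k-mers that directly follow it in a read.

    Assumption: nodes are k-mers and edges join consecutive k-mers,
    which overlap by k - 1 bases.
    """
    graph = defaultdict(set)
    for read in reads:
        prev = None
        for kmer in iter_kmers(read, k):
            graph[kmer]  # touch the key so k-mers with no successor are still nodes
            if prev is not None:
                graph[prev].add(kmer)
            prev = kmer
    return dict(graph)


# Two overlapping toy reads collapse onto the same four nodes.
graph = build_de_bruijn(["ACGTAC", "CGTACG"], k=3)
```

Both toy reads contribute the same k-mers, which is a small picture of the read collapse described above.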

The linear runtime is observed because the profile is generated a priori and then passed to aggregation and calculation functions, which require only the count vector. That holds until matrices enter the picture, where linear-runtime operations must be distinguished from polynomial-runtime matrix operations. Linear-runtime operations are simple maps or transformations applied to the data of interest. Matrices transformed with linear mappings, such as when creating distance functions, tend to have linear or better bounds, possibly constant. Polynomial runtimes include regression, least squares, and other algorithms that require matrix multiplication.
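As a rough illustration of the count-vector idea, the sketch below builds a k-mer profile in one pass and then compares two profiles with a Manhattan distance. The function names and the choice of distance are assumptions made for this example. Counting is linear in the number of bases for a fixed k, and the distance is linear in the length of the count vector (4^k entries for a DNA alphabet).

```python
from itertools import product


def kmer_index(k, alphabet="ACGT"):
    """Assign each possible k-mer a fixed position in the count vector."""
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}


def count_profile(seqs, k):
    """Count k-mers across seqs in a single pass over every sequence."""
    index = kmer_index(k)
    counts = [0] * len(index)
    for seq in seqs:
        for i in range(len(seq) - k + 1):
            j = index.get(seq[i:i + k])
            if j is not None:  # skip k-mers with characters outside the alphabet
                counts[j] += 1
    return counts


def manhattan(u, v):
    """Sum of absolute differences between two equal-length count vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


profile_a = count_profile(["AAAA"], k=2)
profile_b = count_profile(["ACGT"], k=2)
distance = manhattan(profile_a, profile_b)
```

Once the profiles exist, the downstream functions never revisit the reads; they only touch the count vectors.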

The linear runtime arises because a single dimension is created at once from the inputs. It comes from the single De Bruijn graph mapping itself to some feature dimension during calculations, or from functions mapping the puzzle pieces to their associated locations in a transcriptomic or genomic FASTA feature space.
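One way to picture mapping the puzzle pieces to their locations is a toy k-mer lookup against a reference, sketched below. This is a simplification written for this post, not the algorithm of any particular quasi-mapping tool: the helper names, the dictionary-based index, and the voting rule are all assumptions. With a fixed k and short hit lists per k-mer, both indexing and lookup make a single pass over their input.

```python
from collections import Counter, defaultdict


def index_reference(refs, k):
    """Map each k-mer to the (sequence name, offset) pairs where it occurs.

    refs is a dict of name -> sequence, e.g. parsed from a FASTA file.
    """
    index = defaultdict(list)
    for name, seq in refs.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return index


def quasi_map(read, index, k):
    """Return the (name, start) supported by the most exact k-mer hits, or None."""
    votes = Counter()
    for i in range(len(read) - k + 1):
        for name, pos in index.get(read[i:i + k], ()):
            # A hit at reference offset pos implies the read starts at pos - i.
            votes[(name, pos - i)] += 1
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical two-transcript reference, used only for illustration.
refs = {"tx1": "ACGTTGCA", "tx2": "GGGCCCAT"}
index = index_reference(refs, k=3)
hit = quasi_map("GTTGC", index, k=3)
miss = quasi_map("AAAAA", index, k=3)
```

No base-by-base alignment is computed here; the read is placed purely by where its k-mers already sit in the reference.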


Thoughts on ML deployments, containerized workflows, and notebooks.

6 minute read

Published: February 06, 2025

This article hopes to bring the reader up to date (ca. 2017-2022) on modern cloud-native and scalable solutions for data science and natural science research application stacks, using the Docker container standard for container specification (vs. Singularity, Podman, or containerd containers, which are equally valid). First I will provide a brief description of the goal of Docker containers. Next I'll touch on the Kubernetes architecture for distributed data processing and application service management. Finally, I'll describe code repositories, container registries, and Markdown/Rmarkdown/LaTeX documentation as they pertain to a service's lifespan with respect to notebooks and the documentation of custom services and their orchestration.

Matt Ralston
