I am currently tuning and benchmarking an algorithm for processing k-mer spectra, released under GPLv3+. The project is described further under my Portfolio Showcase. It involves theoretical investigations, code profiling, report generation, and GitHub project management.
- CLI Performance Tuning
- K-mer Database (.kdb)
- Performance Benchmarking (Rmd)
- NGS Complexity Index
- Customized Arch Linux OS
In the past, I worked for Terry Papoutsakis, a metabolic engineer of the bacterium C. acetobutylicum, and for the biopharmaceutical company Bristol-Myers Squibb in two departments: Computational Genomics and Computer-Assisted Drug Design.
- Machine Learning
Organic, Physical, Analytical Chemistry:
- Fed-batch fermentation
- Plate/colony culture
- Flux-balance analysis
- Fluorescence-Activated Cell Sorting (FACS)
- Immunofluorescence / Immunohistochemistry
- Western blot
- Northern blot
- Next Generation Sequencing
- Organic, Physical, Analytical Chemistry
- SQLAlchemy/Rails/Express ORMs
Statistics / Visualization:
- R + KnitR + Shiny
- Data-Driven Documents (D3.js)
- HTML + CSS + JS
- Bash on RHEL/Debian/Arch Linux/OSX
Scripting / Full-stack:
- Python / Flask
- Ruby / Rails
- NodeJS / Express
- B.S. Biochemistry
- M.S. Bioinformatics
Bioinformatician who enjoys studying microbiological genomics and algorithms. I am not currently enrolled in a program, but I continue to study these disciplines. I look for disciplines that are synergistic with biological fundamentals (chemistry, computer science, and mathematics) to balance potential applied science areas with my quantitative reasoning.
Philosophically, I admire the spirit of the GNU General Public License, Creative Commons, Arduino, and open-science communities. Some day, I'd love to write some blog posts about citizen science with Arduino instruments.
A challenge in the biological sciences is system complexity and the number of system components available to study. For this reason, there has been an increase in the number of multiplexed measurement technologies, as well as an increase in the cost of instrumentation. Multiplexed assay formats like the microarray and Illumina sequencing provide a broad survey arm to detect changes worth investigating at the classical level. However, determining the sensitivity of such measurements or confronting the quantitative aspects of biology remains a challenge that is often addressed by software and statistical thresholding.
In many undergraduate biology programs, quantitative fundamentals are pushed aside in exchange for a broad applied science survey. For this reason, I elected to focus on classical biochemistry book work, applied molecular fundamentals in a cancer laboratory, and a master's degree in computer science with a focus on sequencing technology and multiplexed gene expression.
If there were one thing I'd change about my master's degree, it would be that I didn't build the library from scratch with cheap ligases and random hexamers. I elected instead to use an Illumina TruSeq sequencing kit, which produced an RNA library with good fragment length and excellent FASTQ quality scores. However, these kits should be used as a high-caliber baseline for judging the quality of alternative library preparation protocols that use cheaper reagents and custom barcoding strategies.
Open source fanboy, gamer, guitarist.
Feel free to contact me with questions, requests, or feedback.