
Jeremy Leipzig, PhD
Bioinformatics engineer & Technical PM. Reproducible research, genomics, pipelines, and metadata. Architect and product leader for early-stage therapeutic, diagnostic, and SaaS startups. O'Reilly author & PhD.
About Me
My career has centered on writing software to visualize, explore, and manipulate biological data. I have worked as a bioinformatics software engineer and product manager in academia, in industry, and at diagnostic, therapeutic, and platform startups. In my product development roles I have helped software companies understand and navigate the bioinformatics space. I have over 40 peer-reviewed publications and an O’Reilly book. My university training was in biology and computer science, with an emphasis on bioinformatics and statistical genetics, followed by a PhD in information science with a dissertation on reproducible research.
With over 25 years of experience spanning academia and industry, I bring a unique perspective that combines deep technical expertise with product management and business strategy. My work encompasses everything from fundamental algorithm development to scaling production systems for enterprise clients.
Career Highlights
- 25+ years of experience spanning bioinformatics research, product management, and software development across academia and industry
- Led four diagnostic, therapeutic, and SaaS startups in defining their bioinformatics product strategy
- BS in Biology, MS in Computer Science, PhD in Information Science
- 40+ peer-reviewed publications (4 first authorships). h-index of 30.
- Expert in developing cloud-based pipelines for genomics, transcriptomics, and clinical applications
- Founder of PhillyR user group and top-20 contributor to biostars.org bioinformatics Q&A community
- Proven track record in scaling bioinformatics workflows from research prototypes to production systems
- Author of O'Reilly book 'Data Mashups in R' and multiple bioinformatics software tools
Experience
TileDB
Product Manager
Manage the population genomics product line, including product development, sales demos, customer support, and marketing.
- Develop analysis workflows for biopharma and hospital partners
- Lead product strategy for population genomics solutions
- Drive customer acquisition and technical sales processes
- Presented TileDB solutions at industry conferences and technical talks
Truwl
Content Lead
Led onboarding of tools, workflows, and high-impact analyses into the Truwl platform.
- Developed benchmarking product and customer acquisition strategies
- Curated and validated bioinformatics workflows for the platform
- Established quality standards for computational reproducibility
- Presented platform capabilities and workflow demonstrations
Panorama Medicine
Bioinformatics Engineer
Developed cloud-based pipelines and analysis for drug repositioning efforts. First employee.
- Built scalable cloud infrastructure for drug discovery analytics
- Led data mining and competitive intelligence research initiatives
- Designed automated workflows for pharmacological data analysis
CytoVas LLC
Senior Bioinformatics Scientist
Scaled up flow cytometry workflows for lab developed tests.
- Developed novel statistical analyses for cell subpopulation measurement
- Analyzed extracellular vesicles in clinical trials and experimental assays
- Implemented quality control systems for diagnostic applications
Children's Hospital of Philadelphia (CHOP)
Senior Data Integration Analyst & GRIN Informatics Lead
Led bioinformatics core operations and developed tools for genomic variant analysis.
- Developed tools for mitochondrial and exome variant analysis
- Created ChIP-Seq and RNA-Seq reproducible reporting systems
- Built GRIN epilepsy analysis portal and Jupyter-based variant discovery platform
- Developed myBiC portal for bioinformatics report deliverables
- Led CHOP team in pediatric genomics consortium data management
DuPont Crop Genetics
Senior Research Associate
Developed bioinformatics tools for agricultural genomics and high-throughput screening.
- Built transcriptome assembly analysis and miRNA target scanning tools
- Developed LIMS systems for high throughput mutagenesis screens
- Implemented gene annotation pipelines for crop improvement programs
University of Pennsylvania - Bushman Lab
Bioinformatics Programmer
Developed bioinformatics pipeline for HIV integration site analysis.
- Created annotation and statistical analysis tools for HIV integration sites
- Analyzed microbial diversity and viral resistance mutations
- Published groundbreaking research on retroviral DNA integration patterns
NC State University - Dept. of Electrical Engineering
Web Developer
Developed various applications to manage student, employee, and equipment records.
- Built database management systems for academic records
- Created web interfaces for equipment tracking
- Implemented student and employee management applications
The Trout Group
Consultant
Developed scientific presentations used in investor road shows for biotechnology clients.
- Created compelling scientific narratives for biotech investor presentations
- Collaborated with Enchira and other biotechnology companies
- Translated complex scientific concepts for investment audiences
UNC School of Medicine - Duncan Lab
Research Technician
Responsible for all techniques involved in 2-deoxyglucose autoradiography studies of ketamine-induced psychotomimetic effects in rodents.
- Conducted neuropharmacology research on ketamine effects
- Performed 2-deoxyglucose autoradiography studies
- Analyzed psychotomimetic effects in rodent models
Selected Publications
Hierarchy‐guided neural network for species classification
Methods in Ecology and Evolution, 2021
Biodiversity Image Quality Metadata Augments Convolutional Neural Network Classification of Fish Species†
Research Conference on Metadata and Semantics Research, 2020
Computational Pipelines and Workflows in Bioinformatics
Reference Module in Life Sciences, Elsevier, 2018
Predicting the Pathogenicity of Novel Variants in Mitochondrial tRNA with MitoTIP
PLoS Computational Biology, 2017
Elevated frequency of damaging mt tRNA mutations in children with autism spectrum disorders
PLOS ONE, 2017
Phy-Mer: a novel alignment-free and reference-independent mitochondrial haplogroup classifier
Bioinformatics, 2014
The Mitochondrial Disease Sequence Data Resource (MSeqDR): a global grass-roots effort to promote sharing of mitochondrial DNA sequencing data
Mitochondrion, 2014
Increased frequency of de novo copy number variants in congenital heart disease by integrative analysis of single nucleotide polymorphism array and exome sequence data
Circulation Research, 2014
Integrated systems approach identifies genetic nodes and networks in late-onset Alzheimer's disease
Cell, 2013
MITOMAP and MITOMASTER: using the MITOMAP database to complement analysis of a novel mitochondrial DNA phenotype
Current Protocols in Bioinformatics, 2013
IRF1 and miR-146a-5p inhibit glioblastoma cell growth through IGF-1R downregulation
Oncogene, 2011
High-resolution human core-exome sequencing reveals a reduced penetrance of CFH and CFHR5 mutations in familial macular degeneration
Human Genetics, 2008
HIV integration site selection: analysis by massively parallel pyrosequencing reveals association with epigenetic modifications
Genome Research, 2007
A genome-wide association study of HIV drug resistance
AIDS Research and Human Retroviruses, 2007
Selection of target sites for mobile DNA integration in the human genome
PLoS Computational Biology, 2006
Genome-wide analysis of chromosomal features repressing human immunodeficiency virus transcription
Journal of Virology, 2006
Host cell factors in HIV replication: meta-analysis of genome-wide studies
Nature Reviews Microbiology, 2005
Integration targeting by avian sarcoma-leukosis virus and human immunodeficiency virus in the chicken genome
Journal of Virology, 2005
The Alternative Splicing Gallery (ASG): bridging the gap between genome and transcriptome
Nucleic Acids Research, 2004
Effects of chronic administration of selected atypical antipsychotics on monoamine levels in rat striatum
Neuropharmacology, 2000
Differential effects of clozapine and haloperidol on ketamine-induced brain metabolic activation
Brain Research, 1999
Repeated administration of haloperidol, risperidone, or olanzapine to rats does not produce the pattern of metabolic changes between brain regions found in subjects with schizophrenia
Neuropsychopharmacology, 1998
* ISI Highly Cited
† Best Research Paper: 14th International Conference on Metadata and Semantics Research
Education
Drexel University
PhD in Information Science
Dissertation on reproducible research and metadata in bioinformatics
North Carolina State University
Master of Computer Science
Focus on statistical genetics and alternative splicing
Wake Forest University
Bachelor of Science in Biology
Research experience in neurobiology
Research Interests
Reproducible Research & Metadata
Developing frameworks and tools to ensure computational reproducibility in biological research, with focus on metadata standards and pipeline documentation
Genomic Variant Analysis
Creating tools for mitochondrial genetics, exome analysis, and clinical genomics applications with emphasis on rare disease diagnostics
Cloud-Scale Bioinformatics
Building scalable workflows and platforms for population genomics, drug discovery, and precision medicine using AWS and GCP infrastructure
Bioinformatics Product Strategy
Translating research methodologies into commercial bioinformatics products, from startup strategy to enterprise solutions
Technical Skills
Programming Languages
Workflow Systems
Cloud Platforms
Bioinformatics Tools
Data Analysis
Web & Database
Notable Software & Tools
MITOMASTER
Web application that allows clinicians to quickly investigate mitochondrial mutations in sequenced or genotyped samples. Used worldwide for mitochondrial genetics research.
myBiC
Django application that manages user authentication and presentation of bioinformatics deliverables. Streamlines report delivery for core facilities.
InSiPiD
Integration Site Pipeline and Database - comprehensive toolset for managing viral integration site data processing, annotation, and analysis.
placenta
Comprehensive analysis pipeline and resources for placental genomics research and developmental studies.
opensnp
Validation of published GWAS studies using OpenSNP volunteered data. Binderized for reproducible analysis.
m6a
Analysis tools and workflows for N6-methyladenosine (m6A) RNA modification detection and quantification.
metadata-in-rcr
Resources and examples for metadata standards in reproducible computational research. Supporting materials for academic publications.
fcs_flow_cytometry
Core flow cytometry data analysis tools and utilities for FCS file processing.
fcs_utils
Utility functions and helper tools for flow cytometry data manipulation and quality control.
mcmanus_ant1
Analysis pipeline for mitochondrial ANT1 gene studies and associated metabolic pathways.
asciiruler
Command-line tool for generating ASCII rulers and measurement guides for sequence analysis.
standard-velvet-assembly-report
Standardized reporting pipeline for Velvet genome assembly results and quality metrics.
awesome-reproducible-research
A curated list of reproducible research case studies, projects, tutorials, and media. Community-driven resource with 356+ stars.
SandwichesWithSnakemake
Beginner's tutorial to Snakemake workflow management system. Popular educational resource with 70+ stars.
snakemake-example
RNA-Seq Snakemake example with Jekyll homepage creation. Complete workflow demonstration with 20+ stars.
berrylogo
A better seqLogo implementation for creating sequence logos in R. Enhanced visualization tool with 12+ stars.
ancestryinformativemarkers
Public Ancestry Informative Markers (AIMs) dataset and tools for population genetics analysis.
blast-wrapper
Node.js web-service wrapper for fetching pairwise BLAST alignments against a fixed reference.
ontopop
Analysis tool to measure popularity and usage of biological ontologies across scientific literature.