Where Science Meets AI Innovation

I'm a machine learning engineer with deep academic research roots.
My passion lies in creating AI that's trustworthy and adaptable—
systems that learn from data and solve real problems.

The problem I solve: For over a decade, I’ve turned complex research questions into elegant software solutions. I’m looking for opportunities to build AI systems that are dependable, adaptable, and impactful.

My unique perspective combines rigorous scientific methodology with modern ML engineering practices—creating solutions that researchers and industry professionals can trust with their most important data.

My mission: Transform cutting-edge research into production-ready AI systems that researchers, organizations, and communities can actually use and trust.

Currently: Developing cloud-based apps and intelligent AI tools to accelerate the work of scientific research teams.


AI4citations: Combating scientific misinformation with AI

The challenge: Researchers spend countless hours manually verifying citations, while misinformation spreads through unchecked claims.

The solution: A citation verification model trained on shuffled examples from multiple claim verification datasets, rather than on a single dataset.

Impact & innovation:

  • pyvers package (based on PyTorch Lightning) automates preprocessing and training on claim verification datasets
  • 7% improvement in F1 score over state-of-the-art models through shuffled training methodology
  • Real-time web app with continuous feedback collection for model improvement
  • Production-ready deployment with CI/CD pipeline ensuring reliability with every update
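
To illustrate the shuffled training idea, here is a minimal sketch under stated assumptions. It is not the pyvers API: the function name `shuffled_batches`, the `(claim, evidence, label)` tuple layout, and the dataset names are all hypothetical. The sketch pools examples from several datasets and shuffles them once, so that any batch may mix sources instead of the model seeing one dataset after another.

```python
import random
from typing import Dict, Iterator, List, Tuple

# Hypothetical example layout: (claim, evidence, label)
Example = Tuple[str, str, str]


def shuffled_batches(
    datasets: Dict[str, List[Example]],
    batch_size: int,
    seed: int = 0,
) -> Iterator[List[Tuple[str, Example]]]:
    """Pool all datasets, shuffle once, and yield fixed-size batches.

    Each item keeps its source dataset name so per-dataset metrics
    can still be computed downstream.
    """
    pooled = [
        (name, example)
        for name, examples in datasets.items()
        for example in examples
    ]
    rng = random.Random(seed)  # seeded for reproducible ordering
    rng.shuffle(pooled)
    for start in range(0, len(pooled), batch_size):
        yield pooled[start:start + batch_size]


# Toy data with made-up dataset names, for illustration only
toy_datasets = {
    "dataset_a": [("claim 1", "evidence 1", "SUPPORT"),
                  ("claim 2", "evidence 2", "REFUTE")],
    "dataset_b": [("claim 3", "evidence 3", "NEI")],
}
toy_batches = list(shuffled_batches(toy_datasets, batch_size=2, seed=1))
```

In a real training loop, each batch would be tokenized and passed to the model. The sketch covers only the data ordering step.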

This project showcases my ability to take research from concept to production, demonstrating skills in model optimization, deployment, and building systems that improve over time.


R-help chat: Making knowledge accessible through conversational AI

The challenge: Decades of valuable programming discussions are buried in email archives that are difficult to search effectively.

My innovation: A RAG-powered chatbot that transforms static archives into interactive knowledge discovery.

Technical achievements:

  • Support for local models, improving privacy and reducing cost compared with OpenAI
  • 10% accuracy improvement through hybrid dense+sparse retrieval
  • LangGraph implementation with source citations for trustworthy responses
  • Multi-turn conversational interface for complex technical queries
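
As an illustration of hybrid dense+sparse retrieval, the sketch below merges two ranked lists of document IDs with reciprocal rank fusion (RRF). This is an assumption chosen for the example, since the fusion method the chatbot uses is not described here. The function name, the document IDs, and the constant `k = 60` (a commonly used default for RRF) are likewise illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]],
    k: int = 60,
) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) in every list it appears in;
    the scores are summed and documents are sorted by total score.
    Ties are broken alphabetically by ID for determinism.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense (embedding) and a sparse (keyword) retriever
dense_hits = ["a", "b", "c"]
sparse_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)  # ['b', 'a', 'd', 'c']
```

Documents found by both retrievers ("a" and "b") rise to the top, which is the intuition behind combining the two signals.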

This project demonstrates my expertise in modern NLP architectures, cost-effective AI deployment, and creating user experiences that unlock hidden value in existing data.


CHNOSZ: Building scientific infrastructure that lasts

The vision: Scientific software that researchers worldwide can depend on for years to come.

15+ years of impact:

  • Maintained on CRAN since 2009
  • 200+ citations from researchers worldwide
  • 90% test coverage for smoother development and backward compatibility
  • Active community supported through GitHub Discussions

Architecture for longevity:

  • Extensible API supporting third-party integrations (Shiny frontend, Python interface)
  • Comprehensive documentation ecosystem (help pages, examples, demos, vignettes)
  • Automated data consistency checks to catch common data entry errors

This isn’t just software—it’s infrastructure that enables scientific discovery. The longevity and reliability demonstrate my commitment to building systems that stand the test of time.


🚧 Projects In Development: Pushing AI Boundaries 🚧

Statistical AI Agents: Autonomous data analysis

Building AI agents that can independently perform statistical analysis and generate insights.

Docker Microservices for Science: Scalable computing architecture

Containerized scientific computing services for cloud-native research workflows.


💡 What Sets Me Apart

I don’t just implement algorithms—I solve meaningful problems:

🎯 Problem-first thinking: Academic training taught me to ask the right questions before building solutions.
🛠️ Production-ready mindset: 15+ years maintaining production software means I build for reliability, scalability, and long-term sustainability from day one.
🤝 Community builder: Successfully grew and maintained global research communities. I understand that great AI systems require great user experiences and ongoing support.
📊 Data storyteller: Authored 25+ peer-reviewed papers requiring clear communication of complex technical concepts to diverse audiences.
🔄 Continuous learner: From R packages to PyTorch models to LangChain applications—I adapt to new technologies while maintaining deep expertise.

🔧 Core Technical Skills

AI & machine learning: PyTorch • scikit-learn • NLP • Large Language Models • Fine-tuning • RAG Systems
MLOps & production: Docker • AWS • CI/CD • Testing • Monitoring • Model Deployment • Hugging Face
Data engineering: Python • SQL • R • Data Pipelines • Multi-source Integration • Quality Validation
Development: Git • Linux • Shell • Jupyter • API Design • Open Source Development

🎓 Academic Foundation Meets Industry Innovation

My academic background isn’t just about degrees—it’s about transferable skills that make me a stronger ML engineer:

🔬 Research methodology: Hypothesis formation, experimental design, and rigorous evaluation
📝 Technical communication: Translating complex concepts for diverse stakeholders
🏆 Project leadership: Managing long-term projects from conception to community adoption
🌍 Global collaboration: Working with international teams across time zones and cultures
⚡ Innovation under constraints: Creating solutions with limited resources and high quality standards


Let’s build AI that doesn’t just work today—but works reliably for years to come.