Recent Posts

Hypothesis-driven feature engineering

5 minute read

I’m starting a machine learning project to use microbial communities as biosensors for redox potential. Why is this important? Well, redox potential (written...

Sustainable data preprocessing with pipelines

11 minute read

Early in my modeling of the Titanic dataset (a kind of “Hello World” for machine learning), I was struck by the variety, or inconsistency, of data preproce...