Data engineer in food manufacturing. I build pipelines, compliance dashboards, and local AI tools for factories where data can't leave the building.
Things I built that are running in production or deployed publicly.
On-premises AI assistant for food manufacturing. Ask your database questions in English — the LLM generates SQL, executes it, and explains the result. Upload SOPs and search them with RAG. Runs entirely on Ollama, no data leaves the network.
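The question-to-SQL loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `generate_sql` is a hypothetical stand-in for the local model call and returns a canned query, the `batches` table is invented for the example, and SQLite stands in for whatever database the factory runs.

```python
import sqlite3


def generate_sql(question: str) -> str:
    """Hypothetical stand-in for the LLM call; returns a canned query so the sketch runs offline."""
    return (
        "SELECT product, SUM(quantity) AS total "
        "FROM batches GROUP BY product ORDER BY product"
    )


def is_read_only(sql: str) -> bool:
    """Basic guard: accept only a single statement that starts with SELECT."""
    stripped = sql.strip().rstrip(";")
    return ";" not in stripped and stripped.lower().startswith("select")


def answer(conn: sqlite3.Connection, question: str):
    """Generate SQL for the question, refuse anything that is not a plain SELECT, then run it."""
    sql = generate_sql(question)
    if not is_read_only(sql):
        raise ValueError(f"refusing to run non-SELECT statement: {sql!r}")
    rows = conn.execute(sql).fetchall()
    return sql, rows


# Invented demo data (assumption, not from the project).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE batches (product TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO batches VALUES (?, ?)",
    [("bread", 120), ("bread", 80), ("scones", 40)],
)

sql, rows = answer(conn, "How much of each product did we make?")
print(sql)
print(rows)
```

The string check is only a first line of defence; connecting with a read-only database user is the stronger safeguard for model-generated SQL.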
BRC/HACCP food safety compliance. Trace any batch from raw material to customer despatch in seconds. Temperature anomaly detection with rolling z-scores. Allergen matrix. One-click PDF audit reports. Built by someone on the factory floor.
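Rolling z-score anomaly detection can be sketched in a few lines of standard-library Python. The window size, threshold, and sample readings below are assumptions chosen for illustration, not the values used in the project.

```python
from collections import deque
from statistics import mean, stdev


def rolling_zscore_anomalies(readings, window=5, threshold=3.0):
    """Flag readings whose z-score against the trailing window exceeds the threshold.

    Returns a list of (index, value, z) tuples.
    """
    history = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(readings):
        # Only score once a full window of earlier readings is available.
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            if sigma > 0:
                z = (value - mu) / sigma
                if abs(z) > threshold:
                    anomalies.append((i, value, round(z, 2)))
        history.append(value)
    return anomalies


# Illustrative chiller readings in degrees Celsius (made-up data).
temps_c = [3.1, 3.0, 3.2, 2.9, 3.1, 3.0, 8.5, 3.1, 3.0]
flagged = rolling_zscore_anomalies(temps_c)
print(flagged)
```

In this sketch a flagged reading still enters the window, so it inflates the standard deviation for the next few points; excluding flagged values from the history is one possible refinement.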
End-to-end data pipeline. Ingests live crime data from the Police UK API, transforms it with dbt (staging/marts, 53 tests), orchestrates with Prefect, and serves a public Streamlit dashboard. Three GitHub Actions workflows: CI, weekly auto-ingest, and daily health checks.
GitHub Action that auto-reviews .sql files in pull requests using local AI. Catches injection risks, performance anti-patterns, style violations. Posts structured review comments with fix suggestions. One YAML file to set up.
Health Q&A platform for factory workers. NHS-verified guidance, Gemini AI responses, voice input, 18 languages. Flask + PostgreSQL, Dockerised.
ML analysis of UK A-Level attainment gaps across ethnicity, gender, and deprivation. Feature importance with XGBoost.
I fix bugs and ship features in the tools I depend on. 11 projects, 200K+ combined stars.
| Project | Stars | Contribution |
|---|---|---|
| vllm | 75K | Improved DCP/PCP error messages with actionable backend guidance |
| Apache Superset | 65K | Renamed `supersetCanCSV` → `supersetCanDownload` across frontend |
| pandas | 45K | Clarified `str.cat()` return type docs for `Index` |
| ChromaDB | 18K | Version compat check + 220-line HNSW tuning guide |
| Plotly | 17K | Dependabot config for `uv.lock` |
| dbt-core | 10K | Removed unnecessary profiler context manager arg |
| dlt | 7K | Migrated flake8 config to ruff |
| ollama-python | 5K | Added `exists()` method + fixed `ShowResponse` `ValidationError` |
| drt | | Snowflake connector (290 lines, tests) + Dockerfile + pre-commit hooks (merged) |
| fpdf2 | 1K | Fixed `TextRegion.ln()` double line break |
Tools I use daily.