Moonlabs Academy · learn vision AI

Vision AI. The part that reads documents.

Moonlabs is the operator-led AI Academy in Derby. We run three live companies — Homemove, home.co.uk and homedata.co.uk — and we teach twelve students per cohort to ship a real AI product, sell it to a real customer, and raise on it. Three pillars: Coding, Commercials, Investment. Twelve weeks. £6,000.

Moonlabs is what we are. Two operators — James Freestone and Louis O’Connell-Bristow — who run Homemove, home.co.uk and homedata.co.uk. Property runs on documents and photographs — floorplans, EPCs, surveys, photo listings — so vision AI is a production surface for us, not a side experiment. The structured-extraction patterns on this page are what we tune against real listings every week.

The Academy is what we do. A twelve-week, in-person, twelve-student cohort in Derby. You build a real AI product. You sign a paid pilot on it. You write a deck and a financial model. You leave with a deployed system, a paying customer reference and a live investor pipeline. Coding, Commercials, Investment — the three pillars taught in equal weight every week.

Why this page exists. Vision AI in 2026 is genuinely useful in a way it was not in 2024. Claude with vision, GPT with images, Gemini multimodal and the open-weight visual stacks can read documents, reason over screenshots, extract structured data from photographs, and watch a workflow being demonstrated and replicate it. The bar moved more in twelve months than it did in the previous decade, and almost no course teaches it at the level production teams actually deploy. You leave the Academy with a visual system in production, or as the founder of a vertical document-AI product whose moat is the messy edge cases.

Coding · vision AI at production fluency

Document understanding pipelines for messy real-world inputs (receipts, forms, contracts, technical drawings, screenshots). Structured extraction with JSON-schema validators. Multi-image agent flows. Hybrid LLM + classical OCR where each earns its place. Ground-truth evals and regression suites that survive a model release. A deployed vision-AI system by week twelve.
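As a rough illustration of "structured extraction with JSON-schema validators", the sketch below checks a model's JSON output against a small hand-written schema before anything downstream trusts it. It is a minimal standard-library stand-in for a full JSON Schema validator, and the record shape (an EPC-style `address`, `rating`, `floor_area_m2`) and the function name `validate_extraction` are assumptions made for the example, not the Academy's actual schema.

```python
import json
from typing import Any, Optional

# Hypothetical schema for an EPC-style record; field names are illustrative only.
EPC_SCHEMA = {
    "address": str,
    "rating": str,
    "floor_area_m2": (int, float),
}
VALID_RATINGS = set("ABCDEFG")


def validate_extraction(raw: str) -> tuple[Optional[dict[str, Any]], list[str]]:
    """Parse model output and return (record, errors); record is None on failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return None, ["top-level value must be an object"]

    errors: list[str] = []
    for field, expected in EPC_SCHEMA.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(data[field], bool) or not isinstance(data[field], expected):
            errors.append(f"wrong type for {field}")

    rating = data.get("rating")
    if isinstance(rating, str) and rating not in VALID_RATINGS:
        errors.append("rating must be one of A-G")

    return (data if not errors else None), errors
```

Returning the error list rather than raising makes it straightforward to feed the failures back to the model for a retry, or to log them as cases for a regression suite.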

Commercials · selling document-AI into back offices

Insurance claims, conveyancing, accounts payable, supplier invoices, building-survey reports — every back office in the country has a document pile a vision pipeline could halve. Pricing per document processed or per seat replaced, the discovery call, a one-page pilot agreement. A paid pilot by week six — one of the most concrete-ROI AI offers an SMB will sign.

Investment · raising on vision/document-AI

Nanonets, Hyperscience ($500m+), Rossum, Klippa, Eigen Technologies (UK, $25m), Reducto raising on document-AI, Reka multimodal, Snowflake’s acquisition of Modulus, Pix4D, Tractable ($1bn on insurance-claim vision) — document and vision-AI is one of the most heavily funded vertical-AI categories in venture. Cap table, ten-slide deck, financial model. A live investor pipeline by demo day.

FAQ

Common questions.

Do I need ML / computer-vision experience?

No. Almost no useful vision AI work in 2026 trains models from scratch; the work is engineering on top of frontier multimodal APIs. A classical computer-vision background is a nice-to-have, not a requirement.

What providers do you cover?

Anthropic (Claude vision), OpenAI (GPT image), Google (Gemini), and open-weight options where they earn their place. The course is deliberately provider-agnostic.

Will I learn classical OCR or pure-LLM document AI?

Both, where they earn their place. Classical OCR is still the right answer for high-volume, narrow-document workloads; LLM-based document understanding wins on edge-case-heavy inputs. We teach the judgement, not the dogma.
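One way to make that judgement explicit is a small routing rule. The sketch below sends high-volume, narrow-template batches with clean OCR to a classical pipeline and everything else to an LLM vision pipeline. The `DocumentBatch` fields, the `choose_pipeline` name and every threshold are hypothetical placeholders to be tuned against real data, not figures from the course.

```python
from dataclasses import dataclass


@dataclass
class DocumentBatch:
    template_count: int    # distinct layouts observed in the batch
    monthly_volume: int    # documents processed per month
    ocr_confidence: float  # mean OCR confidence on a sample, 0.0 to 1.0


def choose_pipeline(
    batch: DocumentBatch,
    max_templates: int = 5,        # assumed cut-off for "narrow-document"
    min_volume: int = 10_000,      # assumed cut-off for "high-volume"
    min_confidence: float = 0.9,   # assumed floor for trusting classical OCR
) -> str:
    """Return 'classical_ocr' or 'llm_vision' for a batch of documents."""
    narrow = batch.template_count <= max_templates
    high_volume = batch.monthly_volume >= min_volume
    clean = batch.ocr_confidence >= min_confidence
    if narrow and high_volume and clean:
        return "classical_ocr"
    # Edge-case-heavy, low-volume or noisy inputs go to the LLM pipeline.
    return "llm_vision"
```

For example, `choose_pipeline(DocumentBatch(template_count=2, monthly_volume=50_000, ocr_confidence=0.97))` selects the classical path, while a batch of a few hundred mixed survey reports falls through to the LLM path.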

What kinds of projects do students build?

Past projects have included contract retrieval over photographed pages, dimension extraction from technical drawings, visual support triage for an e-commerce returns flow, and a screenshot-watching agent that automates a quarterly compliance check. The pattern is “a vision system real users would pay to run”.

How does this fit with the wider LLM and agents curriculum?

Vision is one delivery surface for the same underlying agent architecture. The agents course covers the architecture; this page covers the visual-specific application of it.

Build AI agents course

The wider agent-architecture framing. Vision is one delivery surface; agents are the underlying pattern.

Learn voice AI

The other multimodal-delivery sister page. Both share the production discipline.

Other ways in

More Academy entry points.

The Academy is one course with many doors. Each of these pages is a different entry point into the same twelve weeks.

Build it. Sell it. Raise on it. In twelve weeks.

Tell us what visual system you would build and which document pile or photo archive it would clear. James and Louis read every application personally and reply inside the week.

founders@homemove.com