Running an AI headshot product at scale is not a model problem. We had good models. It's an infrastructure problem — and for a long time, that infrastructure was quietly killing us.
GPU jobs failing silently. Customers paying for professional headshots and waiting a full day. Our engineering team spending entire sprints debugging providers instead of improving the product. Gross margins sitting at around 40% with no clear path up.
That changed when we started working with Runflow.
BetterPic processes hundreds of thousands of AI inference jobs every month — portrait generation, background removal, quality scoring, clothing swap, and more. Each job touches multiple models, requires specific GPU memory, and needs to return a result fast enough that the customer doesn't notice it happened.
The reality before Runflow: we had no retry logic. If a GPU job failed — which happened constantly due to quota limits, provider outages, or memory errors — it simply didn't complete. Customers opened support tickets. We opened dashboards. Time passed.
"Customers were paying for AI headshots and waiting a full day. That's not a cost problem, that's a product crisis."
— Thibaut Hennau, CEO, BetterPic
We faced an uncomfortable choice: hire a dedicated ML infrastructure and DevOps team — expensive, slow, and a distraction from our core product — or find a platform that had already solved this problem from the inside out.
Runflow is a ComfyUI deployment and orchestration platform built specifically for production AI workloads. The short version: we call an endpoint, and everything underneath — GPU routing, retry logic, quality checks, failover across datacenters — happens automatically.
Today, BetterPic runs more than 10 distinct AI inference pipelines through Runflow. Here is what that looks like in practice.
The part that changed our engineering velocity most wasn't any single feature — it was the things we stopped having to think about.
When a GPU job fails, Runflow retries automatically and routes across providers and datacenters. GPU quota limits and availability issues that used to cause outages are now invisible to us. Customer wait times collapsed. Support tickets dropped.
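For teams still doing this in-house, the pattern Runflow absorbs looks roughly like the sketch below: retry with exponential backoff and jitter, rotating across datacenters. This is an illustration of the general technique, not Runflow's internals; `PROVIDERS`, `JobError`, and the `submit` callable are hypothetical names.

```python
import random
import time

# Hypothetical datacenter pool; Runflow manages the real equivalent server-side.
PROVIDERS = ["dc-us-east", "dc-eu-west", "dc-us-west"]

class JobError(Exception):
    """Transient GPU job failure: quota limit, provider outage, memory error."""

def run_with_failover(submit, payload, max_attempts=5, base_delay=0.1):
    """Retry a GPU job across providers with exponential backoff and jitter.

    `submit(provider, payload)` is any callable that raises JobError on a
    transient failure and returns a result on success.
    """
    last_err = None
    for attempt in range(max_attempts):
        provider = PROVIDERS[attempt % len(PROVIDERS)]  # rotate datacenters
        try:
            return submit(provider, payload)
        except JobError as err:
            last_err = err
            # Exponential backoff with jitter avoids retry stampedes.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
    raise JobError(f"all {max_attempts} attempts failed: {last_err}")
```

The point of the case study is that none of this code has to exist on the customer side: the backoff, the rotation, and the "what counts as transient" policy all live behind the endpoint.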
Runflow's Sentinel layer evaluates every output before delivery. It's not checking whether the job completed — it's checking whether the result is actually good. Eight specialized evaluation passes run per generation: face similarity, segmentation, pose analysis, and LLM-based judges for identity, garment fit, skin realism, and more. If an output fails, it's retried. If the retry passes, it ships. If it doesn't, we know about it before the customer does.
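The generate-evaluate-retry loop described above can be sketched as a simple quality gate. The check names mirror the evaluation passes named in the text, but the thresholds, field names, and function signatures here are illustrative assumptions, not Sentinel's actual implementation.

```python
# Hypothetical generate -> evaluate -> retry quality gate.
# Check names mirror the Sentinel passes described above; thresholds are illustrative.
CHECKS = {
    "face_similarity": lambda out: out["face_sim"] >= 0.85,
    "segmentation":    lambda out: out["seg_iou"] >= 0.90,
    "pose":            lambda out: out["pose_ok"],
    "identity_judge":  lambda out: out["identity_ok"],
}

def failed_checks(output):
    """Return the names of every evaluation pass the output fails."""
    return [name for name, check in CHECKS.items() if not check(output)]

def quality_gate(generate, max_retries=1):
    """Run generation, evaluate every pass, retry once, then flag for review.

    Returns {"status": "shipped", ...} when all checks pass, otherwise
    {"status": "flagged", "failures": [...]} so the team sees the failure
    before the customer does.
    """
    for attempt in range(max_retries + 1):
        output = generate()
        failures = failed_checks(output)
        if not failures:
            return {"status": "shipped", "output": output}
    return {"status": "flagged", "failures": failures, "output": output}
```

The key design choice is that the gate evaluates quality, not completion: a job that finishes but produces a poor likeness is treated exactly like a job that crashed.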
This is the only quality evaluation system of its kind outside of Google's Vertex AI enterprise platform — and it's available to us through a single configuration toggle.
Our team builds and iterates workflows in ComfyUI. When a workflow is ready, it deploys to Runflow's infrastructure directly from inside ComfyUI — no file uploads, no manual configuration, no DevOps ceremony. The endpoint is live with typed inputs, auto-generated API docs, and auto-scaling GPU already configured.
```bash
# One endpoint call from our application layer
curl -X POST https://api.runflow.io/v1/flows/headshot/run \
  -H "Authorization: Bearer rf_..." \
  -H "Content-Type: application/json" \
  -d '{"images": ["selfie_1.jpg", "selfie_2.jpg"],
       "style": "professional",
       "quality": "premium"}'
```
Our developers don't need to understand GPU infrastructure, model quantization, or provider-level scheduling. They call an endpoint. Runflow handles the rest.
Runflow runs on L40S GPUs at $1.95 per hour — 44% cheaper than the market rate of $3.51/hr, billed by the second with zero idle cost. Workers scale to zero when not processing requests, which means we pay only for actual compute consumed.
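The arithmetic behind "pay only for actual compute" is worth making concrete. The hourly rates below come from the article; the monthly job volume, per-job GPU time, and always-on fleet size are illustrative assumptions, not BetterPic's actual figures.

```python
# Per-second billing vs. always-on provisioning, using the article's rates.
RUNFLOW_RATE = 1.95   # $/hr, L40S via Runflow
MARKET_RATE = 3.51    # $/hr, market rate cited in the article

# Illustrative workload assumptions, not BetterPic's real numbers.
jobs_per_month = 300_000
seconds_per_job = 45          # assumed average GPU time per inference job
fleet_size = 20               # assumed peak-sized always-on GPU fleet

# Scale-to-zero: billed by the second, only while a job is processing.
busy_hours = jobs_per_month * seconds_per_job / 3600
per_second_cost = busy_hours * RUNFLOW_RATE

# Always-on: the fleet is sized for peak and billed around the clock.
always_on_cost = fleet_size * 24 * 30 * MARKET_RATE

print(f"busy GPU-hours per month: {busy_hours:.0f}")
print(f"scale-to-zero at $1.95/hr:  ${per_second_cost:,.0f}/mo")
print(f"always-on fleet at $3.51/hr: ${always_on_cost:,.0f}/mo")
```

Under these assumptions the gap is driven less by the hourly rate than by idle time: the always-on fleet bills 14,400 GPU-hours a month while only 3,750 of them are doing work.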
But the pricing model is only part of the story. The bigger driver of margin improvement was the orchestration layer: intelligent GPU scheduling across providers, model quantization, and multi-step workflows combining open-source and closed-source models efficiently.
- 87% current gross margin
- 40% → 87% in 12 months
- 30%+ savings vs. building in-house
The steepest gains came in the first three to four months — primarily from eliminating silent job failures and their downstream costs in support, refunds, and churn. The curve has continued upward since, as optimization compounds over time.
Before Runflow: ~40% gross margin · 24-hour customer wait times · silent job failures · zero retry mechanisms.
After Runflow: 87% gross margin · reliable delivery · 10+ AI workflows running autonomously.
The organizational change is as significant as the financial one. BetterPic has no dedicated ML infrastructure team. No DevOps headcount focused on GPU management. Our engineers build product features — and when a new AI capability is ready, they ship it by calling an endpoint.
"Our developers can integrate a new AI feature by calling an endpoint. They don't need machine learning expertise, infrastructure knowledge, or DevOps skills. We just focus on making the best headshot product, and Runflow handles everything underneath."
— Thibaut Hennau, CEO, BetterPic
This is not a small thing. Every hour not spent debugging infrastructure is an hour spent improving the headshot product, expanding to new markets, or onboarding the next team customer. The leverage compounds.
We're sharing this because the problem we had — reliable, cost-efficient GPU orchestration for production AI workflows — is not unique to us. If you're running an AI product and your team spends meaningful time on infrastructure, retry logic, or quality control, the math on a managed orchestration layer is worth running.
Runflow's platform is now open to other teams. You can deploy any ComfyUI workflow as a production API, enable Sentinel quality evaluation, and get GPU pricing that undercuts the market — without building any of the supporting infrastructure yourself.
Every BetterPic headshot, background edit, and clothing swap runs through Runflow's infrastructure. If you're building with AI at scale, it's worth a look.
The numbers below reflect BetterPic's own operational data measured before and after migrating production AI workflows to Runflow. Margin improvement compounded over 12 months; delivery and reliability gains were visible within the first 30 days.
| Metric | Before Runflow | After Runflow |
|---|---|---|
| Gross Margin | ~40% | 87% |
| Customer Wait Time | Up to 24 hours | Reliable, fast delivery |
| Silent Job Failures | Frequent, undetected | Eliminated via automatic retry |
| Retry Mechanism | None | Automatic, cross-datacenter failover |
| AI Workflows Running | Limited, brittle pipelines | 10+ autonomous workflows |
| ML/DevOps Headcount | Required (or a product bottleneck) | Zero dedicated headcount needed |
| GPU Cost per Hour | ~$3.51/hr (market rate) | $1.95/hr on L40S, billed by the second |
The margin jump from 40% to 87% is not attributable to a single change. It reflects the cumulative effect of eliminating silent failures (and their downstream costs in refunds and churn), replacing idle GPU provisioning with per-second billing scaled to zero, and removing the operational overhead of managing infrastructure in-house.
**What is Runflow, and how does BetterPic use it?**
Runflow is a ComfyUI deployment and orchestration platform built for production AI workloads. BetterPic uses it to run every AI inference job in the product — portrait generation, quality scoring, clothing swap, background removal, and targeted editing. Rather than managing GPU infrastructure directly, BetterPic calls a single API endpoint per workflow and Runflow handles routing, retry, failover, and quality evaluation automatically. The integration means BetterPic's engineers focus on product, not infrastructure.
**Where did the margin improvement come from?**
The improvement came from three directions simultaneously. First, Runflow's L40S GPU pricing at $1.95 per hour — compared to a market rate of roughly $3.51 per hour — reduced raw compute cost by 44%. Second, per-second billing with automatic scale-to-zero means BetterPic pays only for actual compute consumed, not idle capacity. Third, eliminating silent job failures removed a significant hidden cost: support overhead, refunds, and customer churn caused by deliveries that simply never arrived. Combined, these factors drove gross margin from roughly 40% to 87% over 12 months.
**Did BetterPic need to hire a dedicated ML infrastructure team?**
No. BetterPic has no dedicated ML infrastructure or DevOps team focused on GPU management. This is a deliberate consequence of using Runflow. Because Runflow handles GPU routing, retry logic, model deployment, and quality evaluation, there is no operational need for engineers who specialize in those areas. BetterPic's engineers build product features. When a new AI workflow is ready, they ship it by calling an endpoint. The platform removes the organizational pressure to hire infrastructure specialists that typically accompanies AI product growth.
**Which AI workflows does BetterPic run through Runflow?**
BetterPic currently runs more than 10 distinct AI inference pipelines through Runflow. The core workflow is AI headshot generation — taking customer selfies and producing 4K professional portraits in multiple styles. Supporting that are automated quality scoring via the Sentinel evaluation layer, clothing swap with four variations per edit, background change and removal for scene and compositing work, and Magic Fix for brush-based targeted editing. Each workflow is deployed from ComfyUI directly to Runflow's infrastructure and exposed as a typed API endpoint with auto-generated documentation and auto-scaling GPU.
**How does Sentinel quality evaluation work?**
Runflow's Sentinel layer runs after every inference job, before the output reaches the customer. It does not simply confirm that a job completed — it evaluates whether the result is actually acceptable. Eight specialized evaluation passes run per generation: face similarity scoring, segmentation analysis, pose analysis, and a set of LLM-based judges covering identity preservation, garment fit, skin realism, and background consistency. If an output fails any of these checks, Runflow retries the job automatically. If the retry passes, it ships. If it does not, the failure is surfaced to the BetterPic team before the customer sees anything. This is the only quality evaluation system of this type outside Google's Vertex AI enterprise platform.

Written by
Apoorv Sharma, Head of Performance
Apoorv leads performance and growth at BetterPic with 9+ years of experience across SEO, SEM, and growth marketing. He oversees content strategy, data-driven marketing, and hands-on testing of AI headshot platforms. Previously held senior performance marketing roles across the US, Belgium, and India.

