handoff

End of page. Start of the working loop.

If this site did its job, the next step is simple: download the public CV, read the field notes, or check the public proof. I am looking for AI product work where the model is only one part of the system.

Download public CV · Read the notes
available for
Staff / Principal AI engineering roles
best work
harnesses, retrieval, evals, workflow control
base
Madrid / remote-first
paths
writing · projects · about
contact
email · LinkedIn · GitHub · X
context before autonomy · evals before confidence · proof beats claims · code must stay legible · systems over demos · public CV available
(c) 2026 Petru Arakiss · public surface for product engineering work
petruarakiss
about · projects · writing
open to Staff / Principal AI engineering roles

Petru Arakiss

Production AI systems: retrieval, agents, guardrails.

Madrid-based product engineer working across UX, backend, retrieval infrastructure, evals, guardrails, and agent runtimes.

I use Codex and Claude inside the engineering loop, but the standard is still clear product judgment, reliable software, and systems people can inspect when the model is wrong or unsure.

Download public CV · LinkedIn · GitHub
role signal
best fit

Staff / Principal roles where AI has to become product infrastructure, not a demo layer.

current edge

retrieval infrastructure, agent runtimes, guardrails, evals, traces, and cost-aware systems.

working range

full-stack product ownership across UX, backend, data, and operations.

operating principles
  • runtime before agent hype
  • context is a product surface
  • evals before confidence
  • code must stay legible
current proof

Proof outside the demo.

The public version is intentionally compact. The pattern is the same across the private work: build the product surface, define the runtime, keep evidence visible, and make failure recoverable.

01

BIFROST

document evidence

Document intelligence layer with ingestion quality gates, semantic and visual retrieval, pgvector/HNSW search, caching, source quality, and honest no-answer behavior.
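The "honest no-answer behavior" above can be sketched as a small evidence gate. This is an illustrative shape only, not BIFROST's actual API: the `Hit` class, the `min_score` and `min_hits` thresholds, and the assumption that the vector index returns similarity scores normalized to [0, 1] are all mine.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    source: str
    score: float  # similarity from the vector index, assumed normalized to [0, 1]


def evidence_or_abstain(hits, min_score=0.75, min_hits=2):
    """Gate answers on retrieval quality: abstain when the evidence is thin.

    Returns the strong hits (best first) when enough of them clear the
    threshold, and None otherwise -- an honest no-answer instead of forcing
    a weak retrieval into a confident-sounding response.
    """
    strong = [h for h in hits if h.score >= min_score]
    if len(strong) < min_hits:
        return None
    return sorted(strong, key=lambda h: h.score, reverse=True)
```

The interesting design choice is that abstention is a first-class return value the caller must handle, not an exception or an empty string the UI can quietly paper over.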

02

ORVIAN

agent runtime

AI workflow runtime with context assembly, durable memory, execution tiers, run events, queues, idempotency, and human-review metadata.
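The idempotency piece of a runtime like this can be sketched in a few lines. The `RunLog` class and its method names are hypothetical, not ORVIAN's interface; the point is only the pattern: a redelivered run event is a safe no-op, so queue retries can't double-apply work.

```python
class RunLog:
    """Minimal idempotent consumer for run events (illustrative shape)."""

    def __init__(self):
        self.applied = set()   # event IDs we have already processed
        self.events = []       # the ordered, de-duplicated event history

    def apply(self, event_id, payload):
        """Apply an event at most once; return whether it was newly applied."""
        if event_id in self.applied:
            return False  # redelivery from the queue: safe no-op
        self.events.append((event_id, payload))
        self.applied.add(event_id)
        return True
```

In a real system the seen-set and event log would live in durable storage inside one transaction; an in-memory set is just the smallest thing that shows the contract.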

03

Polaris

internal search

Internal AI assistant product that combines BIFROST retrieval, MONARCH guardrails, cached tool handoff, citations, streaming UX, and suggestion revalidation.

positioning

The model is one component.

The useful work is usually the system around it: how it gets context, chooses tools, exposes uncertainty, hands work back to people, records what happened, and stays understandable after months of changes.

AI product engineering

I build products where language models are useful components inside a larger system: permissions, state, queues, interfaces, persistence, observability, cost control, and failure handling.

Agentic workflow design

I design runtime state, tool use, handoffs, evaluator loops, human checkpoints, idempotency, and stopping conditions for workflows that need judgment without becoming uncontrolled automation.
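A stopping condition in this sense can be as small as a bounded loop. The function below is a sketch under my own naming (`act`, `is_done`, `max_steps` are not from any specific framework): every run terminates either because the goal check passes or because the step budget runs out, at which point the work is handed back rather than looped forever.

```python
def run_agent(act, is_done, state, max_steps=8):
    """Bounded agent loop: every run ends via an explicit stopping condition.

    act:      one step of work, state -> new state
    is_done:  goal check, state -> bool
    Returns (final_state, status), where status is "done" or "max_steps".
    """
    for _ in range(max_steps):
        if is_done(state):
            return state, "done"
        state = act(state)
    # Budget exhausted: hand back to a human checkpoint instead of spinning.
    return state, "max_steps"
```

The `status` value matters as much as the state: it tells the caller whether the workflow finished on judgment or on a guardrail.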

Harness engineering

I shape the environment around Codex, Claude, and human engineers: repository knowledge, executable plans, review loops, browser checks, traces, evals, and CI guardrails.

Context and retrieval systems

I work on the less glamorous parts of retrieval: document ingestion, chunking strategy, metadata, semantic and visual search, grounding, permission boundaries, citations, source quality, and honest no-answer paths.
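"Chunking strategy" is one of those unglamorous parts; the simplest baseline is a fixed-size sliding window with overlap. The sizes below are illustrative defaults, not a recommendation, and real pipelines usually chunk on token or structural boundaries rather than raw characters.

```python
def chunk_text(text, size=200, overlap=40):
    """Fixed-size sliding-window chunking over raw characters.

    Each chunk is at most `size` characters and repeats the last
    `overlap` characters of its predecessor, so a fact straddling a
    boundary still lands whole in at least one chunk.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]
```

Even this toy version forces the real design questions: how much overlap is paid for twice at embedding time, and whether the trailing short chunk is worth indexing at all.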

Eval-driven reliability

I turn vague quality into examples, traces, graders, regression checks, and operating thresholds so teams can improve systems instead of arguing from screenshots.
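The smallest version of that loop is a function that runs graded examples and gates on an operating threshold. Everything here is a sketch under my own assumptions: the example dict shape, the boolean `grader`, and the 0.9 default threshold are illustrative, not a standard.

```python
def run_eval(examples, system, grader, threshold=0.9):
    """Score a system against graded examples and gate on a threshold.

    examples:  dicts the grader understands (here: {"input", "expected"})
    system:    the thing under test, input -> output
    grader:    (example, output) -> bool
    Returns a report instead of an opinion: a pass rate plus a
    go/no-go flag a CI regression check can act on.
    """
    scores = [1.0 if grader(ex, system(ex["input"])) else 0.0 for ex in examples]
    pass_rate = sum(scores) / len(scores)
    return {"pass_rate": pass_rate, "passed": pass_rate >= threshold}
```

The value is less in the arithmetic than in the artifact: a number with a threshold attached is something a team can regress against, where a screenshot is not.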

Full-stack delivery

I can own the product surface and the backend path: Next.js, TypeScript, Python, FastAPI, Postgres, Redis, background jobs, deployment, monitoring, and cost control.

working model

Fast with AI, strict with the work.

I use AI tools aggressively, but not as a substitute for judgment. The leverage comes from designing a loop where humans, agents, and the running system can all give useful feedback.

With people

I clarify the goal, the risk, and what must stay under human control. The point is not to automate everything; it is to make the right work easier to trust.

With agents

I use Codex and Claude as engineering collaborators, but I design the harness around them: context, tools, tests, reviews, and observable feedback from the running system.

With systems

I care about boundaries, data contracts, failure modes, latency, cost, and the operational screen someone will use when the model is wrong or unsure.

best fit

For teams where AI has to become a product, not a slide.

  • Staff or Principal Engineer for AI-native product teams
  • AI Engineering Lead for agentic workflows and internal tools
  • AI Platform / Developer Productivity roles building agent harnesses
  • Full-stack product ownership where AI, UX, and backend reliability meet
Download public CV · LinkedIn
public surface
agentic workflows

runtime state, tool use, traces, handoffs, human review, queues, and stopping conditions

retrieval infrastructure

ingestion, chunking, pgvector, visual search, cache strategy, citations, and no-answer paths

full-stack systems

Python, FastAPI, Next.js, TypeScript, Postgres, Redis, observability, CI

writing · projects · public OSS: gommage, traceframe, vestig · past work: BBVA, Santander, Bankinter, El Corte Ingles