Engineering

Senior AI Full-Stack Engineer

Hybrid · Ho Chi Minh City, Vietnam
Full Time
5+ years

Architect and deploy production-grade LLM agents, RAG pipelines, and agentic workflows. Lead the technical design of complex AI systems spanning backend orchestration, evaluation frameworks, and scalable deployment.

About the Role

We are building enterprise-grade LLM-driven agents for retrieval, reasoning, and workflow automation. This senior role focuses on architecting production AI systems, designing robust RAG pipelines, and leading the technical direction of our AI platform. You will own the end-to-end architecture of AI agent solutions, from designing multi-agent systems to building evaluation frameworks and ensuring production reliability at scale. You will work at the intersection of modern AI frameworks and full-stack engineering, making critical technical decisions that shape our AI capabilities.

Key Responsibilities

RAG Architecture & Development

Architect and build production-grade RAG pipelines including ingestion, chunking strategies, embedding model selection, and retrieval optimization.

Design and implement vector search infrastructure (pgvector, Qdrant, Weaviate) with reranking, filtering, and citation systems.

Develop hybrid retrieval strategies combining semantic search, keyword matching, and metadata filtering.

Optimize RAG systems for latency, cost, and quality trade-offs in production environments.

Agent Architecture & Orchestration

Design complex multi-agent architectures using code-first frameworks (LangGraph, custom state machines).

Implement durable workflow orchestration with Temporal for long-running, fault-tolerant agent processes.

Build sophisticated tool-use patterns, function calling, and agent-to-agent communication protocols.

Design guardrails, safety boundaries, and human-in-the-loop approval flows for autonomous agents.

Evaluation & Production Systems

Design and implement evaluation frameworks with custom benchmarks, regression testing, and A/B experimentation.

Build observability infrastructure for agent monitoring (LangSmith, custom dashboards, alerting).

Own production deployment with SLOs, cost controls, and incident response playbooks.

Lead architectural decisions on model selection, prompt engineering patterns, and system reliability.

Inference & Model Serving

Design model-serving layers with vLLM, SGLang, or TGI; tune continuous batching, KV cache, and tensor-parallel settings for production throughput.

Route traffic across models and providers with LiteLLM, including fallbacks, cost tracking, and rate-limit management.

Define REST and streaming API contracts for agent and retrieval endpoints with clear versioning and schemas.

Collaborate with platform engineers on GPU capacity, Ray Serve deployments, and horizontal scaling.

Full-Stack Leadership

Build intuitive React UIs for agent interaction, debugging, and system observability.

Develop backend APIs for retrieval, reasoning, and agent execution with high reliability.

Mentor junior engineers and establish best practices for AI application development.

Drive technical evaluations and build-vs-buy decisions for AI infrastructure.

Qualifications

Must-Have Technical Expertise

5+ years of full-stack experience, including 2+ years building LLM/AI applications in production.

Proven experience architecting RAG systems: chunking, embeddings, vector stores, retrieval strategies.

Strong proficiency in Python for AI/ML workloads and TypeScript/Node.js for full-stack development.

Deep understanding of distributed systems, state management, and workflow orchestration.

Strong REST API design skills: versioning, pagination, idempotency, streaming (SSE), and schema-first documentation (OpenAPI).

Hands-on experience serving LLMs in production with vLLM, LiteLLM, or comparable stacks (TGI, SGLang, Ray Serve).

Experience with production ML systems: monitoring, evaluation, versioning, and deployment.

Proficiency with AI-assisted development tools (Cursor, Claude Code, GitHub Copilot, or similar).

Architecture & Leadership

Ability to design systems that balance rapid iteration with production reliability.

Strong technical decision-making skills with clear trade-off analysis.

Experience mentoring engineers and driving technical standards across teams.

Ownership mentality for end-to-end system reliability and performance.

Preferred/Bonus

Experience with durable execution frameworks (Temporal, Inngest) for agent orchestration.

Knowledge of embedding model fine-tuning and domain-specific retrieval optimization.

Experience with evaluation frameworks (LangSmith, Opik, RAGAS, custom benchmarks, human-in-the-loop eval).

CI/CD experience for ML systems (model versioning, prompt regression testing).

Experience with compliance requirements (audit trails, data lineage, PII handling).

Strong Vietnamese and English communication skills.

Benefits

Competitive salary and performance incentives

Work on cutting-edge AI/LLM projects with real business impact

Advanced training in modern AI and platform technologies

Flexible work arrangements

A collaborative, innovative engineering team environment

Ready to Join Our Team?

We're excited to meet passionate engineers who want to build the future of AI. Apply now and let's create something amazing together.