Senior AI Full-Stack Engineer
Architect and deploy production-grade LLM agents, RAG pipelines, and agentic workflows. Lead the technical design of complex AI systems spanning backend orchestration, evaluation frameworks, and scalable deployment.
About the Role
Key Responsibilities
RAG Architecture & Development
Architect and build production-grade RAG pipelines including ingestion, chunking strategies, embedding model selection, and retrieval optimization.
Design and implement vector search infrastructure (pgvector, Qdrant, Weaviate) with reranking, filtering, and citation systems.
Develop hybrid retrieval strategies combining semantic search, keyword matching, and metadata filtering.
Optimize RAG systems for latency, cost, and quality trade-offs in production environments.
Agent Architecture & Orchestration
Design complex multi-agent architectures using code-first frameworks (LangGraph, custom state machines).
Implement durable workflow orchestration with Temporal for long-running, fault-tolerant agent processes.
Build sophisticated tool-use patterns, function calling, and agent-to-agent communication protocols.
Design guardrails, safety boundaries, and human-in-the-loop approval flows for autonomous agents.
Evaluation & Production Systems
Design and implement evaluation frameworks with custom benchmarks, regression testing, and A/B experimentation.
Build observability infrastructure for agent monitoring (LangSmith, custom dashboards, alerting).
Own production deployment with SLOs, cost controls, and incident response playbooks.
Lead architectural decisions on model selection, prompt engineering patterns, and system reliability.
Inference & Model Serving
Design model-serving layers with vLLM, SGLang, or TGI; tune continuous batching, KV-cache, and tensor-parallel settings for production throughput.
Route traffic across models and providers with LiteLLM, including fallbacks, cost tracking, and rate-limit management.
Define REST and streaming API contracts for agent and retrieval endpoints with clear versioning and schemas.
Collaborate with platform engineers on GPU capacity, Ray Serve deployments, and horizontal scaling.
Full-Stack Leadership
Build intuitive React UIs for agent interaction, debugging, and system observability.
Develop backend APIs for retrieval, reasoning, and agent execution with high reliability.
Mentor junior engineers and establish best practices for AI application development.
Drive technical evaluations and build-vs-buy decisions for AI infrastructure.
Qualifications
Must-Have Technical Expertise
5+ years of full-stack experience, including 2+ years building LLM/AI applications in production.
Proven experience architecting RAG systems: chunking, embeddings, vector stores, retrieval strategies.
Strong proficiency in Python for AI/ML workloads and TypeScript/Node.js for full-stack development.
Deep understanding of distributed systems, state management, and workflow orchestration.
Strong REST API design skills: versioning, pagination, idempotency, streaming (SSE), and schema-first documentation (OpenAPI).
Hands-on experience serving LLMs in production with vLLM, LiteLLM, or comparable stacks (TGI, SGLang, Ray Serve).
Experience with production ML systems: monitoring, evaluation, versioning, and deployment.
Proficiency with AI-assisted development tools (Cursor, Claude Code, GitHub Copilot, or similar).
Architecture & Leadership
Ability to design systems that balance rapid iteration with production reliability.
Strong technical decision-making skills with clear trade-off analysis.
Experience mentoring engineers and driving technical standards across teams.
Ownership mentality for end-to-end system reliability and performance.
Preferred/Bonus
Experience with durable execution frameworks (Temporal, Inngest) for agent orchestration.
Knowledge of embedding model fine-tuning and domain-specific retrieval optimization.
Experience with evaluation frameworks (LangSmith, Opik, RAGAS, custom benchmarks, human-in-the-loop evaluation).
CI/CD experience for ML systems (model versioning, prompt regression testing).
Experience with compliance requirements (audit trails, data lineage, PII handling).
Strong Vietnamese and English communication skills.
Benefits
Competitive salary and performance incentives
Work on cutting-edge AI/LLM projects with real business impact
Advanced training in modern AI and platform technologies
Flexible work arrangements
A collaborative, innovative engineering team environment
Ready to Join Our Team?
We're excited to meet passionate engineers who want to build the future of AI. Apply now and let's create something amazing together.