Senior Data Platform Engineer
Build and operate production data orchestration platforms. Design metadata-driven pipelines, implement data transformations at scale, and integrate enterprise data systems.
About the Role
Key Responsibilities
Data Orchestration & Pipeline Development
Design and implement production data pipelines using Dagster for asset-first orchestration, with Airflow or Prefect where appropriate.
Build metadata-driven pipeline architectures with YAML-based configuration and dynamic asset generation.
Implement data transformation layers using dbt for SQL-based transformations and data modeling against Snowflake, Databricks, or BigQuery.
Create reusable pipeline components for ingestion, transformation, and data quality validation.
Integrate with enterprise data sources (databases, APIs, data lakes, streaming systems) using Airbyte connectors and custom adapters.
Cloud Data Platforms
Build and operate pipelines against modern cloud data platforms (Snowflake, Databricks lakehouse, BigQuery) with cost-aware design.
Implement Delta Lake / Iceberg table patterns (bronze / silver / gold) with schema evolution and partitioning discipline.
Design Airbyte ingestion topologies with normalization, incremental syncs, and failure-recovery strategies.
Model data warehouse layers for analytical and AI consumption, balancing compute and serving cost.
Data Infrastructure & Integration
Build and maintain data catalog integrations for metadata discovery and governance.
Design and implement automated REST API generation from data warehouse views (PostgREST, Hasura) with clean contracts.
Develop connectors and adapters for diverse data sources and destinations.
Implement Row-Level Security (RLS) and data access control policies.
Build document ingestion pipelines for RAG systems including parsing, chunking, and embedding generation.
Production Operations
Own production reliability for data pipelines with monitoring, alerting, and incident response.
Optimize pipeline performance for scale (50 TB+ data volumes, sub-second latency targets).
Implement data quality frameworks with automated testing and validation.
Design incremental processing strategies for efficient large-scale data updates.
Collaborate with DevOps on deployment automation and infrastructure provisioning.
Qualifications
Must-Have Technical Expertise
5+ years of data engineering experience, with 2+ years building production orchestration platforms.
Expert-level Python for data engineering workloads and pipeline development.
Deep experience with a production orchestrator: Dagster (preferred, for asset-first orchestration), Airflow, or Prefect.
Hands-on experience with at least one cloud data platform: Snowflake, Databricks, or BigQuery.
Production experience with Airbyte (or an equivalent ingestion tool such as Fivetran / Stitch) for connector-based extraction.
Strong SQL skills and experience with dbt for data transformation and modeling.
Strong REST API design intuition: contracts, versioning, idempotency, pagination.
Production experience with PostgreSQL, data warehouses, and database optimization.
Proficiency with AI-assisted development tools (Cursor, Claude Code, GitHub Copilot, or similar).
Platform & Integration Skills
Experience with metadata management and data catalog systems.
Understanding of automated REST API generation patterns (PostgREST, Hasura).
Familiarity with vector databases and embedding pipelines for AI applications.
Experience with containerization (Docker) and the basics of container orchestration (Kubernetes).
Strong understanding of data governance, access control, and compliance requirements.
Preferred/Bonus
Experience with Dagster components, assets, sensors, resources, and asset checks.
Deep work with Databricks (Unity Catalog, Delta Lake, MLflow) or Snowflake (Snowpark, Dynamic Tables).
Experience building reusable Airbyte connectors or contributing to the Airbyte connector framework.
Experience with document processing pipelines (OCR, parsing, chunking strategies).
Familiarity with DSPy or LLM integration in data pipelines.
Strong Vietnamese and English communication skills.
Benefits
Competitive salary and performance incentives
Work on cutting-edge data infrastructure projects
Advanced training in modern data engineering technologies
Flexible work arrangements
A collaborative, innovative engineering team environment
Ready to Join Our Team?
We're excited to meet passionate engineers who want to build the future of AI. Apply now and let's create something amazing together.