Data Science

Data Scientist – Agentic RAG & LLM (Databricks / Azure / AWS) – Australia & New Zealand

Singapore
Work Type: Full Time
Location: Australia & New Zealand (candidates must have valid working rights in either country)


Position Overview



We are seeking a highly skilled Data Scientist with strong expertise in Databricks, Azure, and AWS, specializing in Agentic Retrieval-Augmented Generation (RAG) and Large Language Models (LLMs). The role focuses on designing and productionizing intelligent AI/ML systems with scalable, cloud-native deployments, CI/CD pipelines, and MLOps best practices.

The ideal candidate is hands-on, solution-oriented, and experienced in building and deploying advanced AI systems across multiple cloud platforms.




Key Responsibilities



  • Design and implement Agentic RAG pipelines using Databricks Vector Search, MLflow, and Unity Catalog, integrated with Azure AI Search (formerly Azure Cognitive Search) and Amazon OpenSearch Service.
  • Develop agent-based workflows using LangChain, LangGraph, LlamaIndex, and other tool-augmented reasoning frameworks.
  • Fine-tune, evaluate, and deploy LLMs (OpenAI, Anthropic, MosaicML, Hugging Face, Llama) for enterprise applications.
  • Build CI/CD pipelines for ML & GenAI workloads, including:

    • Automated build/test/deploy workflows (Azure DevOps, GitHub Actions, Jenkins, AWS CodePipeline).
    • MLflow model registry integration with production/staging environments.
    • Infrastructure-as-Code (IaC) using Terraform, Bicep, or CloudFormation for reproducible deployments.

  • Implement MLOps best practices: experiment tracking, versioning, continuous evaluation, and automated retraining pipelines.
  • Ensure data governance, compliance, and security for sensitive datasets across Azure and AWS.
  • Collaborate with engineering and product teams to integrate ETL/ELT pipelines built on Azure Data Factory, Azure Synapse, Amazon S3, Amazon Redshift, and AWS Glue.
  • Deploy and monitor models with online evaluation pipelines (MLflow Evaluate, DeepEval, and custom scorers for metrics such as faithfulness and retrieval recall).
  • Provide technical mentorship on GenAI architecture, CI/CD, and production-grade LLM deployments.





Required Skills & Qualifications



  • Bachelor’s or Master’s degree in Data Science, Computer Science, AI/ML, or a related field (PhD optional).
  • 4+ years of professional experience delivering ML/AI or data science solutions, including cloud-native deployments.
  • Strong expertise with the Databricks ecosystem: Spark (PySpark/Scala), Delta Lake, Unity Catalog, MLflow, Vector Search.
  • Hands-on experience with CI/CD pipelines for ML and GenAI:

    • Azure DevOps, GitHub Actions, or Jenkins.
    • Automated testing for ML pipelines.
    • Model promotion workflows (dev → staging → prod).

  • Proficiency in Python, SQL, distributed data processing, and cloud-native ML frameworks.
  • Deep experience with Azure (Azure ML, Data Factory, Synapse, Data Lake) and AWS (SageMaker, Glue, S3, Redshift).
  • Strong knowledge of LLM orchestration frameworks (LangChain, LangGraph, LlamaIndex).
  • Solid understanding of LLM & RAG evaluation metrics (faithfulness, token-F1, citation@k).
  • Must have valid working rights in Australia or New Zealand.





Preferred Qualifications



  • Experience deploying multi-agent LLM systems in production.
  • Familiarity with Infrastructure-as-Code (Terraform, Bicep, CloudFormation) for CI/CD automation.
  • Hands-on experience with containerization and orchestration (Docker, Kubernetes, AKS, EKS).
  • Contributions to open-source GenAI/LLM projects or published research.
