Data & Search Engineer - Immediate joiners

Abu Dhabi, United Arab Emirates
Full Time
Experienced

We are looking for a strong Data & Search Engineer to design, build, and operate the data ingestion, indexing, and retrieval foundation for enterprise AI and Agentic AI solutions.

This role is critical for enabling accurate, secure, and scalable AI-powered search, document intelligence, knowledge retrieval, and RAG-based applications. The ideal candidate has hands-on experience handling structured and unstructured enterprise data, designing chunking and enrichment strategies, optimizing search relevance, and validating retrieval quality.

The candidate should be comfortable working with large volumes of documents, enterprise metadata, security-aware indexing, hybrid search, and Azure AI services.

Key Responsibilities

Data Ingestion & Processing
  • Design and build scalable ingestion pipelines for structured, semi-structured, and unstructured data sources such as PDFs, Word documents, Excel files, SharePoint, databases, APIs, and enterprise repositories.
  • Develop robust document parsing, cleaning, normalization, and transformation workflows.
  • Implement document chunking strategies based on structure, sections, headings, tables, document type, and business context.
  • Maintain document identifiers, source references, version history, and lineage information across ingestion and indexing workflows.

Metadata, Enrichment & Governance
  • Design metadata schemas for enterprise search and RAG use cases.
  • Enrich content with document-level, section-level, topic-level, and security-level metadata.
  • Implement tagging, classification, topic extraction, entity extraction, and semantic enrichment pipelines.
  • Ensure support for RBAC-aware retrieval, data masking, access control filtering, and secure indexing practices.

Search, Indexing & Retrieval
  • Build and tune hybrid search solutions combining semantic search, vector search, and keyword-based search.
  • Design and maintain indexes for enterprise-grade retrieval performance.
  • Work with vector databases, Azure AI Search, embeddings, and ranking strategies.
  • Optimize retrieval relevance using filters, scoring profiles, reranking, metadata boosts, and query expansion.
  • Evaluate chunk quality, index quality, and retrieval performance through systematic testing.

Retrieval Quality & Evaluation
  • Define and execute retrieval evaluation frameworks using relevance metrics, test query sets, golden datasets, and human review feedback.
  • Identify issues such as poor chunking, missing metadata, irrelevant retrieval, duplicate chunks, hallucination risk, and low-confidence answers.
  • Continuously improve ingestion and indexing strategies based on evaluation results.
  • Support RAG and Agentic AI teams with reliable, explainable, and traceable retrieval foundations.

Required Skills & Experience

  • Strong experience in enterprise data ingestion, search engineering, indexing, and retrieval.
  • Hands-on knowledge of document chunking, metadata modeling, content enrichment, and data preprocessing.
  • Experience with hybrid search: semantic search, vector search, full-text search, and keyword search.
  • Strong understanding of embeddings, vector indexing, similarity search, and relevance tuning.
  • Experience with Azure AI Search, Azure OpenAI, Azure AI Document Intelligence, Microsoft Fabric, SharePoint, Microsoft Graph, or related Microsoft AI services.
  • Experience with Python and data processing frameworks.
  • Good understanding of data masking, access control, RBAC-aware search, and secure data handling.
  • Experience working with enterprise documents, knowledge bases, policies, SOPs, contracts, engineering documents, or operational data.
  • Ability to validate retrieval quality and improve search accuracy through structured evaluation.

Preferred Technical Stack
  • Microsoft Azure AI Search
  • Azure OpenAI Service
  • Azure AI Document Intelligence
  • Azure Functions / Azure Container Apps
  • Microsoft Graph API
  • SharePoint / OneDrive / Teams data integration
  • Microsoft Fabric / Synapse / Data Factory
  • Python
  • SQL / PostgreSQL / SQL Server
  • Vector search and embedding models
  • LangChain / Semantic Kernel / LlamaIndex
  • Power BI integration awareness is a plus
