RAG Engineering Intern

RAG Engineering Intern (Python / Azure OpenAI)

 

Alphius | Paid Internship | Remote (US Required)

Alphius is building production-grade Retrieval-Augmented Generation (RAG) systems on Microsoft Azure. We are looking for a RAG Engineering Intern to work on document ingestion, vector search, and LLM-powered APIs within a cloud-native architecture.

This is a hands-on engineering internship. You will work on real retrieval pipelines, embedding systems, and inference services — not just notebooks or experiments.

 

What You’ll Work On

  • Build and improve document ingestion pipelines (PDF, HTML, DOCX, CSV)
  • Implement chunking strategies (recursive, semantic, sliding window)
  • Work with vector databases (e.g., Azure AI Search, Qdrant, Pinecone)
  • Assist in building hybrid retrieval (BM25 + embeddings)
  • Support prompt engineering for retrieval-grounded QA
  • Develop FastAPI-based inference endpoints
  • Help monitor latency, cost, and retrieval quality
  • Deploy services to Azure Container Apps or AKS
  • Use Application Insights for tracing RAG pipelines

You will contribute to systems used in real production workflows.

 

Required Qualifications

  • Currently pursuing a degree in Computer Science, Machine Learning, or a related field
  • Strong Python skills (3.10+ preferred)
  • Experience building projects involving LLMs or embeddings
  • Familiarity with async programming (asyncio)
  • Understanding of REST APIs
  • Experience with Git

 

Preferred Experience

  • Exposure to RAG pipelines or vector databases
  • Experience with Azure OpenAI or OpenAI APIs
  • Familiarity with FastAPI
  • Basic knowledge of embeddings and retrieval concepts
  • Exposure to Docker and cloud deployments
  • Understanding of JSON schema validation or structured outputs

 

What You’ll Gain

  • Real-world experience building production RAG systems
  • Exposure to Azure OpenAI and cloud-native AI architecture
  • Experience with vector search and hybrid retrieval
  • Hands-on work with scalable LLM APIs
  • Mentorship from senior engineers
  • Opportunity for full-time conversion based on performance

 

Internship Details

  • Paid internship ($1,200–$2,200 per month)
  • Remote
  • 15–25 hours per week during the semester; up to 40 hours per week during the summer
  • Duration: Summer or ongoing

 

How to Apply

Please include:

  • Resume
  • GitHub or project portfolio
  • Brief description of an LLM or RAG-related project you’ve built