
AI-Data Pipeline Engineer (PLUTO IN AQUARIUS, Pack Intern, Summer 2026)

SUMMER 2026 

 

**This internship is hosted by PLUTO IN AQUARIUS LLC and sponsored by the Nevada Career Studio (NCS).**

 

Students are highly encouraged to visit the Nevada Career Studio during our drop-in hours or use our Virtual Resume & Cover Letter Review service BEFORE applying for these positions. Resumes and cover letters that do not meet NCS expectations will not be included in applicant packages to employers.   

 

PLUTO IN AQUARIUS LLC 

 

About PLUTO IN AQUARIUS LLC: 

Pluto in Aquarius LLC (PIAVS) is a venture studio based in Reno, Nevada, that builds AI-driven companies from the ground up. We combine domain expertise, proprietary data strategies, and modern AI development tools to create ventures that solve real, measurable problems at scale. 

 

Our current venture is building the first standardized condition and risk intelligence platform for residential real estate. We use large language models to parse complex property documents, integrate property-level climate risk data from First Street Foundation, and generate a transparent, auditable 0-to-100 property condition score. The platform runs on Next.js, Vercel, and Supabase, with AI document intelligence at its core. 

Summer 2026 interns will inherit a working demo built by our Spring 2026 cohort and take it from functional prototype to commercial-grade product. This is a production engineering challenge, not a greenfield build. That means hardening the architecture, closing security gaps, building for scale, and shipping features that real users will use at launch. 

 

Our development stack is AI-native. We build with Claude Code for agentic software development, Cursor AI for intelligent code editing, and Replit for rapid prototyping and collaborative iteration. Security is built into the pipeline from day one: we treat vulnerability scanning, secret management, and DevSecOps practices as baseline requirements, not afterthoughts. As the platform moves into production, we are building the networking and load balancing architecture required for high availability, including DNS configuration, firewall rules, and application load balancing. Students with MLOps experience will find opportunities to contribute to our machine learning lifecycle infrastructure as we move from prototype scoring models toward production-grade predictive systems. 

 

 

We move fast, give interns meaningful ownership of real product features, and expect higher-level thinking on architecture, security, and scale. If you want to work on hard problems with modern tools and see your code reach real users, this is the right place. 

 

Internship Description: 

Summer 2026 interns will take a working prototype built by our Spring 2026 cohort and deliver a commercial-grade platform ready for public launch. This is not a learning exercise. Interns will own real features, make real architectural decisions, and ship code that reaches real users before the internship ends. 

 

The platform uses large language models to parse complex property documents, integrates third-party climate and property data APIs, and generates a transparent, auditable condition score with a full homeowner dashboard. The technical surface covers AI document intelligence, full-stack web development, cloud infrastructure, and data pipeline architecture. 

 

Interns work directly with the founding team in a flat, fast-moving environment. There are no layers of management between an intern's code and production. Higher-level thinking on system design, security, and scalability is expected and will be used. 

 

This role owns the intelligence layer. You will harden the LLM document parsing pipeline, improve structured data extraction from complex property documents, integrate First Street Foundation and ATTOM Data APIs, and build the scoring engine to production specification. You will define the data contracts that the Full-Stack Product Engineer builds against and collaborate with the Infrastructure Engineer on pipeline security and secrets management. Rigorous judgment on model reliability, data quality, and pipeline architecture is expected and will be put to use. 

 

Duties/Responsibilities: 

  • Complete 360 hours of work as an intern (32-40 hours per week, 9 weeks minimum) 
  • Inherit and extend the LLM document parsing pipeline built during the Spring 2026 demo, with full context transfer from the Spring 2026 cohort 
  • Harden and optimize structured data extraction from complex property documents including home inspection reports, warranty contracts, HOA governing documents, and permit records 
  • Build the scoring engine to production specification including all six weighted scoring categories, confidence level calculation, and score change tracking 
  • Integrate and maintain third-party data APIs including First Street Foundation climate risk models and ATTOM property records, with robust error handling, rate limit management, and data normalization 
  • Define and document data contracts and API endpoints that the Full-Stack Product Engineer consumes to populate the homeowner dashboard 
  • Build and maintain data validation, quality checks, and anomaly detection throughout the pipeline to ensure scoring accuracy and reliability 
  • Collaborate with the Infrastructure Engineer on secure credential handling, secrets management, and pipeline security for all third-party API integrations 
  • Contribute to MLOps infrastructure including model versioning, pipeline orchestration, and monitoring as the platform moves from prototype scoring models toward production-grade predictive systems 
  • Write clean, documented, reviewable code following established contribution guidelines 
  • Participate in code reviews, architectural discussions, and sprint planning with the founding team 
  • Produce end-of-internship handoff documentation covering pipeline architecture, data contracts, incomplete work, and recommendations for the next development cycle 
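To give a feel for the kind of logic the scoring-engine duty involves, here is a minimal sketch of combining weighted category scores into a single 0-to-100 composite with an overall confidence level. The category names and weights are hypothetical, chosen only for illustration; they are not the platform's actual six-category specification.

```python
from dataclasses import dataclass

# Hypothetical category weights -- illustrative only, not the
# platform's actual scoring specification. Weights sum to 1.0.
WEIGHTS = {
    "structural": 0.25,
    "systems": 0.20,
    "roof": 0.15,
    "exterior": 0.15,
    "interior": 0.15,
    "climate_risk": 0.10,
}

@dataclass
class CategoryResult:
    score: float       # 0-100 within the category
    confidence: float  # 0.0-1.0, how well source documents backed it

def composite_score(results: dict[str, CategoryResult]) -> tuple[float, float]:
    """Combine per-category results into (score, confidence).

    The composite score is a weighted average of category scores;
    overall confidence is the weight-averaged confidence of the inputs,
    so thinly documented categories drag the confidence level down.
    """
    missing = set(WEIGHTS) - set(results)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    score = sum(WEIGHTS[c] * r.score for c, r in results.items())
    confidence = sum(WEIGHTS[c] * r.confidence for c, r in results.items())
    return round(score, 1), round(confidence, 2)
```

The production engine would add the auditability pieces the posting describes, such as score change tracking and a per-category evidence trail.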

 

Goals and Expectations of the Intern: 

By the end of the 12-week internship, this intern is expected to have: 

  • Delivered a production-ready scoring engine with all six weighted categories functional, tested, and generating accurate, auditable scores at commercial launch 
  • Built a hardened LLM document parsing pipeline that reliably extracts structured data from real property documents with documented accuracy metrics 
  • Defined and delivered clean, stable data contracts and API endpoints that the Full-Stack Product Engineer and future engineers can build against without renegotiation 
  • Integrated at least two third-party data APIs with robust error handling, rate limit management, and data normalization documented for future maintenance 
  • Laid a functional MLOps foundation including model versioning and pipeline monitoring sufficient to support predictive capability development in the next development cycle 
  • Produced architectural decision records covering all major pipeline and scoring engine choices made during the internship 
  • Completed a structured end-of-internship knowledge transfer covering what was built, what was learned, and what remains for the next development cycle 
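The rate-limit management expected of the third-party API integrations can be sketched as a generic retry-with-exponential-backoff wrapper. This makes no claims about the actual First Street Foundation or ATTOM client libraries; `fetch` and `RateLimitError` are stand-ins for whatever client and error type the real integration uses.

```python
import time

class RateLimitError(Exception):
    """Stand-in for an upstream API signalling HTTP 429 (rate limited)."""

def call_with_backoff(fetch, max_retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on RateLimitError, wait and retry.

    Delays double on each attempt (1s, 2s, 4s, ...). The `sleep`
    parameter is injectable so the behavior can be tested without
    real waiting. After max_retries failed retries, the error
    propagates to the caller.
    """
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except RateLimitError:
            if attempt == max_retries:
                raise
            sleep(base_delay * (2 ** attempt))
```

A production version would also honor any `Retry-After` hint from the API and log each retry for the pipeline monitoring described above.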

  

Beyond deliverables, this intern is expected to operate with early professional-level independence. That means proactively identifying data quality problems before they corrupt scores, proposing solutions rather than waiting for direction, and communicating integration blockers to the Full-Stack Product Engineer early. The scoring engine is the core IP of the platform. The expectation is rigorous engineering judgment and scientific thinking about data reliability, not just feature delivery. 

 

Interns who perform at a high level will be considered for continued engagement with PIAVS ventures beyond the summer cycle. 

 

Required Qualifications: 

  • Must be a degree-seeking undergraduate OR graduate student at the University of Nevada, Reno after the Spring ’26 semester  
  • Spring ’26 or earlier graduates are not eligible for the Wolf Pack STEM Internship Program 
  • Student must be enrolled in a major or minor program in the following colleges: 
    • Agriculture, Biotechnology, and Natural Resources (CABNR) 
    • Business 
    • Engineering 
    • Science 
    • Public Health/Orvis 
  • Coursework in Computer Science, Data Science, Computer Engineering, or a related STEM field 
  • Demonstrated proficiency in Python or JavaScript for data pipeline development, including experience building and maintaining data transformation and extraction workflows 
  • Hands-on experience with large language models or natural language processing, including prompt engineering, structured data extraction from unstructured text, or document intelligence applications 
  • Experience designing and consuming REST APIs including third-party API integration, error handling, and data normalization in a production or research context 
  • Working knowledge of relational or managed database services including data modeling, querying, and schema design for structured data storage 
  • Experience with version control and collaborative development workflows using Git and GitHub 
  • Demonstrated ability to read, understand, and extend an existing codebase independently 
  • Strong written and verbal communication skills sufficient for async remote collaboration, technical documentation, and cross-role coordination with frontend and infrastructure engineers 

 

Preferred Qualifications: 

  • Experience with MLOps tooling and end-to-end machine learning lifecycle management including model versioning, pipeline orchestration, and production monitoring using platforms such as Kubeflow, SageMaker, or Azure ML 
  • Familiarity with vector databases, embedding models, or retrieval-augmented generation patterns relevant to document intelligence applications 
  • Experience with data quality frameworks, validation pipelines, or anomaly detection in production data systems 
  • Hands-on experience with AI-native development tools including Claude Code, Cursor AI, or Replit 
  • Experience with Supabase or equivalent managed database platforms in a production or research context 
  • Familiarity with geospatial or climate risk data sources and the data structures used to represent property-level environmental risk 
  • Experience building scoring, rating, or ranking systems that combine multiple weighted data inputs into a single composite output 
  • Familiarity with DevSecOps practices as they apply to data pipelines, including secure API credential handling and secrets management 
  • Prior internship, research, or project experience in proptech, fintech, insurtech, climate tech, or data-intensive consumer platforms 
  • Demonstrated ability to deliver independently in a fast-moving, resource-constrained environment such as a startup, research lab, or competitive engineering program 

 

Desired Schedule for Intern: 

Full-time, 32-40 hours per week over 12 weeks, May 18 through August 8, 2026. Core collaboration hours are Monday through Friday, 9am to 3pm Pacific time, with flexibility outside that window for focused individual work. Interns are expected to be available for daily async check-ins and weekly synchronous team meetings during core hours. 

 

The schedule is remote-first but interns based in the Reno area are encouraged to participate in periodic in-person working sessions with the founding team. 

 

 

Pack STEM internships require interns to complete 360 hours during their internship. It is the intern’s responsibility to ensure this requirement is met.