Software Engineer

DeepInfra is looking for early-career Software Engineers (0-2 years of experience, including internships) to join our team. You’ll work closely with our experienced engineers to design, build, and scale infrastructure for serving top open-source AI models. This role is ideal for recent graduates or junior engineers who want to grow quickly while working on high-impact, real production AI systems.

If you’re excited about AI/ML, have taken related courses or built projects, and want to learn how to ship things at scale, we’d love to meet you.

What You’ll Do

Collaborate with engineers to design, develop, and test inference solutions for state-of-the-art AI models.
Implement, optimize, and evaluate AI models using Python, C++, CUDA, and NCCL (previous exposure is helpful; deep expertise is not required).
Monitor and maintain production model-serving systems.
Work on new features, fix bugs, and contribute to code reviews.
Participate in daily standups, design reviews, and team discussions.
Explore new AI/ML techniques and tools, and experiment with improving model performance.
Try new things. Ship stuff.

What You Bring

Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related field (completed or in final year).
Strong fundamentals in data structures, algorithms, and software design.
Proficiency in Python, including experience with AI/ML libraries and frameworks (e.g., NumPy, pandas, SciPy, TensorFlow, PyTorch).
Experience with AI/ML through coursework, research, personal projects, full-time employment, or internships.
Familiarity with AI models, Transformers and Diffusers.
Experience with version control systems (e.g., Git) and agile development methodologies.
Excellent problem-solving skills, with the ability to debug and optimize code.
Strong communication and collaboration skills.
Curiosity, willingness to learn, and desire to build real systems.

Bonus

Exposure to C++, CUDA, or AI inference.
Contributions to open-source ML projects.

Why DeepInfra

Work on cutting-edge AI model serving: the systems that power the next generation of LLMs and multimodal models.
Small team, huge impact: your work ships directly to customers.
Opportunity to learn from engineers building high-performance inference at scale.
Fast-paced environment with ownership, autonomy, and end-to-end responsibility.