Software Engineer
DeepInfra is looking for early-career Software Engineers (0-2 years of experience, including internships) to join our team. You’ll work closely with our experienced engineers to design, build, and scale infrastructure for serving top open-source AI models. This role is ideal for recent graduates or junior engineers who want to grow quickly while working on high-impact, real production AI systems.
If you’re excited about AI/ML, have taken related courses or built projects, and want to learn how to ship things at scale, we’d love to meet you.
What You’ll Do
- Collaborate with engineers to design, develop, and test inference solutions for state-of-the-art AI models.
- Implement, optimize, and evaluate AI models using Python, C++, CUDA, and NCCL (previous exposure is helpful; deep expertise is not required).
- Monitor and maintain production model-serving systems.
- Work on new features, fix bugs, and contribute to code reviews.
- Participate in daily standups, design reviews, and team discussions.
- Explore new AI/ML techniques and tools, and experiment with improving model performance.
- Try new things. Ship stuff.
What You Bring
- Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related field (completed or in final year).
- Strong fundamentals in data structures, algorithms, and software design.
- Proficiency in Python, including experience with AI/ML libraries and frameworks (e.g., NumPy, pandas, SciPy, TensorFlow, PyTorch).
- Experience with AI/ML through coursework, research, personal projects, full-time employment, or internships.
- Familiarity with AI models, Transformers, and Diffusers.
- Experience with version control systems (e.g., Git) and agile development methodologies.
- Excellent problem-solving skills, with the ability to debug and optimize code.
- Strong communication and collaboration skills.
- Curiosity, willingness to learn, and desire to build real systems.
Bonus
- Exposure to C++, CUDA, or AI inference.
- Contributions to open-source ML projects.
Why DeepInfra
- Work on cutting-edge AI model serving: the systems that power the next generation of LLMs and multimodal models.
- Small team, huge impact: your work ships directly to customers.
- Opportunity to learn from engineers building high-performance inference at scale.
- Fast-paced environment with ownership, autonomy, and end-to-end responsibility.