CPU Workload Performance Optimization Engineer

We are seeking a CPU Workload Performance Optimization Engineer to lead the characterization, analysis, and optimization of CPU workloads for Tenstorrent’s cutting-edge processor products. In this role, you will work closely with architects, hardware designers, and software engineers to analyze CPU applications, enhance compilers and runtimes, and deliver workload performance optimizations. Your contributions will directly shape the design and implementation of next-generation high-performance computing platforms across a diverse set of workloads.

This position can be based in the U.S. or Canada, either in one of Tenstorrent’s offices or remotely.

Key Responsibilities

  • Conduct competitive analysis to evaluate the strengths and weaknesses of compilers and runtimes for key workloads.

     
  • Analyze binary disassemblies and instruction traces to identify inefficiencies in RISC-V compiler and/or runtime optimizations.

     
  • Propose and prototype new performance optimization features in RISC-V compilers and/or runtimes.

     
  • Optimize key workload performance by fine-tuning compiler flags and runtime configurations.

     
  • Develop handwritten kernels using intrinsics or assembly to improve performance on existing hardware.

     
  • Build and enhance open-source tools to automate binary code quality checks or instrument binaries for performance analysis.

     
  • Publish performance tuning guidelines and best practices for internal teams, external developers, and customers.

     
  • Stay up to date with industry trends, emerging workloads, and advancements in compiler optimization techniques.

Qualifications

  • Ph.D. in Computer Engineering, Electrical Engineering, or a related field.

     
  • Strong research background in static or dynamic compilation techniques, focusing on middle-end and/or back-end optimization.

     
  • Deep expertise in GCC, LLVM, or JIT compiler design, development, and optimization.

     
  • Extensive experience diagnosing and mitigating workload performance bottlenecks.

     
  • Solid background in developing handwritten kernels using intrinsics or assembly.

     
  • Strong understanding of CPU microarchitecture, including superscalar pipelines, speculative execution, SIMD, and memory hierarchy.

     
  • In-depth knowledge of operating system internals and GNU libraries.

     
  • Proficiency in C/C++, intrinsics and assembly programming, and scripting languages such as Python and shell.

     
  • Excellent problem-solving and communication skills, with the ability to work across multidisciplinary teams.

Bonus Qualifications

  • Experience with compute library kernel development.

     
  • Knowledge of vector-length agnostic programming.

     
  • Experience with binary instrumentation or binary translation.

     
  • Expertise in memory management and data layout optimization.