Post Masters Research Associate - Data Science & Machine Intelligence

Overview

At PNNL, our core capabilities are divided among major departments within the Lab that we refer to as Directorates. Each focuses on a specific area of scientific research or other function and has its own leadership team and dedicated budget.

Our Science & Technology directorates include National Security, Earth and Biological Sciences, Physical and Computational Sciences, and Energy and Environment. In addition, we have the Environmental Molecular Sciences Laboratory, a Department of Energy Office of Science user facility housed on the PNNL campus.

The Advanced Computing, Mathematics, and Data Division (ACMDD) focuses on basic and applied computing research encompassing artificial intelligence, applied mathematics, computing technologies, and data and computational engineering. Our scientists and engineers apply end-to-end co-design principles to advance future energy-efficient computing systems and design the next generation of algorithms to analyze, model, understand, and control the behavior of complex systems in science, energy, and national security. 

Responsibilities

The Data Science and Machine Intelligence Group is seeking a highly motivated post-Masters Research Associate with a strong background in Natural Language Processing (NLP) to join our dynamic team. The research will entail working closely with other researchers on the design, development, and application of NLP methods to solve challenging domain problems. In particular, the focus will be on training and evaluating large language models (LLMs) and creating LLM-based agents and multi-agent systems. Preferred skills include familiarity with and prior experience in data science, high-performance computing, high-level languages such as Python, and AI/ML libraries such as LangChain, LangGraph, PyTorch, and TensorFlow.

  • Application and training/evaluation of large language models.
  • Creating LLM Agents and agentic systems.
  • Natural language processing and prompt engineering.
  • General data science, data curation and data processing.
  • Applying mathematical principles to identify trends in data sets.


Qualifications
Minimum Qualifications:

  • Candidates must have received a Master’s degree within the past 24 months or expect to receive one within the next 8 months from an accredited college or university.

Preferred Qualifications:

  • Master’s degree in Computer Science, Data Science, Applied Mathematics, or a closely related field, with strong emphasis on Natural Language Processing (NLP) and Machine Learning.
  • Strong record of peer-reviewed publications in Natural Language Processing (NLP), Large Language Models (LLMs), or related fields, demonstrating thought leadership and contributions to the research community.
  • Advanced proficiency in high-level programming, particularly Python, with hands-on experience using leading NLP and ML libraries (e.g., Hugging Face, LangChain) and modern software practices such as containerization (Docker).
  • Proven ability to design, train, and deploy models using cloud platforms (AWS, Azure) and machine learning frameworks (e.g., PyTorch, TensorFlow).
  • Demonstrated success in advancing scientific discovery through research outputs such as publications, open-source software, or innovative applied solutions.
  • Prior experience working with large unstructured datasets, including text and images extracted from PDFs, and developing Retrieval-Augmented Generation (RAG) pipelines is highly desirable.