ML/NLP Intern
We are looking for an ML/NLP Intern who is a Computer Science student (MS or PhD candidate) with deep knowledge of Natural Language Processing, including building NLP models using transformers and LLMs. We do not want to reinvent the wheel! We want to leverage LLMs that have already been built and explore further training of existing models. Our problem statement is to read in workshop transcription material and use it to fine-tune an open-source LLM. We are looking at using the Hugging Face or Databricks ecosystem, and the student should be well-versed in it. If you are an international student on an F-1 visa, we prefer candidates who are graduating in Spring 2025 (May timeframe) and can participate in the OPT program.
Responsibilities:
- Research both computerized and non-computerized medical and public health literature in major national and international databases and academic journals.
- Work with staff on researching and preparing scientific and technical briefings, presentations, and other reports in response to various requests.
- Perform research on social media sites (e.g., Twitter, Facebook, YouTube, TikTok) using quantitative and qualitative content analytics.
- Research and build connectors to the various social media sites using their open APIs.
- Should have expertise in working with mobile health care solutions, including using various APIs to extract health, GPS, mobility, and motion information from multiple apps.
- Should be able to extract, load, and transform the data sent from the mobile device to the server.
- Knowledge of building a mobile app for patient interaction.
- Use the information retrieved from social media sites to build an NLP model on top of existing LLM and transformer models from Hugging Face.
Requirements:
- Bachelor’s degree in Computer Science. Currently pursuing a Master’s or PhD in Computer Science (ML/AI).
- Certifications/licenses: Machine Learning and Natural Language Processing.
- Skilled in SQL, Adobe Acrobat, PowerPoint, Python, Excel, WebEx, Word, R, Linux, Git/GitHub, deep learning frameworks (PyTorch), data science packages (NumPy, Pandas), AWS, transformers (Hugging Face), chatbots and dialogue systems.
- Experience building chatbots from scratch using AI, ML, and NLP technologies.
- Knowledge of Big Data Engineering and ETL.
- Good time-management skills.
For the past eight years, CloudLeap Technologies, LLC has supported Federal Government customers by providing cutting-edge big data and analytics, machine learning, artificial intelligence, identity and access management services, and middle-tier application development. In addition to providing award-winning service, we build AI and ML products to solve our customers’ complex and challenging data problems. We leverage our relationships with our Cloud partners to provide these services and product development both on-premises and in the cloud. We are located in Baltimore, MD, and are a certified HUBZone small business.