(PhD) Senior Multimodal Algorithm Engineer

Job responsibilities

1. Track and implement recent algorithms in multimodal large-model understanding and generation, image and text information extraction, screen understanding, and other industry applications, and standardize best-practice processes for them;

2. Research new algorithm frameworks for these areas to solve core technical problems in the field;

3. Work closely with the R&D team to apply multimodal understanding and generation technology to real projects, in support of cutting-edge innovation goals or business goals.

Job requirements

1. Degree in computer science, electronic engineering, mathematics, or a related field;

2. In-depth research experience in at least one of the following directions: multimodal feature learning, object detection/grounding, OCR, image and video captioning, VQA, LLM training, RL, etc.;

3. Proficiency with mainstream deep learning frameworks such as PyTorch or TensorFlow;

4. A good foundation in algorithms and data structures, and a solid grounding in mathematics and statistics, including optimization, probability, linear algebra, and related areas;

5. Good teamwork and communication skills, and the ability to work independently within a team environment;

6. Proactive, with strong problem-solving and innovation abilities and a passion for cutting-edge technologies.

The working language is Mandarin.