Multimodal Generative Modeling Research Engineer

Apple Inc

Santa Clara Valley (Cupertino), CA

Job posting number: #7150406 (Ref:apl-200477238)

Posted: May 15, 2023

Application Deadline: Open Until Filled

Job Description

Summary
Do you believe generative models can transform creative workflows and smart assistants used by billions? Do you believe they can fundamentally shift how people interact with devices and communicate?

We truly believe they can! We are looking for senior technical leaders experienced in architecting and deploying production-scale multimodal ML. An ideal candidate can lead diverse cross-functional efforts spanning ML modeling, prototyping, validation, and private learning. Solid ML fundamentals and the ability to place research contributions in the context of the state of the art are essential to the role, as is experience training and adapting large language models.

We are the Intelligence System Experience (ISE) team within Apple’s software organization. The team works at the intersection of multimodal machine learning and system experiences. System Experience (Springboard, Settings), Keyboards, Pencil & Paper, and Shortcuts are some of the experiences the team oversees. These experiences that our users enjoy are backed by production-scale ML workflows. Our multidisciplinary ML teams focus on visual understanding of people, text, handwriting, and scenes; multilingual NLP for writing workflows and knowledge extraction; behavioral modeling for proactive suggestions; and privacy-preserving learning.


SELECTED REFERENCES TO OUR TEAM’S WORK:
- https://machinelearning.apple.com/research/stable-diffusion-coreml-apple-silicon
- https://machinelearning.apple.com/research/on-device-scene-analysis
- https://machinelearning.apple.com/research/panoptic-segmentation
Key Qualifications
  • Hands-on experience training LLMs
  • Experience adapting pre-trained LLMs for downstream tasks & human alignment
  • Modeling experience at the intersection of NLP and vision
  • Familiarity with distributed training
  • Proficiency in an ML toolkit of choice, e.g., PyTorch
  • Strong programming skills in Python, C, and C++
Description
We are looking for a candidate with a proven track record in applied ML research. Responsibilities include training large-scale multimodal (2D/3D vision-language) models on distributed backends, efficiently deploying compact neural architectures on device, and learning policies that can be personalized to the user in a privacy-preserving manner. Ensuring quality in the wild, with an emphasis on fairness and model robustness, is an important part of the role. You will work closely with ML researchers, software engineers, and hardware & design teams across functions. The primary responsibilities center on enriching the multimodal capabilities of large language models. The user experience initiative focuses on aligning image and video content to the space of LMs for visual actions and multi-turn interactions.
Education & Experience
M.S. or PhD in Computer Science or a related field such as Electrical Engineering, Robotics, Statistics, or Applied Mathematics, or equivalent experience.
Pay & Benefits

Apply Now