Senior Machine Learning (ML) Data Engineer
Pfizer Inc.
Cambridge, MA
Job posting number: #7113812 (Ref:pf-4847261)
Posted: October 13, 2022
Application Deadline: Open Until Filled
Job Description
Pfizer Worldwide Research, Development, and Medical (WRDM) is expanding its work in applying Machine Learning (ML) and Artificial Intelligence (AI) technologies to Biomedical Research. This will enable Pfizer to enhance our drug discovery efforts, sustain our industry-leading R&D productivity, and deliver breakthrough medicines to the patients most in need. The centerpiece of this initiative is the establishment of an "ML Research Hub," a new group charged with mastering state-of-the-art machine learning techniques to create novel predictive models and tools used across WRDM. Data Engineering will be key to the success of this new group given the enormous scope of the chemical, biological, omics, and clinical data available at Pfizer. The WRDM Research Hub is seeking experienced data engineers with a background in machine learning and software engineering, strong technical problem-solving skills, and experience creating scalable data pipelines and infrastructure for training, validating, and deploying ML solutions into production for broad use.
Role Responsibilities
The successful candidate will help guide data engineering strategy and work with Research Unit and ML research scientists across WRDM to enable our proprietary data and external datasets to be leveraged for ML modeling. This will be accomplished by designing and implementing end-to-end data workflows for large-scale data ingestion, processing, tagging, and publishing, with an eye towards improving ML model performance over time. Key to success will be defining an internal portfolio of projects with collaborators, identifying and managing external vendors and contractors to support activities, managing junior FTEs, and collaborating with Pfizer Digital.
Qualifications
Master’s degree in Computer Science, Statistics, Applied Mathematics, Chemistry, Physics, a life science discipline, or related technical discipline
Demonstrated ability to develop strategy in the data engineering field, with the ability to lead an internal or external matrix team.
6+ years of programming experience in Python, Java, Scala, C++, or SQL.
6+ years of experience in software design, development, and algorithm-related solutions for production-grade systems using machine learning.
6+ years of experience managing code developed by multi-developer teams, following industry best practices
Deep knowledge of one or more scientific data types (e.g. biomedical images, biomedical text, large-scale, multidimensional 'omics, large- or small-molecule therapeutics, clinical or Real World Data, etc.)
Excellent written and oral communication skills
Preferred Qualifications
MS/PhD + 4 years of relevant research experience
Experience with high-performance computing (HPC) environments (SLURM/LSF/SGE schedulers)
Expertise with cloud computing infrastructure including Amazon Web Services (AWS) and distributed computing libraries (e.g. Spark, Hive, Impala, Kafka, etc.)
Expertise with containerization and orchestration tools (e.g. Docker, Singularity, Airflow, Luigi, Kubernetes, etc.)
Expertise with workflow languages (CWL, WDL, Nextflow, etc.)
Expertise with CI/CD and automation tools (Terraform, CloudFormation, Jenkins, Ansible, etc.)
Passion and curiosity for data and proven ability to take ideas from prototype to production.
Previous resource management experience (FTE, contractor)
Technologies We Use:
Python (NumPy, Pandas, Dask, PyTorch, TensorFlow, scikit-learn, RDKit, Weights & Biases, etc.), Java, C++, Slurm-based on-premises compute clusters, Google Cloud Platform, AWS, Docker, Singularity, Kubernetes
Other Job Details:
- Additional Location Information: Cambridge, MA; La Jolla, CA; and Groton, CT
- Eligible for Relocation Package
- Eligible for Employee Referral Bonus
Pfizer requires all U.S. new hires to be fully vaccinated for COVID-19 prior to the first date of employment. As required by applicable law, Pfizer will consider requests for Reasonable Accommodations.
Sunshine Act
Pfizer reports payments and other transfers of value to health care providers as required by federal and state transparency laws and implementing regulations. These laws and regulations require Pfizer to provide government agencies with information such as a health care provider’s name, address and the type of payments or other value received, generally for public disclosure. Subject to further legal review and statutory or regulatory clarification, which Pfizer intends to pursue, reimbursement of recruiting expenses for licensed physicians may constitute a reportable transfer of value under the federal transparency law commonly known as the Sunshine Act. Therefore, if you are a licensed physician who incurs recruiting expenses as a result of interviewing with Pfizer that we pay or reimburse, your name, address and the amount of payments made currently will be reported to the government. If you have questions regarding this matter, please do not hesitate to contact your Talent Acquisition representative.
EEO & Employment Eligibility
Pfizer is committed to equal opportunity in the terms and conditions of employment for all employees and job applicants without regard to race, color, religion, sex, sexual orientation, age, gender identity or gender expression, national origin, disability or veteran status. Pfizer also complies with all applicable national, state and local laws governing nondiscrimination in employment as well as work authorization and employment eligibility verification requirements of the Immigration and Nationality Act and IRCA. Pfizer is an E-Verify employer.