Postdoctoral Appointee - Large Scale Data Management and Storage for HPC/AI
Job posting number: #7098599 (Ref:413015)
Posted: April 14, 2022
Application Deadline: Open Until Filled
The Exascale Computing Project (ECP) is working closely with large-scale scientific applications that are increasingly being driven by scalable deep learning (e.g., CANDLE – Cancer Deep Learning Environment) running on the largest supercomputers in the world. In this context, we develop efficient techniques to capture, manipulate and persist large amounts of data in a consistent and resilient fashion (some of which are illustrated by the VELOC project, a low-overhead checkpointing system).
Currently, we are exploring a new data model centered around the notion of data states, which are intermediate representations of datasets automatically recorded into a lineage when tagged by applications with hints, constraints and persistency semantics. Such an approach enables the applications to focus on the meaning and properties of their data rather than how to access it, effectively reducing complexity while unlocking high performance and scalability for many use cases: finding and reusing previous intermediate results to explore alternatives, inspecting the evolution of datasets, verifying correctness, etc. This is especially important in the context of deep learning, where there is an acute need for advanced tools that explore many alternative DNN models and/or ensembles to improve accuracy, training speed and ability to generalize/explain a problem.
In addition to addressing such transformative challenges that arise at the intersection of HPC, big data analytics and machine learning, you will have the opportunity to work closely with many domain experts to identify the requirements and bottlenecks of real-life scientific applications that address the needs of our society over the next decades. In general, you will be part of a vibrant and diverse research community from more than 100 countries. Our lab hosts Aurora, one of the first Exascale supercomputers in the world, which you will have an opportunity to use for your experiments. In addition, you will have access to a large array of leading-edge experimental testbeds through the Joint Laboratory for System Evaluation (JLSE), which feature the latest technologies from top vendors like Intel, NVIDIA, AMD, etc.
- A recent or soon-to-be completed PhD degree (typically within the last three years)
- Familiarity with large-scale deep learning techniques: data, model and pipeline parallelism.
- Ability to conduct interdisciplinary research at the intersection of HPC and deep learning and participate in teamwork and broad collaborative efforts involving other laboratories and universities, supercomputer centers and industry.
- Scientific background in distributed computing and HPC including:
  - Strong code development skills with C/C++ and Python
  - Familiarity with modern data management and I/O best practices
  - Familiarity with machine/deep learning
Job Family: Postdoctoral Family
Job Profile: Postdoctoral Appointee
Worker Type: Long-Term (Fixed Term)
Time Type: Full time
As an equal employment opportunity and affirmative action employer, and in accordance with our core values of impact, safety, respect, integrity and teamwork, Argonne National Laboratory is committed to a diverse and inclusive workplace that fosters collaborative scientific discovery and innovation. In support of this commitment, Argonne encourages minorities, women, veterans and individuals with disabilities to apply for employment. Argonne considers all qualified applicants for employment without regard to age, ancestry, citizenship status, color, disability, gender, gender identity, gender expression, genetic information, marital status, national origin, pregnancy, race, religion, sexual orientation, veteran status or any other characteristic protected by law.
Argonne employees, and certain guest researchers and contractors, are subject to particular restrictions related to participation in Foreign Government Sponsored or Affiliated Activities, as defined and detailed in United States Department of Energy Order 486.1A. You will be asked to disclose any such participation in the application phase for review by Argonne's Legal Department.
All Argonne offers of employment are contingent upon a background check that includes an assessment of criminal conviction history conducted on an individualized and case-by-case basis. Please be advised that Argonne positions require upon hire (or may require in the future) that the individual obtain a government access authorization that involves additional background check requirements. Failure to obtain or maintain such government access authorization could result in the withdrawal of a job offer or future termination of employment.
Please note that all Argonne employees are required to be vaccinated against COVID-19. All successful applicants will be required to provide their COVID-19 vaccination verification as a condition of employment, subject to limited legally recognized exemptions to COVID-19 vaccination.
Argonne is an equal opportunity employer, and we value diversity in our workforce. As an equal employment opportunity and affirmative action employer, Argonne National Laboratory is committed to a diverse and inclusive workplace that fosters collaborative scientific discovery and innovation. In support of this commitment, Argonne prohibits discrimination or harassment based on an individual's age, ancestry, citizenship status, color, disability, gender, gender identity, genetic information, marital status, national origin, pregnancy, race, religion, sexual orientation, veteran status or any other characteristic protected by law.