Deep Learning Engineer for Language Technologies (RE2)
Barcelona, TN Spain
Context and Mission
The Language Modeling Team at the newly created AI Institute hosted at BSC has consolidated experience in several NLP areas, such as massive language model building, biomedical text mining, and unsupervised learning for under-resourced languages and domains. It has been entrusted by the Spanish and Catalan governments with the mission of developing fundamental open-source resources and technologies for Spanish and Catalan. In connection with this, the LM team is currently in charge of two flagship projects at the national and regional levels: the ALIA project, funded by the Spanish Secretariat of Digitalisation and Artificial Intelligence, and the AINA project, aimed at developing AI resources for Catalan and funded by the Catalan Digitalisation Department. In addition, the Unit participates in various EU-funded international projects. In this context, the newly created AI Institute is looking for talent to help design and improve its operations and infrastructure.
The Language Modeling Team is looking for candidates with a background in computational linguistics and experience in Language Technologies, specifically in Deep Learning and large language model building, and possibly in other areas of Natural Language and Speech Processing. The successful candidate will work in a highly sophisticated HPC environment, have access to state-of-the-art systems and computational infrastructures, and establish collaborations with experts in different areas at the local and international levels. The researcher will implement innovative techniques for language modeling and evaluation in the HPC environment.
Key Duties
- Work closely with colleagues on the design and implementation of solutions required to achieve the group’s goals.
- Contribute to the pre-training and post-training of language models, ensuring high-quality and robust performance.
- Deploy models for inference in the most efficient way possible, supporting internal evaluation, synthetic data generation, and experimentation workflows.
- Optimize existing pipelines and workflows, maximizing cluster resource utilization and ensuring scalability.
- Design and propose new research projects, translating innovative ideas into experiments and scientific publications.
- Maintain high standards of code quality, documentation, and reproducibility, enabling the team to build upon previous work effectively.
Requirements
Education
- Master's degree in Computer Science, Mathematics, Machine Learning, or a related technical field.
Essential Knowledge and Professional Experience
- 5+ years of proven experience in Natural Language Processing projects.
- 5+ years of professional experience in Python development.
- Track record of publications at major AI venues.
- Experience with full training pipelines, not just fine-tuning.
- Advanced knowledge of High Performance Computing (HPC).
- Hands-on experience with one or more of the following tools: PyTorch, Megatron-LM, NeMo, Transformers, vLLM.
Additional Knowledge and Professional Experience
- Research literacy and the ability to read, reproduce, and extend state-of-the-art papers.
- Basic knowledge of mathematics and statistics applied to Deep Learning.
- Broad theoretical knowledge of Deep Learning techniques.
- Knowledge of HPC workload managers such as Slurm.
- Familiarity with monitoring tools such as Weights & Biases (W&B) or TensorBoard.
- Knowledge of Continuous Integration / Continuous Delivery (GitHub).
- Hands-on experience with containerization (Docker and/or Singularity).
- Basic knowledge of other programming languages such as C++, MATLAB and/or Java.
- Experience with machine learning libraries, including PyTorch, TensorFlow, pandas, scikit-learn and/or NumPy.
- Knowledge of GPU-based computing, including multi-GPU/multi-node parallelization techniques.
- Fluency in spoken and written Catalan, Spanish and English.
- Comfortable working in Linux environments.
Competences
- Capacity to explore new research lines.
- Good communication and presentation skills.
- Ability to collaborate effectively in team-based environments, including pair programming settings.
Conditions
- The position will be located at BSC within the Life Sciences Department.
- We offer a full-time contract, a good and highly stimulating working environment with state-of-the-art infrastructure, flexible working hours, an extensive training plan, restaurant tickets, private health insurance, and support with relocation procedures.
- Duration: Open-ended contract due to technical and scientific activities linked to the project and budget duration.
- Holidays: 22 days of holidays + 6 personal days + 24th and 31st of December, per our collective agreement.
- Salary: We offer a competitive salary commensurate with the qualifications and experience of the candidate and according to the cost of living in Barcelona.
- Starting date: 01/04/2026