Role Summary
This is an exciting opportunity to work at the interface between data management and generative AI. The project aims to develop new analytical infrastructure paradigms that support semantic querying and inference over heterogeneous and distributed data sources in the pharmaceutical domain.
The vision behind the project is to extend the state of the art towards infrastructures that allow the creation of data-based digital twins, where end users can run seamless natural language analytical queries over semantically integrated data sources, with an emphasis on relational and knowledge graph data sources.
Moreover, the project will develop a declarative/neuro-symbolic inference engine that realises this digital twin vision on top of the semantically integrated datasets. The project will systematically establish and evaluate the complementary analytical value of integrating the controlled use of LLMs with ontology/KG-based representations.
This project is developed in close collaboration with a research lab of a large pharmaceutical company, where the candidate will have the opportunity to work closely with data analysts, software engineers and domain experts, and to develop MVPs aimed at delivering a Self-Service Insights platform.
Your Profile
We are looking for a self-motivated candidate who can address complex problems.
Mandatory requirements:
* BSc and MSc in Computer Science or related areas.
* Fully comfortable developing complex systems in Python.
* Previous academic or industrial project experience in NLP or data management.
* The candidate is expected to be physically present at the pharmaceutical company on some days.
Desirable:
* Previously published papers.
* Previous exposure to knowledge graphs and ontologies.