Coordinator: Rosa Figueiredo (LIA)
WP1 is a cornerstone of the project. It is dedicated to collecting, organizing, and analyzing
scientific publication records within the project’s disciplinary scope: Computer Science, Political
Science, Economics, and Sociology. The ultimate goal is to build a comprehensive open-source
database that serves as the foundation for the subsequent research in the project. WP1 spans the
entire duration of the project, covering not only the initial creation of the database but also its
evolution and adaptation over time, with provisions for re-engineering as necessary based on our
experience. All tasks within WP1 will involve Master’s apprentices (Master’s students in work-study
programs), recruited as part of the project.
The establishment of WP1 is driven by the imperative to mitigate risks inherent in data
collection and structuring. This imperative is underscored by the experiences of certain members
within our consortium who participated in the ANR project DeCoMaP, an interdisciplinary project
focused on Corruption Detection in Public Procurement Markets. The primary methodology employed
in DeCoMaP involved a network extraction step and the development of pattern recognition methods,
which required preliminary data collection and the establishment of a structured database. All these
tasks were assigned to the same work package and incorporated into a PhD thesis, as the risks
associated with data collection were considered low. However, this phase proved to be extremely
time-consuming, necessitating the allocation of additional resources for necessary re-engineering and
expansion of the database to prevent any impact on the development of new network analysis methods.
The first two tasks of WP1 focus on data collection and author name disambiguation, essential
for establishing a complete and structured database, which ensures the reliability of the results
obtained in the final statistical analysis task. All tasks will be carried out mainly by LIA and ERIC,
since they require expertise in collecting, structuring, and analyzing the data. Task 1.1 (Data
Collection) will benefit from the participation of JPEG to gain in-depth knowledge of the standards
in all disciplines considered in the project.
Deliverables:
- Statistical analysis report.
- Open-source database of scientific publication records.
- Scripts for database creation and replication in other research areas.