Generative AI methods to model the Printing-and-Digitalization process

ABG-127858 Master 2 / engineering internship, 6 months, 4.35 euros per hour
10/01/2025
Université Lumière Lyon 2
Lyon Auvergne-Rhône-Alpes France
  • Computer Science
Generative learning, image processing, printed documents
31/01/2025

Recruiting institution

Founded in 1973, Université Lumière Lyon 2 welcomes nearly 30,000 students on its two campuses, ranging from undergraduate to doctoral level.

As a university of literature, languages, and human and social sciences, it is comprised of 13 teaching units spread over four main areas of teaching and research.

With 33 laboratories and four research federations, which cover the areas of literature, languages, and human and social sciences (LLSHS – Lettres, Langues, Sciences Humaines et Sociales), Université Lumière Lyon 2 bases its approach on innovation, interdisciplinarity, partnership and an international outlook.

Through the projects developed and coordinated by its 1000 researchers, the university seeks to enable communication and discussion between the human and social sciences, on the one hand, and the hard sciences, on the other, and to put research at the centre of current societal and scientific challenges.

Université Lumière Lyon 2 has a strong focus on international cooperation and currently has agreements with 350 institutions throughout the world. International students, whether part of an exchange programme or otherwise, account for more than 15% of the overall student body.
 

 

Description

Context of the study

Due to the development and broad availability of high-quality printing and scanning devices, the number of forged or counterfeit products and documents is increasing dramatically.

Therefore, new solutions for protecting documents and valuable products are continuously being researched. One promising and accessible solution is the use of security printing.

When an electronic document is printed and scanned one or several times, a slightly different image of the document is obtained each time, due to the optical characteristics of the capture devices. This information loss principle can be used to identify the devices used to produce the hard copy of the document.

In this case, authentication and identification systems take full advantage of the Printing-and-Digitalization (PD) process. At present, a large quantity of samples must be printed to train authentication or forgery detectors. However, constructing such a dataset is very expensive and time-consuming, and the data collection requires dedicated personnel and very strict procedures. We are therefore looking for solutions that can learn a surrogate model of the PD process and help create a large public dataset of printed documents.

This internship is part of the ANR project TRUSTIT (Theoretical and practical study of physical object security in real world use cases), which aims to explore the potential of deep learning methods for secure printing from the verifier's point of view.

Description of the subject

Generative adversarial networks (GANs) and probabilistic latent diffusion models have recently shown their efficiency in data generation and style transfer [1,2]. During this internship, we will work on learning a surrogate representation of the degradations that the Printing-and-Digitalization (PD) process adds to printed documents. The main tasks of this internship are:

1) To learn a surrogate representation of one printer-scanner pair using the existing large dataset L3iTextCopies [3].

2) To experiment with different architectures of GANs and probabilistic diffusion approaches to identify the best method for our task.

3) To compare the pseudo-synthetic samples with real printed documents using commonly used metrics such as Pearson correlation, mean squared error (MSE) and Fréchet Inception Distance (FID) between the datasets [4].

4) To evaluate the possibility of fine-tuning the proposed models for unseen pairs of printer and scanner.

5) To create a public synthetic dataset of printed documents and, if possible, to publish the results at an international conference or in a scientific journal.
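
As an illustration of the comparison in task 3, the metrics could be sketched as follows. This is a minimal sketch under stated assumptions, not the project's evaluation code: the function names are hypothetical, images are assumed to be NumPy arrays of identical shape, and the Fréchet distance is computed on generic feature vectors. For a real FID, those features would have to come from an Inception network, which is not included here.

```python
import numpy as np


def mse(a: np.ndarray, b: np.ndarray) -> float:
    """Mean squared error between two images of the same shape."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    return float(np.mean((a - b) ** 2))


def pearson(a: np.ndarray, b: np.ndarray) -> float:
    """Pearson correlation between the flattened pixel values of two images."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    return float(np.corrcoef(a, b)[0, 1])


def frechet_distance(feats_real: np.ndarray, feats_fake: np.ndarray) -> float:
    """Fréchet distance between Gaussians fitted to two feature sets.

    Each input has shape (n_samples, n_features). The features are an
    assumption of this sketch; FID uses Inception activations.
    """
    mu1 = feats_real.mean(axis=0)
    mu2 = feats_fake.mean(axis=0)
    s1 = np.atleast_2d(np.cov(feats_real, rowvar=False))
    s2 = np.atleast_2d(np.cov(feats_fake, rowvar=False))
    # tr((S1 S2)^(1/2)) is the sum of the square roots of the eigenvalues
    # of S1 S2, which are real and non-negative for covariance matrices.
    eigvals = np.linalg.eigvals(s1 @ s2).real
    tr_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()
    mean_term = np.sum((mu1 - mu2) ** 2)
    return float(mean_term + np.trace(s1) + np.trace(s2) - 2.0 * tr_sqrt)


# Toy data standing in for a real scan and a generated sample.
rng = np.random.default_rng(0)
real = rng.random((64, 64))
synthetic = np.clip(real + 0.05 * rng.standard_normal((64, 64)), 0.0, 1.0)
print("MSE:", mse(real, synthetic))
print("Pearson:", pearson(real, synthetic))

# Toy feature sets standing in for network activations.
feats = rng.standard_normal((200, 4))
print("Frechet:", frechet_distance(feats, feats + 1.0))
```

MSE and Pearson correlation compare one generated sample with its real counterpart pixel by pixel, whereas the Fréchet distance compares two whole sets of samples through the mean and covariance of their features.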

Internship location and allowance

The internship will take place at the LIRIS laboratory (Laboratoire d’Informatique en Image et Systèmes d’information), on the Bron campus of Université Lumière Lyon 2. The internship allowance is 4.35 euros per hour.

References

[1] R. Yadav, I. Tkachenko, A. Trémeau, T. Fournel, Copy Sensitive Graphical Code Estimation: Physical vs Numerical Resolution, IEEE WIFS 2019.

[2] R. Ratajczak, C. Crispim-Junior, et al., Pseudo-Cyclic Network for Unsupervised Colorization with Handcrafted Translation and Output Spatial Pyramids, SUMAC at ACM Multimedia 2019.

[3] S. Eskenazi, P. Gomez-Krämer, J-M. Ogier, A Study of the Factors Influencing OCR Stability for Hybrid Security, IWCDF at ICDAR 2017, pp. 3-8.

[4] Y. Belousov, B. Pulfer, R. Chaban, J. Tutt, O. Taran, T. Holotyak, S. Voloshynovskiy, Digital twins of physical printing-imaging channel, IEEE WIFS 2022.

Profile

  • The candidate must currently be enrolled in a Master 2 program or in the final year of engineering school (equivalent to Bac+5 in France) in Computer Science.
  • Programming languages: Python.
  • Libraries for image analysis and processing: OpenCV, scikit-image (Python).
  • Machine learning frameworks: scikit-learn, PyTorch.
  • Scientific knowledge: signal processing, image analysis, machine learning and deep learning.
  • Knowledge of multimedia security will be considered a plus.
  • Languages: French or English.

Starting date

03/03/2025