Juan A. Rodriguez

Artificial Intelligence Researcher

ServiceNow Research

Mila, Quebec AI Institute

École de Technologie Supérieure, University of Quebec

Montreal, Quebec, Canada

Hi, I am Juan Rodriguez. You can also call me Joan (in Catalan).

I am a Researcher at ServiceNow Research and a PhD student at Mila and École de Technologie Supérieure, University of Quebec. I am based in Montreal, Canada, but I am from Barcelona, Spain. I am advised by Prof. Marco Pedersoli, Prof. Chris Pal, and Dr. David Vazquez.

My research interests are at the intersection of Computer Vision and Natural Language Processing, with a focus on multimodal generative models. I am interested in learning how to leverage information from different modalities to generate more accurate and controllable outputs in all modalities. Recently, I have been working with large language and vision models using transformers and diffusion. I have also been exploring code generation as an alternative approach to generating images (e.g., scalable vector graphics).

I obtained an M.Sc. in Computer Vision from Universitat Autònoma de Barcelona (UAB), with honors for my master's thesis, Text to Scientific Figure Generation, carried out at ServiceNow Research and advised by Dr. David Vazquez and Dr. Pau Rodríguez. Previously, I obtained a B.Sc. in Telecommunication Networks Engineering from Universitat Pompeu Fabra (UPF), where I wrote my bachelor's thesis on Handwritten Text Recognition, advised by Prof. Xavier Binefa. I did research internships at UPF, advised by Prof. Xavier Binefa and Prof. Miquel Oliver, and at CVC-UAB, advised by Prof. Joost van de Weijer.

Publications

  1. StarVector: Generating Scalable Vector Graphics Code from Images and Text
    Juan Rodriguez, Abhay Puri, Shubham Agarwal, Sai Rajeswar, Issam H. Laradji, Pau Rodriguez, David Vazquez, Christopher Pal, and Marco Pedersoli
    In preprint, 2024
  2. BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks
    Juan Rodriguez, Xiangru Jian, Siba Smarak Panigrahi, Tianyu Zhang, Aarash Feizi, Abhay Puri, Akshay Kalkunte, François Savard, Ahmed Masry, Shravan Nayak, and 33 more authors
    2024
  3. OCR-VQGAN: Taming text-within-image generation
    Juan Rodriguez, David Vazquez, Issam Laradji, Marco Pedersoli, and Pau Rodriguez
    In WACV 2023 (Oral), 2023
  4. FigGen: Text to Scientific Figure Generation
    Juan Rodriguez, David Vazquez, Issam Laradji, Marco Pedersoli, and Pau Rodriguez
    In ICLR 2023 (Tiny paper track), 2023
  5. Too Big to Fool: Resisting Deception in Language Models
    Mohammad Reza Samsami, Mats Leon Richter, Juan Rodriguez, Megh Thakkar, Sarath Chandar, and Maxime Gasse
    In preprint, 2024
  6. InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation
    Gaurav Sahu, Abhay Puri, Juan Rodriguez, Amirhossein Abaskohi, Mohammad Chegini, Alexandre Drouin, Perouz Taslakian, Valentina Zantedeschi, Alexandre Lacoste, David Vazquez, and 4 more authors
    2024
  7. IntentGPT: Few-shot Intent Discovery with Large Language Models
    Juan Rodriguez, Nicholas Botzer, David Vazquez, Christopher Pal, Marco Pedersoli, and Issam Laradji
    2024