Blog

CONV-TO-BENCH: Evaluating Language Models via User–Assistant Dialogues in Code Tasks

Workshop Paper · April 2026

CONV-TO-BENCH creates evaluation benchmarks from raw user–assistant interaction logs in code tasks, enabling model assessment on realistic multi-turn dialogue data. Read more

Partial Reasoning in Language Models: Search and Refinement Guided by Uncertainty

Workshop Paper · January 2026

We propose partial reasoning in language models guided by uncertainty, using search and refinement to selectively truncate chain-of-thought while preserving performance. Read more

Do Reasoning Models Ask Better Questions? A Formal Information-Theoretic Analysis on Multi-Turn LLM Games

Workshop Paper · January 2026

A formal information-theoretic analysis of multi-turn LLM games that evaluates whether reasoning models ask better questions than standard models. Read more

Learning Without Critics? Revisiting GRPO in Classical Reinforcement Learning Environments

Workshop Paper · November 2025

A theoretical and empirical study revisiting GRPO in classical RL environments, comparing it against PPO and analyzing the role of the critic. Read more
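
For readers unfamiliar with why GRPO can drop the critic, the sketch below shows the group-relative advantage as it is commonly described: each sampled return is normalized against the mean and standard deviation of its own group, so no learned value function is needed as a baseline. The function name, the `eps` term, and the use of the population standard deviation are assumptions for illustration, not details taken from the paper.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group (hypothetical helper).

    The group mean plays the role a critic's value estimate would play
    in PPO. `eps` avoids division by zero when all rewards are equal.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; an assumption
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Three rollouts from the same start state, with different returns.
    print(group_relative_advantages([1.0, 2.0, 3.0]))
```

Rollouts that beat the group average receive a positive advantage and the rest a negative one; a group with identical returns produces no learning signal at all.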

Personalizing Fairness: Adaptive RL with User Diversity Preference for Recommender Systems

Workshop Paper · August 2025

We propose an adaptive RL approach for recommender systems that personalizes fairness based on user diversity preferences. Read more

Reinforcement Learning for Debt Pricing: A Case Study in Financial Services

Workshop on RL4RS @ RLC 2025 · June 2025

Offline RL with LTV-based rewards and bandit orchestration at a large financial institution improved collection values. Read more

Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning

ICML 2025 Poster · May 2025

SPGym extends the 8-tile puzzle to evaluate RL agents by scaling representation learning complexity while keeping environment dynamics fixed, revealing opportunities for advancing representation learning for decision-making research. Read more

InfoQuest: Evaluating Multi-Turn Dialogue Agents for Open-Ended Conversations with Hidden Context

Workshop on SSI-FM @ ICLR 2025 · March 2025

A benchmark for evaluating how LLMs handle ambiguous open-ended requests through dialogue, revealing that current models struggle to ask effective clarifying questions. Read more

Benchmarking Open-Source LLMs as Model Evaluators

Research Paper · October 2024

We evaluate open-source LLMs against proprietary models using benchmarks for instruction adherence and positional bias, finding that open models are closing the performance gap with GPT-4, though GPT-4 still leads in overall consistency and fairness. Read more

Data-Driven Debt Pricing: A Systematic Literature Review

Research Paper · February 2023

This review explores the potential of machine learning in debt pricing, with a focus on reinforcement learning. It concludes that more research is needed and highlights issues with reproducibility and comparability of results. Read more

PulseRL: Enabling Offline Reinforcement Learning for Digital Marketing Systems via Conservative Q-Learning

Workshop on Offline RL @ NeurIPS 2021 · October 2021

PulseRL is an offline reinforcement learning system for optimizing communication channels in Digital Marketing Systems (DMS) using Conservative Q-Learning (CQL). It learns from historical data, avoiding costly interactions, and reduces bias from out-of-distribution actions. PulseRL outperformed RL baselines in real-world DMS experiments, proving its effectiveness at scale. Read more

Multiagent Soccer Environment for Python

Reinforcement Learning Environment · September 2021

A pre-compiled Soccer-Twos environment with multi-agent Gym-compatible wrappers and a human-friendly visualizer. Built on top of Unity ML-Agents for use as the final assignment of the Reinforcement Learning Minicourse at CEIA / Deep Learning Brazil. Read more

Cellular Automata Framework

Project · March 2021

A Cellular Automata program built with C++, OpenGL, CUDA and OpenMP. The main objective of this project is to scale up to a reasonably large number of cells while keeping the code legible and open to further customisation. Read more

Intrinsic motivation for robotic manipulation learning with sparse rewards

Undergraduate Thesis · December 2019

A study of the impact of curiosity and intrinsic motivation as an exploration strategy for deep reinforcement learning agents in sparse-reward robotic manipulation environments. Read more

Bone Age Regression

Deep Learning · November 2019

This is my code for the I2A2 Bone Age Regression competition. I learned a lot by building this pipeline from scratch and experimenting with different model architectures and optimizers. This was my first end-to-end image regression model, and it was very nice seeing my theoretical knowledge work in practice. Read more

Quack

Game · January 2019

Quack is a Unity3D game made for the Global Game Jam 2019, whose theme was "What home means to you?". You play a happy chicken that wants to build a new home for its children: collect sticks and group them on top of the main tree to make a lovely nest. The game was developed within 12 hours. Read more

3D Rendering & Force Simulator

Rendering · December 2018

A 3D force simulator built using only Processing's point() and line() functions. It uses a Digital Differential Analyzer (DDA) to render lines between two points, scan-line filling to render polygons, normal calculation to determine which faces to render in 3D space, and Newtonian physics. Written in Java. Read more
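
As a rough illustration of the DDA step mentioned above, here is a minimal Python sketch (the project itself is written in Java, and this is not its code). It assumes integer endpoints and returns the pixels to plot rather than drawing them; the function name `dda_line` is hypothetical.

```python
import math


def dda_line(x0, y0, x1, y1):
    """Return the pixels on the segment (x0, y0)-(x1, y1) using DDA.

    Assumes integer endpoints. The line is sampled once per unit step
    along its longer axis, and each sample is rounded to the nearest
    pixel (halves round up).
    """
    dx, dy = x1 - x0, y1 - y0
    steps = max(abs(dx), abs(dy))
    if steps == 0:
        return [(x0, y0)]  # degenerate segment: a single point
    x_inc, y_inc = dx / steps, dy / steps
    x, y = float(x0), float(y0)
    pixels = []
    for _ in range(steps + 1):
        pixels.append((math.floor(x + 0.5), math.floor(y + 0.5)))
        x += x_inc
        y += y_inc
    return pixels


if __name__ == "__main__":
    print(dda_line(0, 0, 4, 2))
```

Each returned pair would then be passed to a point-plotting call, which is how a line can be drawn from single-pixel primitives.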

IEEE VSSS Team

Robotics · October 2018

A software stack covering image processing, computer vision, team coordination, navigation, control and communication, built to compete in the 2018 Latin American Robotics Competition for the Pequi Mecânico UFG - INF team. Read more

Die Zombit

Game · June 2015

Reviving the classics of the '80s and '90s, Die Zombit is a retrowave top-down shooter with a striking soundtrack and addictive gameplay that guarantee many hours of fun. Read more