publications

In reverse chronological order.

2023

  1. Neural networks and the Chomsky hierarchy
    Grégoire Delétang, Anian Ruoss, Jordi Grau-Moya, Tim Genewein, Li Kevin Wenliang, Elliot Catt, Chris Cundy, Marcus Hutter, Shane Legg, Joel Veness, and Pedro A Ortega
    In International Conference on Learning Representations (ICLR), 2023

2022

  1. Your policy regularizer is secretly an adversary
    Rob Brekelmans, Tim Genewein, Jordi Grau-Moya, Grégoire Delétang, Markus Kunesch, Shane Legg, and Pedro Ortega
    arXiv preprint arXiv:2203.12592, 2022
  2. Beyond Bayes-optimality: meta-learning what you know you don’t know
    Jordi Grau-Moya, Grégoire Delétang, Markus Kunesch, Tim Genewein, Elliot Catt, Kevin Li, Anian Ruoss, Chris Cundy, Joel Veness, Jane Wang, and others
    arXiv preprint arXiv:2209.15618, 2022

2021

  1. From Poincaré recurrence to convergence in imperfect information games: Finding equilibrium via regularization
    Julien Perolat, Remi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro Ortega, Neil Burch, Thomas Anthony, David Balduzzi, Bart De Vylder, and others
    In International Conference on Machine Learning, 2021
  2. Agent incentives: A causal perspective
    Tom Everitt, Ryan Carey, Eric D Langlois, Pedro A Ortega, and Shane Legg
    In Proceedings of the AAAI Conference on Artificial Intelligence, 2021
  3. Causal analysis of agent behavior for AI safety
    Grégoire Delétang, Jordi Grau-Moya, Miljan Martic, Tim Genewein, Tom McGrath, Vladimir Mikulik, Markus Kunesch, Shane Legg, and Pedro A Ortega
    arXiv preprint arXiv:2103.03938, 2021
  4. Shaking the foundations: delusions in sequence models for interaction and control
    Pedro A Ortega, Markus Kunesch, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Joel Veness, Jonas Buchli, Jonas Degrave, Bilal Piot, Julien Perolat, and others
    arXiv preprint arXiv:2110.10819, 2021
  5. Model-free risk-sensitive reinforcement learning
    Grégoire Delétang, Jordi Grau-Moya, Markus Kunesch, Tim Genewein, Rob Brekelmans, Shane Legg, and Pedro A Ortega
    arXiv preprint arXiv:2111.02907, 2021
  6. Stochastic Approximation of Gaussian Free Energy for Risk-Sensitive Reinforcement Learning
    Grégoire Delétang, Jordi Grau-Moya, Markus Kunesch, Tim Genewein, Rob Brekelmans, Shane Legg, and Pedro A Ortega
    2021

2020

  1. Action and perception as divergence minimization
    Danijar Hafner, Pedro A Ortega, Jimmy Ba, Thomas Parr, Karl Friston, and Nicolas Heess
    arXiv preprint arXiv:2009.01791, 2020
  2. Meta-trained agents implement Bayes-optimal agents
    Vladimir Mikulik, Grégoire Delétang, Tom McGrath, Tim Genewein, Miljan Martic, Shane Legg, and Pedro Ortega
    Advances in Neural Information Processing Systems, 2020
  3. Algorithms for causal reasoning in probability trees
    Tim Genewein, Tom McGrath, Grégoire Delétang, Vladimir Mikulik, Miljan Martic, Shane Legg, and Pedro A Ortega
    arXiv preprint arXiv:2010.12237, 2020

2019

  1. Bayesian optimistic Kullback–Leibler exploration
    Kanghoon Lee, Geon-Hyeong Kim, Pedro Ortega, Daniel D Lee, and Kee-Eung Kim
    Machine Learning, 2019
  2. Causal reasoning from meta-reinforcement learning
    Ishita Dasgupta, Jane Wang, Silvia Chiappa, Jovana Mitrovic, Pedro Ortega, David Raposo, Edward Hughes, Peter Battaglia, Matthew Botvinick, and Zeb Kurth-Nelson
    arXiv preprint arXiv:1901.08162, 2019
  3. Understanding agent incentives using causal influence diagrams. Part I: Single action settings
    Tom Everitt, Pedro A Ortega, Elizabeth Barnes, and Shane Legg
    arXiv preprint arXiv:1902.09980, 2019
  4. Meta-learning of sequential strategies
    Pedro A Ortega, Jane X Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, and others
    arXiv preprint arXiv:1905.03030, 2019
  5. Meta reinforcement learning as task inference
    Jan Humplik, Alexandre Galashov, Leonard Hasenclever, Pedro A Ortega, Yee Whye Teh, and Nicolas Heess
    arXiv preprint arXiv:1905.06424, 2019
  6. Social influence as intrinsic motivation for multi-agent deep reinforcement learning
    Natasha Jaques, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro Ortega, DJ Strouse, Joel Z Leibo, and Nando De Freitas
    In International Conference on Machine Learning, 2019
  7. Meta-reinforcement learning of causal strategies
    Ishita Dasgupta, Zeb Kurth-Nelson, Silvia Chiappa, Jovana Mitrovic, Pedro Ortega, Edward Hughes, Matthew Botvinick, and Jane Wang
    In The Meta-Learning Workshop at the Neural Information Processing Systems (NeurIPS), 2019

2018

  1. Modeling friends and foes
    Pedro A Ortega and Shane Legg
    arXiv preprint arXiv:1807.00196, 2018

2017

  1. AI safety gridworlds
    Jan Leike, Miljan Martic, Victoria Krakovna, Pedro A Ortega, Tom Everitt, Andrew Lefrancq, Laurent Orseau, and Shane Legg
    arXiv preprint arXiv:1711.09883, 2017

2016

  1. Memory shapes time perception and intertemporal choices
    Pedro A Ortega and Naftali Tishby
    arXiv preprint arXiv:1604.05129, 2016
  2. Decision-making under ambiguity is modulated by visual framing, but not by motor vs. non-motor context. Experiments and an information-theoretic ambiguity model
    Jordi Grau-Moya, Pedro A Ortega, and Daniel A Braun
    PLoS ONE, 2016
  3. Bayesian Reinforcement Learning with Behavioral Feedback
    Teakgyu Hong, Jongmin Lee, Kee-Eung Kim, Pedro A Ortega, and Daniel D Lee
    In IJCAI, 2016
  4. Human decision-making under limited time
    Pedro A Ortega and Alan A Stocker
    Advances in Neural Information Processing Systems, 2016

2015

  1. Belief flows for robust online learning
    Pedro A Ortega, Koby Crammer, and Daniel D Lee
    In 2015 Information Theory and Applications Workshop (ITA), 2015
  2. Reactive bandits with attitude
    Pedro Ortega, Kee-Eung Kim, and Daniel Lee
    In Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2015
  3. Causal reasoning in a prediction task with hidden causes
    Pedro A Ortega, Daniel D Lee, and Alan A Stocker
    In CogSci, 2015
  4. What is epistemic value in free energy models of learning and acting? A bounded rationality perspective
    Pedro A Ortega and Daniel A Braun
    Cognitive Neuroscience, 2015
  5. Perceptual adaptation: getting ready for the future
    Xue-Xin Wei, Pedro Ortega, and Alan Stocker
    Journal of Vision, 2015
  6. Information-Theoretic Bounded Rationality
    Pedro A. Ortega, Daniel A. Braun, Justin S. Dyer, Kee-Eung Kim, and Naftali Tishby
    2015

2014

  1. Monte Carlo methods for exact & efficient solution of the generalized optimality equations
    Pedro A Ortega, Daniel A Braun, and Naftali Tishby
    In 2014 IEEE International Conference on Robotics and Automation (ICRA), 2014
  2. Dynamic belief state representations
    Daniel D Lee, Pedro A Ortega, and Alan A Stocker
    Current Opinion in Neurobiology, 2014
  3. Generalized Thompson sampling for sequential decision-making and causal inference
    Pedro A Ortega and Daniel A Braun
    Complex Adaptive Systems Modeling, 2014
  4. Subjectivity, Bayesianism, and Causality
    Pedro A. Ortega
    arXiv preprint arXiv:1407.4139, 2014
  5. An Adversarial Interpretation of Information-Theoretic Bounded Rationality
    Pedro A. Ortega and Daniel D. Lee
    In Twenty-Eighth AAAI Conference on Artificial Intelligence (AAAI ’14), 2014
  6. Information-theoretic bounded rationality and ε-optimality
    Daniel A Braun and Pedro A Ortega
    Entropy, 2014
  7. Ellsberg’s paradox in sensorimotor learning
    Daniel A Braun, Jordi Grau-Moya, and Pedro A Ortega
    In Theoretical and Empirical Research in Decision-Making (DMB 2014), 2014

2013

  1. Metabolic cost as an organizing principle for cooperative learning
    David Balduzzi, Pedro A Ortega, and Michel Besserve
    Advances in Complex Systems, 2013
  2. Thermodynamics as a theory of decision-making with information-processing costs
    Pedro A Ortega and Daniel A Braun
    Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 2013

2012

  1. Free energy and the generalized optimality equations for sequential decision making
    Pedro A Ortega and Daniel A Braun
    arXiv preprint arXiv:1205.3997, 2012
  2. A nonparametric conjugate prior distribution for the maximizing argument of a noisy function
    Pedro Ortega, Jordi Grau-Moya, Tim Genewein, David Balduzzi, and Daniel Braun
    Advances in Neural Information Processing Systems, 2012
  3. Risk-sensitivity in Bayesian sensorimotor integration
    Jordi Grau-Moya, Pedro A Ortega, and Daniel A Braun
    2012
  4. Free Energy & Bounded Rationality
    Pedro A Ortega and Daniel A Braun
    In Workshop on The Statistical Physics of Inference and Control Theory, 2012
  5. Adaptive coding of actions and observations
    Pedro A Ortega and Daniel A Braun
    In NIPS Workshop on Information in Perception and Action 2012, 2012
  6. Thermodynamics as a theory of bounded rational decision-making
    Daniel A Braun and Pedro A Ortega
    In Workshop on Statistical Physics of Inference and Control Theory, 2012

2011

  1. Path integral control and bounded rationality
    Daniel A Braun, Pedro A Ortega, Evangelos Theodorou, and Stefan Schaal
    In 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2011
  2. Motor coordination: when two have to act as one
    Daniel A Braun, Pedro A Ortega, and Daniel M Wolpert
    Experimental Brain Research, 2011
  3. Reinforcement learning and the Bayesian control rule
    Pedro Alejandro Ortega, Daniel Alexander Braun, and Simon Godsill
    In Artificial General Intelligence: 4th International Conference, AGI 2011, Mountain View, CA, USA, August 3-6, 2011. Proceedings 4, 2011
  4. Information, utility and bounded rationality
    Pedro Alejandro Ortega and Daniel Alexander Braun
    In Artificial General Intelligence: 4th International Conference, AGI 2011, Mountain View, CA, USA, August 3-6, 2011. Proceedings 4, 2011
  5. A unified framework for resource-bounded autonomous agents interacting with unknown environments
    Pedro Alejandro Ortega
    University of Cambridge, 2011
  6. Bayesian causal induction
    Pedro A Ortega
    NIPS Workshop on Philosophy and Machine Learning, 2011

2010

  1. A minimum relative entropy principle for learning and acting
    Pedro A Ortega and Daniel A Braun
    Journal of Artificial Intelligence Research, 2010
  2. A minimum relative entropy principle for adaptive control in linear quadratic regulators
    Daniel A Braun and Pedro A Ortega
    In International Conference on Informatics in Control, Automation and Robotics, 2010
  3. An axiomatic formalization of bounded rationality based on a utility-information equivalence
    Pedro A Ortega and Daniel A Braun
    arXiv preprint arXiv:1007.0940, 2010
  4. A Minimum Relative Entropy Controller for Undiscounted Markov Decision Processes
    Pedro A Ortega and Daniel A Braun
    arXiv preprint arXiv:1002.1480, 2010
  5. Convergence of Bayesian Control Rule
    Pedro A Ortega and Daniel A Braun
    arXiv preprint arXiv:1002.3086, 2010

2009

  1. A Bayesian rule for adaptive control based on causal interventions
    Pedro A Ortega and Daniel A Braun
    arXiv preprint arXiv:0911.5104, 2009
  2. Nash equilibria in multi-agent motor interactions
    Daniel A Braun, Pedro A Ortega, and Daniel M Wolpert
    PLoS Computational Biology, 2009
  3. A conversion between utility and information
    Pedro A Ortega and Daniel A Braun
    In Proceedings of the Third Conference on Artificial General Intelligence, 2009

2006

  1. A Medical Claim Fraud/Abuse Detection System based on Data Mining: A Case Study in Chile
    Pedro A Ortega, Cristián J Figueroa, and Gonzalo A Ruz
    DMIN, 2006