| Thursday, Apr 23 |
| Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols |
Mikhail Terekhov, Alexander Panfilov, Daniil Dzenhaliou, Caglar Gulcehre, Maksym Andriushchenko, Ameya Prabhu, Jonas Geiping
|
| Selective Rotary Position Embedding |
Sajad Movahedi, Arshia Afzal, Timur Carstensen, Frank Hutter, Antonio Orvieto, Volkan Cevher
|
| How does the optimizer implicitly bias the model merging loss landscape? |
Chenxiang Zhang, Alexander Theus, Damien Teney, Antonio Orvieto, Jun Pang, Sjouke Mauw
|
| Sample Smart, Not Hard: Correctness-First Decoding for Better Reasoning in LLMs |
Xueyan Li, Guinan Su, Mrinmaya Sachan, Jonas Geiping
|
| RigidSSL: Rigidity-based Geometric Pretraining for Protein Generation |
Zhanghan (Tony) Ni, Yanjing Li, Zeju Qiu, Bernhard Schölkopf, Hongyu Guo, Weiyang Liu, Shengchao Liu
|
| Proper Velocity Neural Networks |
Ziheng Chen, Zihan Su, Bernhard Schölkopf, Nicu Sebe
|
| Learning Nonlinear Causal Reductions to Explain Reinforcement Learning Policies |
Armin Kekić, Jan Schneider, Dieter Büchler, Bernhard Schölkopf, Michel Besserve
|
| Improving LLM-Based Global Optimization with Search Space Partitioning |
Andrej Schwanke, Lyubomir Ivanov, David Salinas, Fabio Ferreira, Aaron Klein, Frank Hutter, Arber Zela
|
| Friday, Apr 24 |
| Capability-Based Scaling Laws for LLM Red-Teaming |
Alexander Panfilov, Paul Kassianik, Maksym Andriushchenko, Jonas Geiping
|
| The Curious Case of In-Training Compression of State Space Models |
Makram Chahine, Philipp Nazari, Daniela Rus, T. Konstantin Rusch
|
| Scaling Behavior of Discrete Diffusion Language Models |
Dimitri von Rütte, Janis Fluri, Omead Pooladzandi, Bernhard Schölkopf, Thomas Hofmann, Antonio Orvieto
|
| Low-Pass Filtering Improves Behavioral Alignment of Vision Models |
Max Wolff, Thomas Klein, Evgenia Rusak, Felix A. Wichmann, Wieland Brendel
|
| Training Dynamics Impact Post-Training Quantization Robustness |
Albert Catalan-Tatjer, Niccolò Ajroldi, Jonas Geiping
|
| The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs |
Akshit Sinha, Arvindh Arun, Shashwat Goel, Steffen Staab, Jonas Geiping
|
| MATH-Beyond: A Benchmark for RL to Expand Beyond the Base Model |
Prasanna Mayilvahanan, Ricardo Dominguez-Olmedo, Thaddäus Wiedemer, Wieland Brendel
|
| Saturday, Apr 25 |
| Neural Sum-of-Squares: Certifying the Nonnegativity of Polynomials with Transformers |
Nico Pelleriti, Christoph Spiegel, Shiwei Liu, David Martinez-Rubio, Max Zimmer, Sebastian Pokutta
|
| GPTailor: Large Language Model Pruning Through Layer Cutting and Stitching |
Guinan Su, Li Shen, Lu Yin, Shiwei Liu, Yanwu Yang, Jonas Geiping
|
| Pitfalls in Evaluating Language Model Forecasters |
Daniel Paleka, Shashwat Goel, Jonas Geiping, Florian Tramèr
|
| Skill learning via policy diversity yields identifiable representations for reinforcement learning |
Patrik Reizinger, Bálint Mucsányi, Siyuan Guo, Benjamin Eysenbach, Bernhard Schölkopf, Wieland Brendel
|
| Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLMs |
Alexander Panfilov, Evgenii Kortukov, Kristina Nikolić, Matthias Bethge, Sebastian Lapuschkin, Wojciech Samek, Ameya Prabhu, Maksym Andriushchenko, Jonas Geiping
|
| Monitoring Decomposition Attacks in LLMs with Lightweight Sequential Monitors |
Chen Yueh-Han, Nitish Joshi, Yulin Chen, Maksym Andriushchenko, Rico Angell, He He
|