level: research
quantum computing still operates in the noisy intermediate-scale quantum (nisq) era, where hardware noise limits how deep and how accurate a computation can be. addressing this requires hardware-facing capabilities: mid-circuit measurement, classical feedback for quantum error correction, precise timing for dynamical decoupling, and pulse-level waveform access. openqasm-3 was created to expose these features through a hardware-level programming interface. however, no dataset existed to train or evaluate large language models (llms) on openqasm-3 programs that use these advanced features.
qasm-eval fills this gap as the first comprehensive dataset for openqasm-3. it includes programs that go beyond basic gate sequences, covering error correction, timing control, and calibration tasks. the dataset is designed to help llms generate and understand code that interacts directly with quantum hardware, which is essential for practical quantum computing. it provides a benchmark for measuring how well models handle the complexities of real-world quantum programming.
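to make the hardware-facing features concrete, here is a minimal sketch in python. it embeds a short openqasm-3 program that uses a measurement in the middle of the circuit, classical feedback on the result, and an explicit delay, and then tags which feature families appear in the source. the sample program, the feature names, and the regex heuristics are all assumptions made for illustration; they are not taken from qasm-eval and are not how the dataset labels its programs.

```python
import re

# Illustrative OpenQASM 3 program, written for this sketch (not from the dataset).
SAMPLE_PROGRAM = """
OPENQASM 3.0;
include "stdgates.inc";

qubit[2] q;
bit c;

h q[0];
c = measure q[0];      // measurement in the middle of the circuit
if (c == 1) {          // classical feedback on the measured bit
    x q[1];
}
delay[100ns] q[1];     // explicit timing, as used in dynamical decoupling
"""

# Hypothetical feature names mapped to rough regex heuristics.
# These are assumptions for illustration, not a real OpenQASM 3 parser.
FEATURE_PATTERNS = {
    "measurement": re.compile(r"\bmeasure\b"),
    "classical_feedback": re.compile(r"\b(if|while)\s*\("),
    "timing": re.compile(r"\b(delay\s*\[|stretch\b|box\b)"),
    "pulse_level": re.compile(r"\b(defcal|cal)\b"),
}


def strip_comments(source):
    """Remove // line comments so words in comments are not counted.

    Block comments (/* ... */) are not handled in this sketch.
    """
    return re.sub(r"//[^\n]*", "", source)


def tag_features(source):
    """Return the sorted names of feature families whose pattern matches."""
    code = strip_comments(source)
    return sorted(
        name for name, pattern in FEATURE_PATTERNS.items() if pattern.search(code)
    )


if __name__ == "__main__":
    print(tag_features(SAMPLE_PROGRAM))
```

a keyword heuristic like this cannot tell a mid-circuit measurement from a final one, and it does not check that the program is valid; a real evaluation would more likely parse the program or run it through a simulator or compiler.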
the dataset lets researchers fine-tune and test llms on realistic quantum computing tasks. because it focuses on hardware-level details, models must handle low-level control and feedback loops rather than only abstract circuits. this could lead to better automation of quantum experiments and faster development of error mitigation techniques. qasm-eval thereby moves the evaluation of quantum code generation beyond abstract circuit design to concrete hardware interaction.
why it matters: it provides a needed benchmark for llms on practical quantum programming, helping automate hardware-level tasks and improve error correction.