source: techcrunch ai: thinking machines wants to build an ai that actually listens while it talks

level: technical

thinking machines lab, founded by former openai cto mira murati, introduced interaction models on monday. the core idea is an ai that listens while it talks, and can even interrupt you, rather than taking turns as current models do. the company calls this full duplex: the model processes input and generates output at the same time. its model, tml-interaction-small, responds in 0.40 seconds, which is close to the pace of natural human conversation. according to the company, this is faster than comparable models from openai and google.

this is not yet a public product. a limited research preview is planned for the next few months, with a wider release later this year. the benchmarks are promising, and the concept of native interactivity is notable, but real-world performance remains unverified until users can test it. the announcement focuses on response speed and the shift from turn-based to simultaneous interaction.

current ai assistants operate in half-duplex mode, waiting for a user to finish before replying. full-duplex models could make interactions feel more like a phone call, with lower latency and fewer awkward pauses. thinking machines lab aims to build this capability directly into the model architecture rather than patching it on top. the approach could influence future voice assistants and real-time ai applications, though practical challenges remain, such as handling interruptions gracefully.
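
the difference between the two modes can be illustrated with a toy timeline simulation. this is only a sketch under stated assumptions, not thinking machines lab's architecture: the names `Tick`, `half_duplex`, `full_duplex`, and the `start_after` parameter are hypothetical, time is counted in abstract ticks rather than seconds, and the reply is fixed in advance instead of being generated by a model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Tick:
    """One time step: what the system heard and what it said (None = silence)."""
    t: int
    heard: Optional[str]
    said: Optional[str]


def half_duplex(user_frames: List[str], reply: List[str]) -> List[Tick]:
    """Turn-based: listen to the whole utterance, then speak."""
    log = [Tick(t, frame, None) for t, frame in enumerate(user_frames)]
    offset = len(user_frames)
    log += [Tick(offset + i, None, word) for i, word in enumerate(reply)]
    return log


def full_duplex(user_frames: List[str], reply: List[str], start_after: int) -> List[Tick]:
    """Simultaneous: every tick reads one input frame and may emit one output frame.

    start_after is a hypothetical stand-in for "the model has heard enough to respond".
    """
    log: List[Tick] = []
    pending = list(reply)
    heard_count = 0
    t = 0
    while t < len(user_frames) or pending:
        frame = user_frames[t] if t < len(user_frames) else None
        if frame is not None:
            heard_count += 1
        # Speak once enough input has arrived, or once the user has finished.
        ready = heard_count >= start_after or t >= len(user_frames)
        said = pending.pop(0) if ready and pending else None
        log.append(Tick(t, frame, said))
        t += 1
    return log


def first_reply_tick(log: List[Tick]) -> Optional[int]:
    """Tick index at which the system first says anything."""
    return next((tick.t for tick in log if tick.said is not None), None)


def overlap_ticks(log: List[Tick]) -> int:
    """Number of ticks where the system is hearing and speaking at once."""
    return sum(1 for tick in log if tick.heard is not None and tick.said is not None)


user = ["book", "a", "flight", "to", "paris"]
reply = ["sure", "which", "date"]

hd = half_duplex(user, reply)
fd = full_duplex(user, reply, start_after=3)

print("half-duplex first reply at tick", first_reply_tick(hd), "overlap", overlap_ticks(hd))
print("full-duplex first reply at tick", first_reply_tick(fd), "overlap", overlap_ticks(fd))
```

in this toy setup the half-duplex loop never overlaps listening and speaking, while the full-duplex loop starts replying before the user has finished. a real system would also have to decide when speaking over the user is appropriate, which the sketch does not attempt.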

why it matters: faster, simultaneous ai speech could improve voice assistants and real-time translation, making interactions more natural and efficient.

