Decoding AI's Shortcomings in Conversational Skills
Researchers at **Tufts University** have investigated why AI systems struggle to hold natural conversations the way humans do. The study, led by **JP de Ruiter** and his team, shows that humans rely chiefly on language content to decide when to take a speaking turn, while AI models trained on written text fall short at detecting these conversational cues, known as **transition relevant places (TRPs)**.

Earlier accounts attributed turn-taking to paraverbal signals such as intonation, pauses, and visual cues, but the study finds that language content carries more weight. Because AI systems are trained primarily on written text, which differs from spoken language, they have difficulty mimicking human conversational flow.

Fine-tuning AI on conversational content showed promise but did not fully close the gap, owing to the limited data available. **Muhammad Umair** and **Vasanth Sarathy**, co-authors of the study, emphasize the challenge of gathering enough conversational data to train AI effectively. And as long as AI relies on statistical correlations, deep contextual understanding remains a hurdle.
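To make the idea of TRPs concrete, here is a minimal sketch (not the study's actual method) that frames turn-end detection as scoring each growing prefix of an utterance. The function names and the heuristic scorer are hypothetical; a real system would replace `score_turn_end` with a language model fine-tuned on transcribed conversation, which is exactly the data the researchers say is scarce.

```python
def prefixes(utterance):
    """Yield each word-level prefix of an utterance."""
    words = utterance.split()
    for i in range(1, len(words) + 1):
        yield " ".join(words[:i])


def score_turn_end(prefix):
    """Toy stand-in for a model's estimate of P(turn ends here | prefix).

    Purely illustrative: a real system would score prefixes with a model
    trained on spoken dialogue rather than this keyword heuristic.
    """
    end_cues = ("right", "okay", "you know", "though")
    return 0.9 if prefix.lower().endswith(end_cues) else 0.1


def transition_relevant_places(utterance, threshold=0.5):
    """Return word positions where a listener could plausibly take the turn."""
    return [
        i
        for i, prefix in enumerate(prefixes(utterance), start=1)
        if score_turn_end(prefix) >= threshold
    ]


print(transition_relevant_places("I could meet on Friday okay"))  # → [6]
```

The point of the framing is that the classifier sees only the words so far, mirroring the study's finding that language content, rather than intonation or pauses, is the main signal humans use.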