Leading like a coach is a powerful approach for executives to help employees achieve goals and boost performance. Successful leader-coaches rely on strong communication skills—they ask great questions, resolve conflicts, and give meaningful feedback. But not all leaders naturally have these coaching abilities.

Having worked with many executives on their coaching skills, we’ve found that one of the best ways to improve is by having a human trainer observe their conversations and provide feedback. With large language models (LLMs) now capable of “conversing” with humans, we wondered: Could AI tools become effective practice partners for executive coaching?
To test this, we built a GPT-4-powered tool that gives feedback like a human coach. We tested it with 167 executives across various industries and found that AI coaching can be an efficient, accessible, and powerful way to strengthen leadership skills—and it’s available to nearly everyone.
Two Ways AI Can Help Leaders Have Better Conversations
Executives can use these tools in two key ways: asking for advice and asking for assessment.
First, they can treat AI like a leadership coach—asking for help preparing for tough conversations or analyzing past ones. They can also role-play with the tool to explore how different responses might play out.
For example, imagine an executive unsure how to follow up in a fast-moving situation with a team member. A useful first step is writing a brief (250-word) summary of the scenario. Just putting thoughts into words brings clarity. They can then input this into the tool or speak directly to it, explaining the situation and prompting it to act as a coach. Open-ended questions like “What would you advise as a coach?” or “How can I handle this productively?” work well. More specific prompts—“Which communication style fits this case?” or “How do I apply supportive communication concisely?”—yield sharper answers.
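The advice-mode workflow above — write a short summary, hand it to the tool, then ask an open or specific coaching question — can be sketched as a small prompt-building helper. This is a minimal Python sketch assuming a generic chat-style LLM API (system/user message lists); the system instruction wording is our own assumption, not the actual prompt used by the tool described in this article.

```python
# Assumed coaching instruction -- illustrative wording, not the tool's real prompt.
COACH_INSTRUCTION = (
    "Act as an executive leadership coach. Read the scenario and advise "
    "the leader on how to handle the conversation productively."
)

def build_advice_prompt(scenario_summary: str, question: str) -> list[dict]:
    """Pair a brief (~250-word) scenario summary with a coaching question,
    returning a chat-style message list for a generic LLM chat API."""
    return [
        {"role": "system", "content": COACH_INSTRUCTION},
        {
            "role": "user",
            "content": (
                f"Scenario (max ~250 words):\n{scenario_summary}\n\n"
                f"Question: {question}"
            ),
        },
    ]

# Example: the fast-moving team-member situation from the text.
messages = build_advice_prompt(
    "A team member has missed two deadlines during a fast-moving launch...",
    "What would you advise as a coach?",
)
```

Swapping in a sharper question ("Which communication style fits this case?") changes only the second message, which is what makes iterating on prompt specificity cheap.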
The second approach? Have AI listen to a real conversation, then assess the executive’s communication style afterward. Specific questions lead to better insights. For example, asking “Was my word choice appropriate?” or “Were the dynamics productive?” taps into the tool’s ability to analyze the entire exchange. If the AI is trained on a framework (like Heron’s coaching styles), executives can even ask for the “best communication style for this situation.”
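The assessment mode hands the tool a full transcript plus the executive's specific questions. A hedged sketch of how that prompt might be assembled, assuming a plain-text transcript of speaker/utterance pairs (the instruction text is again our own wording):

```python
def build_assessment_prompt(
    transcript: list[tuple[str, str]], questions: list[str]
) -> str:
    """Pack a coaching-conversation transcript and specific follow-up
    questions into one assessment prompt for an LLM."""
    lines = [f"{speaker}: {utterance}" for speaker, utterance in transcript]
    numbered = [f"{i}. {q}" for i, q in enumerate(questions, start=1)]
    return (
        "You observed the following coaching conversation. Using Heron's six "
        "coaching styles as a framework, answer the questions below.\n\n"
        "Transcript:\n" + "\n".join(lines) + "\n\n"
        "Questions:\n" + "\n".join(numbered)
    )
```

Because the whole exchange is in the prompt, questions like "Were the dynamics productive?" can be answered against the entire conversation rather than a single turn.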
Using AI to Become a Better Coach
In our tests, we had AI listen to and assess real coaching conversations—helping executives refine their style. We used a version of GPT-4 fine-tuned with pre-existing coaching dialogues analyzed using John Heron’s framework. Heron’s model breaks interactions into six styles:
- Prescriptive (giving advice)
- Informative (sharing knowledge)
- Catalytic (encouraging self-discovery)
- Cathartic (exploring emotions)
- Supportive (offering encouragement)
- Confronting (challenging assumptions)
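To make Heron's six styles concrete, here is a deliberately simplified keyword heuristic that tags an utterance with the style it most resembles. The real tool used a fine-tuned GPT-4, not rules like these; the cue phrases below are illustrative assumptions, not part of Heron's framework.

```python
# Toy cue phrases per Heron style -- assumptions for illustration only.
HERON_KEYWORDS = {
    "prescriptive": ["you should", "my advice", "i recommend"],
    "informative": ["research shows", "the data", "in my experience"],
    "catalytic": ["what do you think", "how would you", "what options"],
    "cathartic": ["how do you feel", "that sounds frustrating"],
    "supportive": ["well done", "you handled", "i appreciate"],
    "confronting": ["are you sure", "have you considered", "what's stopping"],
}

def tag_heron_style(utterance: str) -> str:
    """Return the first Heron style whose cue phrases appear in the utterance."""
    text = utterance.lower()
    for style, cues in HERON_KEYWORDS.items():
        if any(cue in text for cue in cues):
            return style
    return "unclassified"
```

A fine-tuned model does this far more robustly, but the sketch shows what "identifying a coaching style" means operationally: mapping each conversational turn onto one of the six categories.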
Executives practiced coaching each other using real career scenarios while our tool “listened.” Afterward, the coach could ask the AI questions like:
- What’s my coaching style per Heron’s framework?
- Which styles do I use most—and do they work?
- How effective was I as a coach?
- What went particularly well?
- What patterns do you see in my questioning?
- How could I improve?
- Do I dominate conversations? If so, how?
- What blind spots do you notice in my style?
For comparison, human observers also watched and documented these sessions.
The Expected, the Unexpected, and the Surprising
The tool accurately identified Heron’s coaching styles and gave situationally relevant feedback—aligning closely with human observers’ assessments.
One Scandinavian executive said: “I was skeptical at first but soon saw how this could measure leadership outcomes. Tech can truly enhance our skills.” A Japanese leader noted: “I learned AI does far more than crunch data.” A Moroccan executive called the tool “incredibly practical—like having a personal coach on demand.”
Most executives found the AI’s feedback useful, sometimes even surprising. In one case, the tool detected an approach the leader considered but didn’t use—highlighting it as an alternative in the feedback.
Unsurprisingly, specific prompts (“Give me concrete examples”) delivered the most valuable insights.
How Executives Rated the AI’s Feedback
We organized their experiences into a framework called NU—measuring how Novel and Useful the feedback was. This created four quadrants:
- Zone of Validation (30%)
  - Feedback matched what executives already knew.
  - While reassuring, this risks complacency—the challenge is spotting subtler growth opportunities.
- Zone of Learning (55%)
  - Feedback was surprising and useful, sparking new insights.
  - Executives appreciated the AI’s “granular attention” (e.g., noting “that question was too aggressive”).
- Zone of Irritation (10%)
  - Feedback felt off-target or unhelpful.
  - Some wanted more practical guidance. Irritation, though uncomfortable, can prompt reflection.
- Zone of Indifference (5%)
  - Feedback was neither surprising nor useful.
  - This suggests a need for sharper questions or clearer goals.
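The NU framework reduces to a two-by-two lookup: each piece of feedback is judged on whether it was novel and whether it was useful, and that pair places it in a zone. The zone names come from the article; the exact novel/useful mapping below is our interpretation of the quadrant descriptions.

```python
from collections import Counter

def nu_zone(novel: bool, useful: bool) -> str:
    """Map a (novel, useful) rating to its NU-framework zone.
    Interpretation (an assumption): validation = familiar but useful,
    irritation = surprising but unhelpful."""
    if novel and useful:
        return "learning"
    if useful:
        return "validation"
    if novel:
        return "irritation"
    return "indifference"

def zone_distribution(ratings: list[tuple[bool, bool]]) -> dict[str, float]:
    """Share of feedback items falling into each zone."""
    counts = Counter(nu_zone(n, u) for n, u in ratings)
    return {zone: counts[zone] / len(ratings) for zone in counts}
```

Run over a set of rated feedback items, `zone_distribution` yields percentages like the 55/30/10/5 split the executives reported.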
Limitations to Keep in Mind
- Privacy concerns: AI needs access to conversations, so data security is critical.
- Hallucinations: AI sometimes gives inaccurate feedback. Grounding prompts in frameworks (like Heron’s) helps spot these.
- Cultural/emotional gaps: AI may miss nuances, so it’s best paired with human insight.
Key Takeaways for Leaders
- Use structured frameworks (e.g., Heron’s) and ask precise questions to get actionable feedback.
- Stay open to surprises—AI might reveal blind spots you’d miss otherwise. Many found it easier to hear tough feedback from AI than from humans.
- Combine AI with human perspectives for richer, more contextual insights.
- Iterate and refine: Early sessions may feel basic, but persistence leads to deeper growth.
Bonus Prompting Tips
- Ask AI to rewrite your statements in different tones (playful, assertive, etc.) to see what resonates.
- Have it flag filler words and suggest crisper alternatives.
- Role-play a conversation in one cultural context, then ask how it might differ in another—this builds adaptability.
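The filler-word tip can even be approximated locally before involving the AI at all. A simple regex pass, sketched below, flags common fillers in a draft statement; the filler list is an assumption, and an LLM would also suggest the crisper alternatives this sketch does not.

```python
import re

# Common conversational fillers -- an illustrative, non-exhaustive list.
FILLERS = ["basically", "actually", "just", "kind of", "sort of", "you know"]

def flag_fillers(text: str) -> list[str]:
    """Return the filler words/phrases found in the text, case-insensitively."""
    found = []
    for filler in FILLERS:
        if re.search(rf"\b{re.escape(filler)}\b", text, flags=re.IGNORECASE):
            found.append(filler)
    return found
```

Pairing a quick local check like this with an AI rewrite request ("replace the flagged fillers with crisper phrasing") keeps the prompt focused.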