Week 2 AI Showdown: Which Platform Wrote the Best Prompt Post?

  • Metadata

    Topic: Role Assignment in AI Prompts and Why One Sentence Changes Everything

    Week: Week 2

    Rubric version: v2.0

    Platforms compared: ChatGPT, Gemini, Claude

    Winner: Claude (87.5 / 100)

    Runner-up: ChatGPT (80.0 / 100)

    Third place: Gemini (76.5 / 100)

    Margin of victory: 7.5 points

    Tags: ai-comparison, prompt-engineering, chatgpt-vs-claude-vs-gemini, weekly-showdown, ai-quality, rubric, role-assignment, week-2

    Categories: AI Comparison, Prompt Engineering

    Estimated reading time: 8-10 minutes

    SEO title: Week 2 AI Showdown: Claude vs ChatGPT vs Gemini — Role Assignment Prompts Compared

    SEO description: We scored all three AI platforms on role assignment prompts using a 7-dimension rubric. See the evidence, the scores, and the Week 2 winner.

Every week, Ketelsen.ai runs the same prompt topic through three of the biggest AI platforms on the planet (ChatGPT, Gemini, and Claude) and publishes all three results side by side. Same topic, same template, same rules. The only variable is the AI doing the thinking. This is Week 2, and the topic hit a nerve that every AI user has felt but few know how to fix: role assignment. That one sentence you type before your actual question, "You are a senior marketing strategist" or "Act as a cybersecurity architect," turns out to be the single highest-leverage improvement most people never make. All three platforms took their shot at explaining why it works, how to do it well, and what happens when you push the technique to its limits. The scores were closer this week, but one platform still pulled ahead.

The Topic: Role Assignment in AI Prompts

This week's topic asked each AI platform to produce a complete blog post featuring three prompt variations — Beginner, Intermediate, and Advanced — all built around the same idea: assigning the AI a specific expert role before giving it a task. The topic matters because role assignment is the bridge between getting generic AI output and getting output that sounds like it came from someone who actually works in your field.

How We Score: The 7-Dimension Quality Rubric

We do not pick winners by gut feeling. Every comparison post on Ketelsen.ai uses a structured rubric that scores each platform across seven dimensions, each weighted by how much it matters to you — the reader. The dimensions are not abstract quality labels; they measure whether the post actually does its job. Can you copy the prompt and use it today? Does the breakdown teach you something transferable? Are the industry examples specific enough that someone in that field would recognize their own problem? Those are the questions behind the numbers.

Each dimension is scored on a 1-to-10 scale with specific anchor descriptions at every level, so a "7" means the same thing whether we are scoring ChatGPT or Claude. The dimension scores are then weighted by importance and normalized to a 0-to-100 overall score. If two platforms land within 3 points of each other, we call it a tie and explain the trade-offs instead of forcing a winner. The rubric is version 2.0 — it will continue to evolve as we discover new ways to measure what makes one AI post genuinely better than another.

Dimension | Weight | What It Measures
D1: Prompt Quality | 20% | Are the prompts well-engineered, genuinely usable, and clearly differentiated across difficulty levels?
D2: Breakdown Clarity | 15% | Does the prompt breakdown teach you WHY each part matters, not just what it says?
D3: Industry Examples | 15% | Are the practical examples specific enough that a professional in that field would say "that is exactly my problem"?
D4: Writing Quality | 15% | Does the writing match the Ketelsen.ai voice: fun, informative, accessible, and publication-ready?
D5: Creative Use Cases | 10% | Do the suggested use cases go beyond the obvious, leaving the reader thinking "I never would have considered that"?
D6: Actionability | 15% | Can the reader immediately USE what they read: tools, checklists, workflows they walk away with?
D7: Completeness | 10% | Does the post cover all template sections fully without filler or thin spots?
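
Behind that table, the math is simple: multiply each dimension score by its weight, add them up, and scale to the 0-to-100 range. Here is a minimal sketch in Python using the published weights and Claude's Week 2 scores; the numbers come from the tables in this post, while the function and variable names are ours.

    # Weighted rubric score: seven 1-10 dimension scores -> one 0-100 overall.
    WEIGHTS = {"D1": 0.20, "D2": 0.15, "D3": 0.15, "D4": 0.15,
               "D5": 0.10, "D6": 0.15, "D7": 0.10}  # weights sum to 1.0

    def overall_score(scores: dict[str, int]) -> float:
        # Weighted average on the 1-10 scale, scaled by 10 to the 0-100 range.
        return 10 * sum(WEIGHTS[d] * s for d, s in scores.items())

    claude = {"D1": 9, "D2": 9, "D3": 8, "D4": 9, "D5": 8, "D6": 9, "D7": 9}
    print(round(overall_score(claude), 1))  # 87.5

Plug in ChatGPT's and Gemini's rows from the Score Summary below and you get 80.0 and 76.5, matching the published overall scores.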

Platform-by-Platform Breakdown

Claude: 87.5 / 100

Strengths

Claude delivered the deepest and most technically sophisticated post of the three at 174,674 characters, more than three times Gemini's output and roughly 60 percent longer than ChatGPT's. The standout is prompt engineering depth. The Advanced variation, titled "The Full Persona Architecture," is not just a better prompt; it is a six-section persona specification covering identity, knowledge base, decision-making philosophy, communication parameters, limitations, and output standards. No other platform attempted anything close to that level of structural ambition. The Intermediate variation introduces what Claude calls "niche stacking — the more precisely you define the intersection of expertise and domain, the more the AI behaves like a true specialist rather than a well-read generalist," a concept that teaches a transferable mental model, not just a one-time technique.
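
To make that concrete, here is a bare-bones skeleton of what a six-section persona specification along those lines might look like. The six section names come from Claude's post; the bracketed placeholders are ours, and (as noted under Weaknesses below) Claude's own template likewise leaves the fields as placeholders rather than filling in a worked example.

    1. Identity: You are a [role title] with [years] of experience in [domain and sub-domain].
    2. Knowledge Base: You draw on [frameworks, standards, and bodies of knowledge you know deeply].
    3. Decision-Making Philosophy: When trade-offs arise, you prioritize [criteria] and reason by [method].
    4. Communication Parameters: You write for [audience] in a [tone] voice, using [format conventions].
    5. Limitations: You do not advise on [out-of-scope areas], and you flag uncertainty rather than guess.
    6. Output Standards: Every response includes [required elements] and meets [quality bar].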

The prompt breakdowns are where Claude separates most clearly from the field. Every breakdown segment ends with an italicized transferable principle that the reader can apply to any future prompt, not just the one being analyzed. The writing voice opens strong — "Here is something that will change every AI conversation you have from this point forward — and it takes fewer than ten words to do it" — and sustains that energy across the full post without filler. Creative use cases are the widest-ranging of all three platforms at 15 total, including tabletop gaming persona design, therapy and coaching protocols, and governance policy drafting.

The follow-up prompts are strategically sequenced — personal application first, then team standardization, then organizational systems — giving the reader a progression path, not just a next step.

Weaknesses

Citations are listed as "NOT APPLICABLE" across all three variations, which means the post provides zero external credibility anchors — a gap that ChatGPT exploits with real academic references. At 174,674 characters, the sheer length may overwhelm casual readers; there is no executive summary or quick-start guide to help someone short on time find the right entry point. The Advanced variation's six-section persona architecture is powerful but could benefit from a worked example showing the completed template filled in for a specific role, rather than leaving every field as a bracket placeholder.

Signature Move

Claude treats role assignment as an architectural discipline rather than a tips-and-tricks exercise — and the result reads like a professional methodology that compounds in value every time you use it.


ChatGPT: 80.0 / 100

Strengths

ChatGPT delivered the most methodologically grounded post of the three — and it is the only platform that provided real academic citations. Every variation references OpenAI's prompt engineering guidance, Anthropic's best practices, Google's prompt design strategies, and a 2024 EMNLP paper by Zheng et al. that directly challenges the assumption that personas always improve model performance. That level of intellectual honesty is rare in a prompt guide and gives the post genuine credibility with skeptical readers.

The Advanced variation — "The Multi-Role Contrast and Self-Check Workflow" — introduces a six-step process where the AI identifies three candidate roles, evaluates what each would emphasize and overlook, selects the best one, completes the task, and then reviews its own output against success criteria. The breakdown for this variation is the most instructive of any single variation across all three platforms, with transferable principles like "when a hidden decision affects output quality, promote it into the prompt" and "compare widely, execute narrowly."
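
The published summary compresses those steps, so the division below is our reconstruction rather than ChatGPT's verbatim prompt, but it shows the shape of the workflow:

    For the task below, work through this sequence before answering:
    1. Propose three candidate expert roles suited to the task.
    2. For each role, state what it would emphasize and what it would overlook.
    3. Select the strongest role and justify the choice in one sentence.
    4. Adopt that role and complete the task.
    5. State the success criteria a strong answer must meet.
    6. Review your answer against those criteria and revise anything that falls short.

    TASK: [your task here]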

The writing voice is clean and professional-casual throughout, opening with a relatable hook about getting bland AI responses and building momentum through each variation. At 108,780 characters, the post is substantial without feeling bloated.

Weaknesses

The industry examples, while competent, are fewer and less vivid than Claude's or Gemini's — 9 total across all variations compared to Gemini's 18. The creative use cases include some pleasant surprises (neighborhood peacemaker, wedding speech coach) but fewer truly unexpected angles than Claude's range. The Beginner variation's prompt is the most complex of the three platforms' beginner offerings — a six-element structure that may actually be intermediate-level for a true AI newcomer.

Signature Move

ChatGPT is the only platform that treats its own claims with academic rigor — citing research that actually challenges the technique it is teaching, which makes the post more trustworthy, not less.


Gemini: 76.5 / 100

Strengths

Gemini delivers the most distinctive prompt naming of the three platforms: "The Expert Consultant," "The Audience-Tuned Specialist," and "The Synthetic Board Member" each communicate both personality and function immediately. The Advanced variation's "Synthetic Board Member" concept is the most dramatically original idea across all three posts: a Chief Strategy Officer who uses First Principles thinking to deconstruct business initiatives, red-team them for fatal flaws, build a mitigation matrix, and deliver a go/no-go verdict. The breakdown explains that AI models are "fundamentally RLHF-trained to be helpful, agreeable, and polite, which often leads to sycophancy" (RLHF is reinforcement learning from human feedback, the fine-tuning stage that trains models toward responses human raters prefer), and that explicitly instructing the AI to be brutal overrides that tendency. That is a genuinely educational insight about how language models work.
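
For flavor, here is an abbreviated sketch of a prompt in that vein, paraphrased from the description above rather than quoted from Gemini's post:

    You are a Chief Strategy Officer acting as a synthetic board member. You are not here to be agreeable.
    Review this initiative: [describe the business initiative].
    1. Deconstruct it to first principles: what must be true for it to succeed?
    2. Red-team it: name the fatal flaws most likely to kill it.
    3. Build a mitigation matrix: for each flaw, give severity, likelihood, and a countermeasure.
    4. Close with a go/no-go verdict and your reasoning.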

The industry examples are the most numerous at 18 total (6 per variation) and include distinctive picks like Freelance Graphic Design, Manufacturing and Supply Chain, and High-End Hospitality. The Fintech example in the Advanced variation — where the synthetic CSO identifies that a B2C-to-B2B pivot is flawed because banks resist third-party APIs and suggests targeting credit unions instead — is the single most specific and actionable industry example across all three posts.

Weaknesses

At 55,098 characters, Gemini's post is roughly a third the size of Claude's and half of ChatGPT's. That size difference shows up in the scoring. The Beginner variation's Pro Tips section is listed as "NOT APPLICABLE"; the prompt breakdowns are competent but more descriptive than instructive, telling you what each part does without fully explaining the transferable principle behind it; and the expanded sections feel thinner throughout. Citations are also listed as "NOT APPLICABLE" across all three variations. The overall impression is of a post with genuinely creative ideas that does not develop them with the same depth as the other two platforms.

Signature Move

Gemini has the best naming instincts and the most dramatically original Advanced concept of the three — if depth matched ambition, this would be a much closer race.


The Verdict

Claude wins Week 2 with a score of 87.5 out of 100, ahead of ChatGPT at 80.0 and Gemini at 76.5. The margin is 7.5 points over the runner-up — decisive but closer than Week 1's 18-point gap. Claude's advantages are concentrated in prompt engineering sophistication (the Full Persona Architecture is in a class of its own), breakdown clarity (every segment ends with a transferable principle), writing quality (sustained energy across 174,674 characters), and the strategic sequencing of follow-up prompts that build from individual to organizational application.

That said, this week's results tell a more nuanced story than the scores alone suggest. ChatGPT's academic citations — including a paper that challenges the very technique being taught — demonstrate a level of intellectual honesty that neither Claude nor Gemini matched. ChatGPT's Advanced variation, with its six-step self-check workflow, is arguably the most practically useful single prompt of the nine produced this week. Gemini's "Synthetic Board Member" is the most dramatically original concept across all three posts, and its industry examples are the most numerous and specific. Each platform brought something the others did not.

What This Means for You

If you want the deepest education on role assignment as a system — how to build persistent personas, layer communication styles, and architect multi-dimensional expert identities — start with the Claude version, especially the Intermediate and Advanced variations. If you want the most intellectually rigorous treatment with real citations and a self-checking workflow you can use in high-stakes situations, go with ChatGPT. If you want the fastest on-ramp with vivid naming and a dramatically original Advanced concept, Gemini's "Synthetic Board Member" is worth reading even if you start elsewhere. All three posts are published on Ketelsen.ai — read the one that matches your experience level, or read all three and draw your own conclusions.


Score Summary

Dimension | Weight | Claude | ChatGPT | Gemini
D1: Prompt Quality | 20% | 9 | 8 | 8
D2: Breakdown Clarity | 15% | 9 | 9 | 8
D3: Industry Examples | 15% | 8 | 7 | 8
D4: Writing Quality | 15% | 9 | 8 | 7
D5: Creative Use Cases | 10% | 8 | 7 | 7
D6: Actionability | 15% | 9 | 8 | 8
D7: Completeness | 10% | 9 | 9 | 7
OVERALL SCORE (0-100) | 100% | 87.5 | 80.0 | 76.5

Source: Rubric scoring data (v2.0, 1-10 scale, weighted by dimension importance)

Visual Comparison

[Per-platform bar charts of the seven dimension scores for Claude, ChatGPT, and Gemini; the values match the Score Summary table above.]

Methodology Note

This comparison uses Rubric v2.0, the second version of the Ketelsen.ai cross-platform scoring system. It updates v1.0 with a 1-to-10 scale (anchors at 2/4/6/8/10, with odd scores for intermediate quality), a pre-scoring calibration step in which all three posts are read side by side before any scores are assigned, and a widened statistical tie threshold of 3 points. The seven dimensions and their weights reflect our best current judgment about what makes a prompt-focused blog post genuinely useful to a reader. We expect the rubric to continue evolving. If you have opinions about what should be measured differently, we want to hear them; the whole point of publishing the methodology is to make it better.

Every score in this post is backed by specific evidence cited from the original blog posts. All three Week 2 posts are published on Ketelsen.ai so you can read them yourself, apply your own criteria, and decide whether you agree with our call. Transparency is the point. If the rubric is good, the verdict should be obvious to anyone who reads all three posts. If it is not obvious, the rubric needs work — and we will fix it.

Next

Gemini :: How One Sentence Upgrades AI From Assistant to Expert