Week 3 AI Showdown :: Claude vs. ChatGPT vs. Gemini :: Getting Your Money Right Before You Shop
Claude vs. ChatGPT vs. Gemini: Getting Your Money Right Before You Shop
Cross-Platform Showdown — Week 3 of the AI at the Dealership Series
Every Monday, Ketelsen.ai runs the same auto-buying prompt through Claude, ChatGPT, and Gemini, then grades the outputs against a seven-dimension rubric to see which platform genuinely serves the reader best. This week's topic — financing readiness — covers the single highest-stakes moment of the car-buying journey. The dealer F&I office has exactly one source of leverage: your lack of preparation. A weak prompt lets the AI produce generic, hedge-padded advice that leaves you walking into the dealership exposed. A great prompt arms you with pre-approvals, trade-in valuations, negative-equity models, and negotiation scripts before the showroom doors open.
We asked all three platforms for three versions of the same prompt — Beginner, Intermediate, and Advanced — designed to walk a car shopper through the full financing-readiness checklist. The results diverged in instructive ways: one platform delivered rendered data visualizations and institutional-grade structure, another produced the deepest character-driven examples we've seen in this series, and the third leaned on accessible prose without the engineering depth the topic demanded. All three posts are honest work — this was a close race — but a winner must emerge.
This week's winner: Gemini, with a normalized score of 87.5 out of 100, edging Claude's 85.5 by two points. Gemini's edge came from three places: fully rendered SVG bar charts (no other platform rendered a single one), explicit "Transferable principle" teaching sentences inside every prompt breakdown, and "Exact input the user would provide" blockquotes that give readers literal copy-paste text. Claude was the closest second, and if this week's rubric weighted practical examples more heavily, Claude's twelve named-character case studies would have flipped the result.
This Week's Prompt Topic
Getting Your Money Right Before You Shop. A three-variation prompt set — Beginner, Intermediate, Advanced — that walks a car buyer through credit-tier identification, pre-approval mechanics, multi-lender arbitrage, trade-in disposition (including negative equity), and F&I-office negotiation scripts. This installment of our "AI at the Dealership" series follows Week 2's New vs. CPO analysis. The deliverable: a printable, action-ready financing readiness system the buyer can complete in a single week before ever contacting a dealership.
The prompt structure is deliberately stress-testing. Beginner level demands clear, hedge-free guidance for first-time buyers. Intermediate introduces multi-lender comparison tables, negative-equity modeling, and the 14-day rate shopping window. Advanced pushes into sensitivity analysis, Section 179 tax-shield calculations, Truth in Lending Act contract compliance, and institutional-grade arbitrage — the kind of rigor a CFO would demand before signing a capital lease. The rubric asks: which AI actually delivers the depth, actionability, and brand-voice discipline this topic deserves?
How We Scored the Three Platforms
Every weekly comparison uses the same seven-dimension rubric (v2.0), weighted to reflect what non-technical readers actually need from an AI prompt tutorial. Scores run 1-10 with 0.5-point precision, with published anchors at every even number to discipline subjective calibration. The final score is a weighted sum normalized to a 0-100 scale. A margin of 3.0 points or less triggers a tie per the recipe — this week's margin was 2.0, but Richard exercised the editorial tiebreak to award the outright win to the highest normalized score.
The Seven Dimensions
| Dimension | Weight | What It Measures |
|---|---|---|
| D1 — Prompt Quality & Engineering Depth | 20 | Is the prompt itself — the text the user copies into the AI — a piece of careful engineering? Does it use role assignment, parameter specification, conditional logic, and output-format constraints? |
| D2 — Prompt Breakdown Clarity | 15 | How well the post deconstructs each part of the prompt so the reader learns why each segment matters and can transfer the technique to their own prompts. |
| D3 — Practical Examples & Industry Relevance | 15 | Depth over presence. A single example with specific math, geography, and a named character beats five generic "imagine a small business owner" archetypes. |
| D4 — Writing Quality & Brand Voice | 15 | Does the prose match Ketelsen.ai's signature voice — hardline, expert, direct, no hedging, specific numbers over vague claims? |
| D5 — Creative Use Cases & Unexpected Angles | 10 | Anti-volume rule: four distinctive cross-domain transfers (student loans, solar financing, divorce assets) beat twelve formulaic category headers. |
| D6 — Actionability & Reader Value | 15 | Can the reader actually do something after reading? Scripts, copy-paste inputs, follow-up prompts, and concrete next steps all count. |
| D7 — Completeness & Template Adherence | 10 | All required sections present. Citations render as clickable links. Charts actually render. Metadata complete. The content ships; it doesn't ship a placeholder. |
Platform-by-Platform Breakdown
ChatGPT — 74.25 / 100 (Third Place)
ChatGPT produced the most accessible prose of the three, with a strong editorial intro (including a blockquote referencing CFPB enforcement actions and the "$1,800 in unnecessary interest" example) and nine distinct industry profiles distributed across the three variations, including nonprofits, freelance consultants, healthcare professionals, VC-backed startups, and real-estate brokers. The sections that shipped, shipped well.
The problem is what didn't ship. The "Charts & Visualizations" section was a textual punt: "The content for this section depends on the specific charts you want to visualize" — words that should never appear in a published deliverable. The Visual Assets Appendix followed the same pattern, listing image descriptions as prose rather than rendering anything. On a financing topic where a chart of "Cost of Dealer Markup Over 60 Months" would materially improve reader comprehension, ChatGPT simply opted out.
Prompt engineering was solid but conventional: clean 4-section architecture, 9-item and 6-item breakdowns, clearly labeled parameter fields. Five creative use cases per variation, four pro tips, five FAQs. Everything a competent deliverable needs — but nothing that genuinely surprised or elevated the reader past what they could have found on NerdWallet.
Score breakdown: D1=8.0, D2=8.0, D3=7.5, D4=7.5, D5=7.0, D6=7.5, D7=5.5. The D7 score (Completeness) is the drag: two missing deliverables in a single post is the kind of defect that prevents readers from sharing and undermines the publication's editorial credibility. Evidence of strength: "0.5% dealer reserve on $35K = $450-500" — specific, unhedged, exactly the brand voice the topic demands.
Claude — 85.5 / 100 (Runner-Up)
Claude delivered the most substantive depth of the three platforms, and it wasn't close. Twelve named-character case studies across the three variations, each with geography, FICO score, financial context, and exact math: Maya in Denver with a 658 FICO facing a nonprime rate; James and Priya in Chicago with a totaled minivan; Tyler in Atlanta with $4,500 in negative equity; Renata in Phoenix self-employed as a landscaper; Yasmin and Hakim in Boston as dual physicians running the superprime-lender arbitrage. These aren't generic archetypes — they're concrete humans with concrete numbers, and they transform abstract financing concepts into scenarios the reader can pattern-match against their own life.
The Advanced variation uses a multi-turn gated workflow — the AI asks confirmation questions between deliverables rather than dumping all four at once — which is a genuinely unusual engineering choice that models how a human financial advisor would actually work the problem. The Key Takeaways coda at the end (five orange-blockquoted principles including "Pre-Approval is Your Leverage" and "The 2% Spread") plus three editorial image prompts for designers (Forbes-style, WSJ illustration, Fortune magazine spread) add production-ready polish.
Where Claude lost ground: a template-adherence defect. The inline per-variation citation blocks render as HTML-escaped text — literal `<a href="...">Label</a>` strings displayed to the reader instead of clickable links. The defect appears three times (lines 100-104, 204-209, 377-385 of the source file). The consolidated "Sources & Citations" section at the end renders properly, so the post is still usable, but the duplication of citation architecture with one broken format is exactly the kind of detail that a close-reading editor catches and asks to be fixed before publication.
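For non-technical readers, "HTML-escaped" means the angle brackets and quotes were converted to character entities, so the browser prints the markup as plain text instead of rendering a clickable link. A minimal illustration of the effect (the URL and label here are hypothetical, not quoted from Claude's post):

```python
import html

# A citation as it should appear in the page source (hypothetical URL and label):
link = '<a href="https://example.com/auto-loan-report">Auto Loan Report</a>'

# The escaped form is roughly what the defective blocks display:
print(html.escape(link))
# &lt;a href=&quot;https://example.com/auto-loan-report&quot;&gt;Auto Loan Report&lt;/a&gt;
```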
Score breakdown: D1=9.5, D2=7.5, D3=10.0, D4=9.0, D5=8.0, D6=8.5, D7=6.0. D3 is a perfect ten — the named-character depth is unmatched. D7 is the visible drag. If the citation defect had been resolved before submission, Claude's score would have climbed to roughly 87.5-88 and this week's winner would almost certainly have been Claude instead of Gemini.
Gemini — 87.5 / 100 (Winner)
Gemini won this week on engineering polish, template completeness, and pedagogical framing — three things that compound in a post whose entire purpose is to teach the reader how to think about financing, not just hand them a checklist. The evidence is in the structure of every section.
Prompt breakdowns don't just explain what each part of the prompt does — they include italicized "Transferable principle:" sentences after every segment that teach a generalizable prompting lesson. Example from the Beginner breakdown: "Do not use hedging language like 'both have pros and cons'; give me direct, actionable advice. Transferable principle: Explicitly forbidding hedging forces the AI to make definitive recommendations, eliminating decision fatigue." This is a pedagogical innovation the other two platforms did not attempt. Readers don't just learn this week's prompt — they learn a technique they can reuse forever.
Actionability got the same treatment. Every variation includes an "Exact input the user would provide" blockquote — literal copy-paste text like "My estimated credit score is: 660. My budget ceiling for the vehicle is: $25,000. The vehicle category I am buying is: Used. Do I have a trade-in?: No." — that removes every last friction point between the post and the reader's actual ChatGPT/Claude/Gemini window. The follow-up prompts each include an italicized "Why this is valuable" rationale, so the reader understands not just what to ask next but why.
And Gemini was the only platform that actually rendered charts. Three full SVG bar charts — Cost of 2% Dealer Markup Over 60 Months, APR by Credit Score Tier (2026), and Vehicle Depreciation in Years 1-3 — drawn in brand colors (#FF4E00 orange, #000000 black, #DCDCDC gray), with real data, proper axes, source attributions, and inline captions. On a topic where a visual of "Nonprime buyers pay 8.95% APR while Superprime buyers pay 4.66%" tells the whole story in two seconds, this is the difference between a blog post and a piece of financial journalism.
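The markup chart's underlying math is worth spelling out: the cost of a dealer rate markup is simply the difference between two amortized payment streams. A minimal sketch follows; the $35,000 principal is our assumption for illustration (the source chart does not state its inputs), while the 4.66% superprime rate and the 2% markup come from the figures cited above.

```python
def monthly_payment(principal: float, apr_pct: float, months: int) -> float:
    """Standard amortized loan payment."""
    r = apr_pct / 100 / 12                      # monthly interest rate
    return principal * r / (1 - (1 + r) ** -months)

principal, months = 35_000, 60                  # assumed loan size; 60-month term from the chart
base_apr, marked_up_apr = 4.66, 4.66 + 2.0      # superprime APR cited above, plus a 2% dealer markup

extra_per_month = (monthly_payment(principal, marked_up_apr, months)
                   - monthly_payment(principal, base_apr, months))
print(f"Extra cost of the 2% markup over {months} months: ${extra_per_month * months:,.0f}")
```

With these assumptions the markup costs a little under $2,000 over the term; the chart's exact $2,018 figure implies a slightly different principal or base rate.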
Where Gemini lost ground: depth of examples. Only four recurring personas (First-Time Buyer, Family with Negative Equity, Superprime Arbitrageur, Small Business Owner with DTI) used across all three variations — the same archetypes at escalating complexity rather than Claude's twelve distinct named characters. Gemini trades variety for pedagogical rhythm; readers encounter the same archetype in Beginner, then see it complexified in Intermediate, then financially modeled in Advanced. It's a defensible editorial choice, but it scores lower on a depth-over-presence rubric.
Score breakdown: D1=9.0, D2=9.5, D3=7.0, D4=8.5, D5=9.0, D6=9.0, D7=9.5. Six of seven dimensions score 8.5 or above. The D3 dip (practical examples) is the only clear weakness. Gemini's total-package execution is why it wins this week.
Verdict
Gemini wins Week 3 with 87.5 out of 100, a 2.0-point margin over Claude's 85.5. Under the recipe's default tie-threshold (≤3.0 points), this would have registered as a dual-winner tie — but the editorial call was to honor the highest normalized score and declare Gemini the outright winner. Readers should understand this: Claude's post is exceptional work, and if the rubric weighted practical examples more than template completeness, Claude would have taken the week.
Gemini's edge is that it treated this as a teaching post, not just an answering post. The "Transferable principle" pattern in every prompt breakdown turns the reader into a better prompter going forward. The "Exact input" copy-paste blockquotes remove all friction between reading and acting. And the three rendered SVG charts mean a non-technical reader grasps the stakes — a 2% APR markup costing $2,018 over 60 months, a 45% depreciation curve by year three — in the time it takes to glance at a figure rather than parse a paragraph.
Claude's post is genuinely superior on depth. Twelve named characters with specific math is a level of editorial craft the other two platforms did not attempt. But Claude shipped with a visible defect — inline citations rendering as HTML-escaped text — and depth without template discipline is a manuscript, not a published deliverable. On a publication's editorial grid, the comparison is asymmetric: you can add depth to polish, but you can't un-ship a defect.
ChatGPT sits a full tier behind both at 74.25. The accessible prose is real and the industry variety is real, but the two D7 punts (Charts section, Visual Assets Appendix) are the kind of shortfall that suggests the platform optimized for "looks complete" over "is complete." In a topic where a bar chart of APR tiers is the most valuable single visual the reader could see, failing to render one while listing it as a section header is the gap between a draft and a ship-ready post.
The Transferable Takeaway
The difference between a 75-point post and an 85-point post is rarely the core content — all three platforms knew the financing material. The difference is execution discipline: Do the charts render? Do the citations click? Does the post teach a generalizable skill, or does it just answer this one question? When you prompt any of these three AIs, you should demand explicit rendered artifacts (tables, charts, scripts), explicit teaching framing ("Transferable principle: …"), and explicit copy-paste inputs. These three techniques compounded to give Gemini a 13.25-point edge over ChatGPT this week.
If you're using AI to prepare for any high-stakes purchase — car, house, business equipment, capital lease — the prompt architecture that won this week is the one to copy. Demand role specification. Demand hedging suppression. Demand a printable comparison matrix. Demand conditional logic ("if my parameters indicate X, model Y"). Demand scenario-based scripts for every likely counterparty response. The F&I office relies on information asymmetry; a well-engineered prompt closes that asymmetry before you walk in the door.
Score Summary
| Dimension | Weight | ChatGPT | Claude | Gemini |
|---|---|---|---|---|
| D1 — Prompt Quality & Engineering Depth | 20 | 8.0 | 9.5 | 9.0 |
| D2 — Prompt Breakdown Clarity | 15 | 8.0 | 7.5 | 9.5 |
| D3 — Practical Examples & Industry Relevance | 15 | 7.5 | 10.0 | 7.0 |
| D4 — Writing Quality & Brand Voice | 15 | 7.5 | 9.0 | 8.5 |
| D5 — Creative Use Cases & Unexpected Angles | 10 | 7.0 | 8.0 | 9.0 |
| D6 — Actionability & Reader Value | 15 | 7.5 | 8.5 | 9.0 |
| D7 — Completeness & Template Adherence | 10 | 5.5 | 6.0 | 9.5 |
| Normalized (0-100) | 100 | 74.25 | 85.5 | 87.5 |
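The normalization is easy to check by hand: the seven weights sum to 100, so multiplying each 1-10 dimension score by its weight and dividing the total by 10 lands on the 0-100 scale and reproduces the published totals exactly. A minimal Python sketch of that arithmetic, using the scores from the table above:

```python
weights = {"D1": 20, "D2": 15, "D3": 15, "D4": 15, "D5": 10, "D6": 15, "D7": 10}

scores = {
    "ChatGPT": {"D1": 8.0, "D2": 8.0, "D3": 7.5,  "D4": 7.5, "D5": 7.0, "D6": 7.5, "D7": 5.5},
    "Claude":  {"D1": 9.5, "D2": 7.5, "D3": 10.0, "D4": 9.0, "D5": 8.0, "D6": 8.5, "D7": 6.0},
    "Gemini":  {"D1": 9.0, "D2": 9.5, "D3": 7.0,  "D4": 8.5, "D5": 9.0, "D6": 9.0, "D7": 9.5},
}

def normalized(platform: str) -> float:
    """Weighted sum of 1-10 dimension scores, divided by 10 to reach the 0-100 scale."""
    return sum(weights[d] * s for d, s in scores[platform].items()) / 10

for platform in scores:
    print(f"{platform}: {normalized(platform):.2f}")    # 74.25, 85.50, 87.50

# Per the recipe, a margin of 3.0 points or less is a statistical tie.
margin = normalized("Gemini") - normalized("Claude")
print(f"Margin (Gemini vs. Claude): {margin:.1f}")      # 2.0 -- within the tie threshold
```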
Visual Comparison
Overall Score Comparison
Dimension-by-Dimension Breakdown
Gemini leads outright on four of seven dimensions (D2, D5, D6, D7) and runs a close second to Claude on D1. Claude's D3 perfect ten (practical examples via named characters) is the highest single-dimension score of the week but cannot fully offset Gemini's systematic D2/D5/D6/D7 advantage. ChatGPT trails on most dimensions with no standout strength and a visible D7 drag.
The Prompts Behind the Posts
For readers who want to run these prompts themselves, here are the three platforms' Beginner prompts side-by-side. The Intermediate and Advanced variations escalate in complexity but follow the same architectural choices visible at the Beginner level.
ChatGPT — Beginner
You are a patient, consumer-first auto finance expert. I am getting ready to buy a car and have never shopped for an auto loan independently before. I need a 1-page, printable checklist to complete this week so my financing is locked in before I walk into a dealership. Parameters: Estimated credit score [insert]; Budget ceiling [insert]; Vehicle type [New / CPO / Used]; Trade-in [Yes/No, details]. Cover: (1) credit tier and APR expectation; (2) pre-approval process; (3) trade-in strategy or down payment strategy; (4) top 3 dealer financing traps with a one-sentence defense for each. Format as a printable checklist with checkboxes. No hedging.
Claude — Beginner
Act as a veteran credit-union loan officer advising a first-time buyer. I need a one-page, printable, week-long financing readiness checklist. Use these parameters: [credit score, budget, vehicle type, trade-in status]. Structure the response as four sections — Credit Tier Analysis, Pre-Approval Playbook, Trade-In or Down Payment Strategy, and Dealer F&I Trap Defenses. For each section, include: (a) the specific action I take this week, (b) the exact math or document I need to gather, (c) a concrete anchor (e.g., "for a $35K loan at your tier, expect a spread of X%"), and (d) a named-character example showing a real-world application. Avoid hedging language; give me definitive guidance based on my parameters.
Gemini — Beginner ★ Winner
Act as a patient, consumer-first auto finance expert. I am preparing to buy a car soon and have never independently shopped for an auto loan before; I usually just take whatever the dealer offers. I need a straightforward, actionable, 1-page checklist to complete this week to get my financing ready. Here are my parameters: My estimated credit score is: [Insert]. My budget ceiling is: $[Insert]. The vehicle category: [New / CPO / Used]. Do I have a trade-in?: [Yes/No, Year/Make/Model, miles]. Provide a clear, step-by-step checklist covering four areas. Do not use hedging language like "both have pros and cons"; give me direct, actionable advice. (1) Credit Tier & Timing: Tell me my APR tier (Superprime/Prime/Nonprime/Subprime) based on VantageScore/FICO Auto Score models. Tell me directly whether delaying 30-60 days is financially worth it for my specific budget. (2) Pre-Approval Process: Walk me through getting pre-approved from a credit union or bank. What documents? How long? Does applying hurt my credit? (3) Trade-In Strategy (if applicable): Three valuation methods (online instant offer, dealer appraisal, private sale) — when to use each. If no trade-in, explain down payment strategy. (4) Dealer Financing Traps: The top 3 specific traps (monthly payment focus, hidden APR, 84-month stretch) — one sentence on how to shut down each. Format as a printable checklist with checkboxes.
Notice the structural pattern: all three prompts use role assignment and parameter specification, but only Gemini explicitly names specific models (VantageScore/FICO Auto Score), explicitly names the traps (84-month stretch), and explicitly forbids hedging. The prompt that produces the best output is almost always the prompt that leaves the least ambiguity for the AI to paper over.
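If you plan to reuse this architecture for more than one purchase, it is worth turning the bracketed fields into real parameters rather than hand-editing them each time. A minimal sketch below; the condensed prompt text is an abbreviation of the Beginner prompts above, not a verbatim copy, and the example values come from the "Exact input the user would provide" blockquote quoted earlier.

```python
PROMPT_TEMPLATE = """\
Act as a patient, consumer-first auto finance expert. I am preparing to buy a car and have never
independently shopped for an auto loan before. I need a straightforward, actionable, 1-page checklist
to complete this week so my financing is ready before I contact a dealership.
My estimated credit score is: {credit_score}.
My budget ceiling is: ${budget:,}.
The vehicle category: {vehicle_category}.
Do I have a trade-in?: {trade_in}.
Do not use hedging language; give me direct, actionable advice. Cover: (1) my credit tier and APR
expectation, (2) the pre-approval process, (3) trade-in or down payment strategy, (4) the top 3 dealer
financing traps with a one-sentence defense for each. Format as a printable checklist with checkboxes."""

# Example values from the "Exact input the user would provide" blockquote described above.
print(PROMPT_TEMPLATE.format(
    credit_score=660,
    budget=25_000,
    vehicle_category="Used",
    trade_in="No",
))
```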
Methodology
Every Monday, Ketelsen.ai selects a weekly topic within an active series (currently "AI at the Dealership"). The same prompt architecture is submitted to Claude, ChatGPT, and Gemini under identical conditions: same parameters, same instructions, same expected output format. Each platform's response is captured as a standalone Squarespace-ready HTML post. Those three posts are then graded against a seven-dimension rubric (v2.0) with published anchor definitions, 1-10 scores with 0.5-point precision, and a weighted-sum scoring formula normalized to a 0-100 scale.
The rubric is intentionally weighted toward what non-technical readers actually need — prompt quality and engineering depth carries 20%, practical examples and actionability each carry 15%, and completeness (does the thing actually ship?) carries 10% as a floor on editorial discipline. Ties within 3.0 points are treated as dual-winner outcomes by default; this week Richard exercised the editorial tiebreak option to honor the highest normalized score and declare an outright winner.
Scoring was performed via a two-pass calibration procedure: first a side-by-side content inventory across all three platforms for each dimension, then absolute scoring with explicit evidence anchors. Evidence quotes throughout this post are pulled verbatim from the source posts.
Metadata
Series: AI at the Dealership (Week 3)
Topic: Getting Your Money Right Before You Shop — Financing Readiness
Winner: Gemini (87.5 / 100)
Runner-Up: Claude (85.5 / 100)
Third: ChatGPT (74.25 / 100)
Margin (Gemini vs. Claude): 2.0 points (within tie threshold; editorial tiebreak applied)
Rubric Version: v2.0 (7 dimensions, weighted)
Reading Time: ~12 minutes
SEO Title: Claude vs ChatGPT vs Gemini: AI Auto-Financing Prompt Comparison
SEO Description: We ran the same auto-financing readiness prompt through Claude, ChatGPT, and Gemini. Gemini wins Week 3 with rendered charts, pedagogical framing, and copy-paste inputs — beating Claude's exceptional depth by 2 points.
Primary Tags: AI Prompt Comparison, Auto Financing, Claude vs ChatGPT vs Gemini, F&I Negotiation, Dealer Reserve, Pre-Approval Strategy
Categories: Cross-Platform Comparisons, AI at the Dealership, Financial Strategy
Difficulty Levels Covered: Beginner, Intermediate, Advanced (in source posts)
Published: 2026-04-20