Claude AI — product case study

Improving Claude's creative-tool surface: I led discovery, prioritization, and the PRD for image-generation integration.

Written Mar 10, 2025

$ read overview 01 / 06

Executive Summary

A product case study on improving Claude's creative-tool surface. I led discovery, prioritization, and the PRD for image-generation integration — taking the work from user-research signals to a shippable plan.

Anthropic PBC is an American AI company founded in 2021 by former OpenAI employees, including siblings Daniela and Dario Amodei. Claude is its flagship assistant — a family of LLMs designed to be helpful, honest, and harmless, used today by businesses, developers, educators, marketers, and individual users across free and paid tiers.

Available models

Claude Opus

The highest-performing model for complex analysis and advanced tasks

Exceptional reasoning capabilities
Highest accuracy for complex tasks
Advanced problem-solving
Nuanced understanding of context

Claude Sonnet

Balances capability and performance for efficient, high-throughput tasks

Excellent balance of speed and capability
Ideal for most business applications
Cost-effective for daily use
Strong multilingual support

Claude Haiku

Optimized for speed and lightweight actions

Ultra-fast response times
Efficient for simple tasks
Low computational requirements
Ideal for mobile applications

Claude 3.7 Sonnet

Latest model featuring hybrid reasoning capabilities (released February 2025)

Hybrid reasoning architecture
Enhanced problem-solving
Improved contextual understanding
Advanced tool usage capabilities

Core functionalities

Milestones

2021
- Founded by seven former OpenAI employees
2022
- Received $580M in funding, including $500M from FTX
2023
- Officially introduced Claude to the public
- Secured a $4B partnership with Amazon
- Received a $2B commitment from Google
2024
- Released Claude 3 with three models: Opus, Sonnet, Haiku
- Launched Claude Team plan and iOS app
- Released Claude 3.5 Sonnet with improved performance
- Added "Computer use" feature to Claude
- Partnered with Palantir and AWS for U.S. intelligence agencies
- Made Claude 3.5 Haiku available to all users
2025
- Introduced Claude 3.7 Sonnet with hybrid reasoning capabilities

$ read discovery 02 / 06

Problem Discovery

I set out to identify the most important pain points Claude users face and the opportunities to better serve those users. This research formed the foundation for the prioritized problem statements that follow.

Reddit data analysis

Sampled 150 Reddit posts mentioning Claude (r/ChatGPT, r/Claude, r/ArtificialIntelligence) plus 120 complaint and feedback threads.

Top use cases

Writing Assistant 42
Research 35
Code Helper 28
Learning 22
Creative Writing 18

Top pain points

Hallucinations 38
Cost 30
Limited Knowledge 25
Slowness 20
Privacy 15

Key insights

Writing assistant is the most common use case — drafting, editing, and refining content.
Research assistance is highly valued for synthesizing information across multiple sources.
Hallucinations remain the top concern, particularly for factual or technical content.
Users compare Claude to ChatGPT favorably for conversational depth, less so for technical knowledge.
Many users switched to Claude specifically for its larger context window and more nuanced responses.

User interviews

Jordan Garcia

24 · Fresno, California

Bio

Senior CIS major at Fresno State. Tech-savvy student using AI assistants daily for academic work and personal projects. Deep interest in machine learning; relies on AI to understand complex concepts and complete coding assignments.

Quote

"Claude is better at helping me with the machine learning stuff than ChatGPT. The way it explains things makes more sense to me."

Core needs

Help understanding complex algorithms and coding concepts
Assistance with academic writing and research
Summarization of technical material
Scheduling and organization support
Code solutions for ML projects

Frustrations

Python version conflicts and dependency management
Switching between AI tools for different capabilities
Initial setup friction on new projects
File-management limitations requiring third-party tools

Pain points

Context loss when asking for summarization or paraphrasing
Manual rework to adapt suggestions to specific needs
No visual output for certain projects

Ideal solution

An assistant that combines Claude's strengths in explaining ML concepts with image generation, better contextual summarization, and a UX that makes it the go-to for all tasks rather than tool-switching.

$ read prioritization 03 / 06

Problem Prioritization

I used a weighted scoring model to rank candidate problems by user impact, reach, business value, competitive differentiation, and technical feasibility, surfacing the highest-leverage work to address first.

Prioritized problem statements

Problem 1 Highest priority

Response Complexity Problem

How might we provide users with appropriately detailed responses that match their specific needs without requiring additional prompting?

Impact: Improve core UX, compete with Grok
Metrics: Reduced follow-up prompts

Problem 2 High priority

Python Dependency Management

How might we enhance Claude's assistance to account for Python environment constraints?

Impact: Strengthen position as coding assistant
Metrics: Increased usage for Python projects

Problem 3 Medium priority

Creative Capabilities Gap

How might we expand Claude's capabilities to include image generation and editing?

Impact: Meet user needs, open new use cases
Metrics: Feature adoption, reduced switching

Problem 4 Medium priority

Research Depth Limitations

How might we enhance Claude's research capabilities across multiple sources?

Impact: Position as complete research assistant
Metrics: Increased research-related prompts

Prioritization framework

Weighted scoring model

I scored each problem on a 1–5 scale across five weighted criteria, then summed weighted scores to determine final priority.

User Impact — weight 2.0
Reach — weight 1.5
Business Value — weight 1.8
Competitive Differentiation — weight 1.2
Technical Feasibility — weight 1.0
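
The scoring rule above can be sketched in a few lines. This is a minimal illustration, assuming each problem gets one integer rating from 1 to 5 per criterion. The weights come from the list above, but the per-criterion ratings are hypothetical: the case study publishes only the totals, and the example values are just one of several combinations that sum to 33.0.

```python
# Criterion weights as listed in the case study.
WEIGHTS = {
    "user_impact": 2.0,
    "reach": 1.5,
    "business_value": 1.8,
    "competitive_differentiation": 1.2,
    "technical_feasibility": 1.0,
}


def weighted_score(ratings: dict[str, int]) -> float:
    """Return the sum of rating * weight over all criteria (ratings are 1-5)."""
    if set(ratings) != set(WEIGHTS):
        raise ValueError("ratings must cover exactly the weighted criteria")
    for name, rating in ratings.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"{name} rating {rating} is outside 1-5")
    total = sum(WEIGHTS[name] * rating for name, rating in ratings.items())
    return round(total, 1)  # round away floating-point noise


# Hypothetical ratings for illustration only (not the author's actual inputs).
example_ratings = {
    "user_impact": 5,
    "reach": 4,
    "business_value": 5,
    "competitive_differentiation": 5,
    "technical_feasibility": 2,
}

max_score = weighted_score({name: 5 for name in WEIGHTS})
print(weighted_score(example_ratings), "of a possible", max_score)
```

With these weights the ceiling is 37.5 points and the floor is 7.5, which gives a sense of scale for the totals reported under Results.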

Results

Creative Capabilities Gap 33.0 pts
Voice Input Feature 29.0 pts
Research Depth Limitations 27.3 pts
Response Complexity 25.0 pts
Python Dependency Management 17.8 pts

Key insights & recommendations

Creative Capabilities (33.0 pts)

Market growth: 17.4% CAGR through 2030
Competitive gap: Major competitors offer this
New revenue: Opens up new use cases

Voice Input (29.0 pts)

Accessibility: Expands to voice users
Industry trend: Toward multimodal interfaces
Effort: Moderate; uses existing tech

Research Depth (27.3 pts)

In progress: "Compass" feature in testing
Lower priority: Competitors developing similar

Other Priorities (#4–5)

Response Complexity: Addressed by extended thinking mode (25.0 pts)
Python Dependencies: Limited reach, recently improved (17.8 pts)

$ read solution 04 / 06

Problem Solution

Creative Capabilities Gap scored highest in the weighted model (33.0 pts), making it the highest-priority problem. I generated diverse solution paths, narrowed them down to a third-party API integration, and selected Midjourney as the provider after a structured comparison.

Brainstorming diverse solutions

Third-party API integration

High impact

Idea: Leverage existing models via APIs
Feasibility: High; many robust APIs exist
Impact: Rapidly enhances platform capabilities

In-house development

Moderate impact

Idea: Develop proprietary image generation
Feasibility: Low; requires significant resources
Impact: Long-term strategic differentiation

Hybrid model with refinement

High impact

Idea: Combine generation with refinement tools
Feasibility: Moderate; requires integration work
Impact: Boosts satisfaction with personalization

Creative platform integration

High impact

Idea: Partner with platforms like Adobe
Feasibility: Depends on partnership agreements
Impact: Leverages tools users already trust

Provider selection

After evaluating the solution paths, third-party API integration offered the best balance of impact, feasibility, and time-to-market. I compared the three leading providers:

Provider	Image quality	UI components	Integration	Score
Midjourney (selected)	9/10 (27)	10/10 (30)	8/10 (16)	112
DALL-E	8/10 (24)	6/10 (18)	9/10 (18)	98
Stable Diffusion	8/10 (24)	7/10 (21)	8/10 (16)	100
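
The parenthesized figures in the table are consistent with each rating being multiplied by a weight of 3, 3, and 2 respectively. Those weights are inferred from the numbers, not stated in the source. The sketch below reproduces that arithmetic and shows how much of each Score is not accounted for by the three visible columns. The remainder presumably comes from criteria the table does not display, but the source does not say which.

```python
# Ratings (out of 10) and reported totals, copied from the comparison table.
PROVIDERS = {
    "Midjourney": {"ratings": (9, 10, 8), "reported_total": 112},
    "DALL-E": {"ratings": (8, 6, 9), "reported_total": 98},
    "Stable Diffusion": {"ratings": (8, 7, 8), "reported_total": 100},
}

# Inferred (assumed) weights for image quality, UI components, integration.
INFERRED_WEIGHTS = (3, 3, 2)


def visible_subtotal(ratings: tuple[int, ...]) -> int:
    """Weighted sum over the three columns that the table actually shows."""
    return sum(r * w for r, w in zip(ratings, INFERRED_WEIGHTS))


for name, row in PROVIDERS.items():
    subtotal = visible_subtotal(row["ratings"])
    remainder = row["reported_total"] - subtotal
    print(f"{name}: visible {subtotal}, reported {row['reported_total']}, "
          f"not shown {remainder}")
```

On the visible columns alone Midjourney still leads (73 against 60 and 61), so the ranking of the selected provider does not depend on the undisplayed criteria.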

Why Midjourney

Superior UI & customization: Robust components that appeal to Claude users
High image quality: Artistic outputs meeting creative standards
Competitive integration: Strong developer support and documentation
Cost & scalability: Proven pricing models and reliable performance

Implementation plan

Phase 1 — Initial integration

Connect Claude API with Midjourney via custom wrapper
Implement basic prompt-to-image conversion
Timeline: 4–6 weeks for MVP
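
One way the Phase 1 wrapper might be shaped is a thin layer that normalizes the user's text and hands it to a pluggable image backend. Everything named here (`ImageProvider`, `FakeProvider`, `ImageWrapper`, the placeholder URLs) is hypothetical and used only for illustration. No real Midjourney or Claude endpoint is called, and the actual integration surface would depend on what the provider exposes.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class GeneratedImage:
    url: str
    prompt: str


class ImageProvider(Protocol):
    """Hypothetical interface that any image backend would implement."""

    def generate(self, prompt: str, count: int) -> list[GeneratedImage]: ...


class FakeProvider:
    """Stand-in backend so the sketch runs without network access."""

    def generate(self, prompt: str, count: int) -> list[GeneratedImage]:
        return [
            GeneratedImage(url=f"https://example.invalid/img/{i}.png", prompt=prompt)
            for i in range(count)
        ]


class ImageWrapper:
    """Thin layer between the chat surface and an image provider."""

    def __init__(self, provider: ImageProvider) -> None:
        self._provider = provider

    def prompt_to_images(self, user_text: str, count: int = 4) -> list[GeneratedImage]:
        prompt = " ".join(user_text.split())  # collapse stray whitespace
        if not prompt:
            raise ValueError("image prompt must not be empty")
        return self._provider.generate(prompt, count)


wrapper = ImageWrapper(FakeProvider())
images = wrapper.prompt_to_images("  a watercolor   fox  ")
print(len(images), images[0].prompt)
```

Keeping the provider behind an interface means the backend could be swapped later without touching the chat-side code, which matters if the provider choice is revisited.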

Phase 2 — Enhanced features

Add image editing and refinement capabilities
Implement context-aware image suggestions
Timeline: 2–3 months after initial release

Success metrics

User adoption rate: >40% in first 3 months
Satisfaction score: >4.2/5 for image generation
Reduction in platform switching: 30%+

Expected outcomes

Increased user satisfaction and retention
New revenue opportunities through premium tiers
Competitive advantage over single-modal AI assistants

$ read design-prototypes 05 / 06

Design Implementation

The proposed design implements a seamless image-generation workflow inside Claude. Four stages take a request from natural-language input to assets integrated into the user's working files.

Stage 1: Entry point

Entry point interface — The initial interface maintains Claude's minimalist aesthetic with a clean, focused design.

Key features

Familiar environment with a clearly defined input area
Simple prompt bar for natural-language interaction
No specialized commands required to initiate image generation

Design philosophy

The entry point keeps Claude's minimalist aesthetic while subtly introducing image generation. The interface prioritizes familiarity for existing users while making the new capability discoverable without overwhelming the chat experience.

Key design considerations

Accessibility first: The conversational interface makes advanced image generation accessible to non-technical users.
Contextual continuity: The design maintains connection between generated images and their intended purpose throughout the workflow.
Progressive disclosure: Complex options surface only when relevant, preventing cognitive overload.
Visual feedback: Clear presentation of results with multiple options encourages experimentation and refinement.
Seamless integration: Generated assets become immediately available for use in other creative contexts.

$ read prd 06 / 06

Product Requirements Document

The PRD that synthesizes the work above into a shippable specification: integrating Midjourney's image-generation API into Claude.

TL;DR

This project integrates Midjourney's image-generation API into the Claude platform, enabling users to create and manage AI-generated images directly within conversations. It addresses a key user need for creative visual capabilities, drives richer collaboration for content creators, developers, and businesses, and lands streamlined UX, high-quality outputs, and seamless workflow integration as the big wins.

Business goals

Increase user engagement on Claude by 25% within six months of launch.
Reduce platform-switching to other AI tools by 30%.
Increase paid-plan conversions by 15%.
Strengthen Claude's competitive position against other AI assistants.
Enable new monetization opportunities around premium image features.

User goals

Create high-quality images directly within Claude conversations.
Easily refine and iterate on generated images.
Seamlessly integrate generated images into their workflows.
Experience consistent image quality across devices and platforms.
Share and collaborate around visual content.

Non-goals

Building an in-house image generation model from scratch.
Competing with dedicated graphic-design tools.
Creating video generation capabilities at this stage.
Complex image editing or manipulation tools.
Integration with stock photography libraries.

User stories

Content Creator

I want to generate images based on my descriptions, so I can visualize ideas without switching platforms.
I want to refine generated images through conversational feedback, so I can iteratively improve outputs.
I want to save and organize my generated images, so I can access them across projects.

Developer

I want to generate UI mockups and concept visuals, so I can prototype ideas quickly.
I want to incorporate generated images into my codebase, so I can streamline development.
I want consistent image outputs that match my specifications, so I can rely on them for professional projects.

Business User

As a marketing manager, I want to create on-brand imagery, so I can maintain consistent visual communications.
I want to generate multiple image variations quickly, so I can pick the best options for presentations.
I want to control who can generate images on my team, so I can manage resource usage.

Functional requirements

Image generation core

Priority: High

Generate images based on natural-language prompts.
Provide multiple style options (photorealistic, artistic, concept art, etc.).
Support various aspect ratios (square, portrait, landscape).
Enable image refinement through follow-up prompts.
Support batch generation of multiple images.
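
The requirements above suggest a small request object. The following is a sketch under stated assumptions: the type and field names are hypothetical, the default style is arbitrary, and the square default mirrors the size/ratio default mentioned in the user experience flow.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Style(Enum):
    PHOTOREALISTIC = "photorealistic"
    ARTISTIC = "artistic"
    CONCEPT_ART = "concept art"


class AspectRatio(Enum):
    SQUARE = "square"
    PORTRAIT = "portrait"
    LANDSCAPE = "landscape"


@dataclass(frozen=True)
class ImageRequest:
    prompt: str
    style: Style = Style.ARTISTIC                   # assumed default
    aspect_ratio: AspectRatio = AspectRatio.SQUARE  # flow defaults to square
    batch_size: int = 1                             # images per request

    def __post_init__(self) -> None:
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.batch_size < 1:
            raise ValueError("batch_size must be at least 1")

    def refine(self, follow_up: str) -> "ImageRequest":
        """Return a new request with the follow-up appended; settings carry over."""
        return replace(self, prompt=f"{self.prompt}; {follow_up.strip()}")


first = ImageRequest("a lighthouse at dusk", batch_size=4)
second = first.refine("add fog")
print(second.prompt, second.aspect_ratio.value, second.batch_size)
```

Making the request immutable means each refinement produces a new value, which lines up with the version tracking described later in the flow.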

Integration & user experience

Priority: High

Seamless access via icon in the Claude chat interface.
Preview generated images before finalizing.
Clear indication of image generation in progress.
Natural-language control of image parameters.
Mobile-responsive image viewer.

Image management

Priority: Medium

Save generated images to user gallery.
Export images in multiple formats (PNG, JPG).
Organize images by conversation or project.
Share images via link or download.
Delete or archive unwanted images.

User experience flow

Step 1 — Initiate image creation
- User clicks the camera icon or types a natural-language request.
- Modal appears with text field for image description.
- Style options are presented with visual examples.
- Size/ratio selector is available; defaults to square.
Step 2 — Refine request
- User enters detailed description or selects from suggestions.
- AI offers clarifying questions if the prompt is vague.
- Preview of similar-style images appears when available.
- User submits with clear feedback on processing time.
Step 3 — Review results
- Four image variations appear in a grid.
- User can hover to enlarge each option.
- Options to regenerate, refine, or select are clearly presented.
- Selected images appear directly in the conversation.
Step 4 — Iterate or finalize
- User can request adjustments through conversation.
- Changes are applied incrementally with version tracking.
- Final images can be saved to gallery or exported.
- Unobtrusive feedback prompt appears after completion.
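
The four steps above can be read as a small state machine. The transition table below is one interpretation, not something the flow specifies: it assumes that regenerating keeps the user in review, that refining from review returns to the request step, and that an adjustment in step 4 leads back to review. All names are hypothetical.

```python
from enum import Enum, auto


class Step(Enum):
    INITIATE = auto()
    REFINE_REQUEST = auto()
    REVIEW_RESULTS = auto()
    ITERATE_OR_FINALIZE = auto()
    DONE = auto()


# Allowed moves between steps (assumed; see the lead-in for the reasoning).
TRANSITIONS = {
    Step.INITIATE: {Step.REFINE_REQUEST},
    Step.REFINE_REQUEST: {Step.REVIEW_RESULTS},
    Step.REVIEW_RESULTS: {
        Step.REVIEW_RESULTS,       # regenerate
        Step.REFINE_REQUEST,       # refine the prompt
        Step.ITERATE_OR_FINALIZE,  # select an image
    },
    Step.ITERATE_OR_FINALIZE: {Step.REVIEW_RESULTS, Step.DONE},
    Step.DONE: set(),
}


def advance(current: Step, target: Step) -> Step:
    """Move to target if the transition is allowed, else raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.name} to {target.name}")
    return target


def walk(path: list[Step]) -> Step:
    """Follow a sequence of steps and return the final state."""
    state = path[0]
    for target in path[1:]:
        state = advance(state, target)
    return state


happy_path = [
    Step.INITIATE,
    Step.REFINE_REQUEST,
    Step.REVIEW_RESULTS,
    Step.ITERATE_OR_FINALIZE,
    Step.DONE,
]
print(walk(happy_path).name)
```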

Narrative

Jordan, a senior CIS major at Fresno State, is working on a machine-learning project and needs conceptual diagrams to explain complex algorithms. Previously he had to switch between Claude for explanations and another tool for visuals. With the new image-generation feature, Jordan asks Claude to "create a diagram showing how convolutional neural networks process image data."

Within seconds, Claude presents four visual options. Jordan selects one but asks Claude to "make the layers more distinct and add labels." Claude refines the image based on this feedback and incorporates it directly into their conversation about neural networks. He saves the image for his presentation without ever breaking his flow.

When explaining the concept to classmates, Jordan shares both Claude's text and the visuals together — a more comprehensive learning experience. The time saved and the output quality strengthen his preference for Claude over competitors and lead him to upgrade to a paid plan.

Success metrics

Metric	Objective	Method
Adoption rate	50% of active users try the feature within 3 months	Feature usage tracking
Retention impact	15% increase in retention for users who use image features	Cohort analysis
Conversion rate	15% increase in free-to-paid conversions	Plan upgrade tracking
Image generation success	98% successful completions	Error-rate monitoring
User satisfaction	CSAT score > 4.5 / 5 for image generation	Post-usage surveys
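
A launch review could check observed values against the objectives in the table with a few comparisons. In this sketch the targets come from the table, while the observed numbers are made up purely to exercise the check. Treating the CSAT target as a strict "greater than" and the others as "at least" is an assumption about how the objectives are meant to be read.

```python
import operator

# (comparison, target) per metric; targets taken from the table above.
TARGETS = {
    "adoption_rate": (operator.ge, 0.50),
    "retention_lift": (operator.ge, 0.15),
    "conversion_lift": (operator.ge, 0.15),
    "generation_success": (operator.ge, 0.98),
    "csat": (operator.gt, 4.5),
}


def evaluate(observed: dict[str, float]) -> dict[str, bool]:
    """Return, per metric, whether the observed value meets its objective."""
    return {
        name: compare(observed[name], target)
        for name, (compare, target) in TARGETS.items()
    }


# Hypothetical observations, not real results.
observed = {
    "adoption_rate": 0.52,
    "retention_lift": 0.11,
    "conversion_lift": 0.15,
    "generation_success": 0.991,
    "csat": 4.5,
}

for name, met in evaluate(observed).items():
    print(f"{name}: {'met' if met else 'missed'}")
```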

Project timeline

Medium-large: 8–10 weeks end-to-end, including testing and staged rollout.

1. Design & planning (2 weeks)
2. Core API integration (2 weeks)
3. Frontend implementation (3 weeks)
4. Testing & optimization (2 weeks)
5. Launch & monitoring (1 week)
