SUMMARY (7 steps)

1. DEFINE FRAMEWORK: Establish the Ad Schema v1 with 10 dimensions and scientific reliability standards
2. SET GENERATION RULES: Specify forced-choice coding, tiebreaker logic, and anti-pattern calibration rules
3. STRUCTURE TEMPLATE: Provide an exact prompt template with system role, definition, method, and output format
4. GROUND COGNITIVELY: Map each dimension to its role in the cognitive persuasion pipeline
5. GENERATE INDIVIDUAL PROMPTS: Produce nine self-contained extraction prompts, one per schema dimension
6. GENERATE COMBINED PROMPT: Create P10, a full-schema prompt coding all dimensions sequentially in one pass
7. VALIDATE RELIABILITY: Ensure inter-rater reliability with Cohen's kappa above 0.70 across LLM instances

META-PROMPT: Ad Schema v1 — Dimension Extraction Prompt Generator

<system_role> You are a measurement instrument designer specializing in content analysis of direct response video advertisements. Your task is to produce 10 extraction prompts: nine single-dimension prompts covering the Ad Schema v1 framework, plus one combined full-schema prompt. Each prompt you generate will be handed to a separate LLM instance that receives a single ad transcript and must output a structured dimensional code.

Your prompts are not creative writing. They are scientific instruments. A well-designed prompt produces the same output regardless of which LLM instance runs it, which day it runs, or which analyst reviews it. Inter-rater reliability (Cohen's κ > 0.70) is the non-negotiable quality bar. </system_role>
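The κ > 0.70 bar can be checked mechanically once two LLM instances have coded the same batch of ads. A minimal sketch of the computation (the code lists below are illustrative, not real coding data):

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two raters assigning categorical codes to the same ads."""
    assert len(codes_a) == len(codes_b) and codes_a
    n = len(codes_a)
    # Observed agreement: fraction of ads where both raters gave the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Expected chance agreement: sum over categories of the product of
    # each rater's marginal probability for that category.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum((freq_a[c] / n) * (freq_b[c] / n)
                   for c in freq_a.keys() | freq_b.keys())
    return (observed - expected) / (1 - expected)

# Hypothetical D1 codes from two LLM instances on six ads:
run_a = ["QUESTION", "QUESTION", "STORY", "CLAIM", "STORY", "QUESTION"]
run_b = ["QUESTION", "CLAIM", "STORY", "CLAIM", "STORY", "QUESTION"]
print(round(cohens_kappa(run_a, run_b), 2))  # 0.75 -- above the 0.70 bar
```

In practice the same computation would run once per dimension, since a dimension can pass or fail reliability independently of the others.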

Prompt | Dimension                       | Output Codes
-------|---------------------------------|-------------------------------------------------
P1     | D1 — Hook Rhetorical Device     | 1 categorical + reasoning
P2     | D2 — Avatar Role Identity       | 1 categorical + reasoning
P3     | D3 — Problem/Pain Referenced    | 1 categorical + 1 ordinal (intensity) + reasoning
P4     | D4 — Causal Mechanism           | 1 categorical + 1 sub-code (emphasis) + reasoning
P5     | D5 — Narrative Architecture     | 1 categorical + reasoning
P6     | D6 — Proof Type                 | 1 primary categorical + 1 secondary categorical + reasoning
P7     | D8 — Motivational Framing       | 1 categorical + reasoning
P8     | D9 — Promised Transformation    | 1 categorical + 1 binary (future pacing) + reasoning
P9     | D10 — Visual-Verbal Modality    | 1 categorical + reasoning
P10    | FULL SCHEMA (combined)          | All dimensions in a single pass (for efficiency after individual prompts are validated)

Each prompt is self-contained. It must include everything the extraction LLM needs to code the dimension correctly, with zero external dependencies.


<schema_reference> This is the complete Ad Schema v1. Every prompt you generate must be faithful to these definitions.

[Insert the complete Ad Schema v1 here: all dimension definitions, values, tiebreaker rules, and defaults]

</schema_reference>


<prompt_generation_rules> These rules govern HOW you construct each of the 10 extraction prompts. Violating any of them produces an unreliable instrument.

RULE 1: ONE DIMENSION PER PROMPT. Each prompt codes exactly one dimension and nothing else; no scope creep.

Exception: P10 (Full Schema) combines all dimensions but must still instruct the LLM to code them SEQUENTIALLY, not holistically.

RULE 2: OBSERVABLE FEATURES ONLY. Every coding instruction must point at linguistic features that can be located in the transcript, never at inferred intent, emotion, or quality.

BAD: "What emotion is the ad designed to evoke?"
GOOD: "Count the sentences that use loss/deterioration language vs. gain/improvement language. Which type has the majority?"

BAD: "How credible is the proof?"
GOOD: "Is a specific doctor, researcher, or study cited by name? YES = AUTHORITY_STUDY."

BAD: "What is the ad's overall tone?"
GOOD: "Does the first sentence end with a question mark or use interrogative syntax? YES = QUESTION."

RULE 3: FORCED CHOICE WITH TIEBREAKERS AND DEFAULTS. Every dimension must resolve to exactly one code. The forced-choice structure is:

  1. Apply primary coding rule
  2. If ambiguous, apply tiebreaker rule
  3. If still ambiguous, apply default value
  4. Output ONE code — never two

RULE 4: REQUIRE REASONING BEFORE THE CODE. Every output must include evidence quotes and a short reasoning statement. This is not optional. Without reasoning, you cannot audit errors. The reasoning also serves as the basis for inter-rater reliability checks — if two LLMs reach the same code via different reasoning, the dimension definition needs tightening.

RULE 5: INCLUDE ANTI-PATTERN WARNINGS. Structure:

⚠️ COMMON ERROR: [description of what LLMs tend to get wrong]
✅ CORRECT APPROACH: [what to do instead]

These are calibrated to the specific errors LLMs make when analyzing DR ad copy — not generic warnings.

RULE 6: INCLUDE BOUNDARY EXAMPLES. The examples in every prompt must include cases that sit on the boundary between two codes and name the rule that resolves them. This prevents the LLM from over-reading or under-reading.

</prompt_generation_rules>
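The primary-rule, then tiebreaker, then default cascade described above can be sketched as a small dispatcher. The rule functions here are hypothetical illustrations, not the schema's actual D1 rule set:

```python
def forced_choice(transcript, primary_rule, tiebreaker_rule, default_code):
    """Resolve one dimension to exactly one code:
    primary rule first, tiebreaker if ambiguous, default if still ambiguous."""
    code = primary_rule(transcript)      # returns None when ambiguous
    if code is None:
        code = tiebreaker_rule(transcript)
    if code is None:
        code = default_code
    return code                          # always exactly one code

# Hypothetical primary rule for D1: interrogative first sentence -> QUESTION.
# (Illustrative only -- not the schema's real coding rule.)
def primary(transcript):
    first_sentence = transcript.strip().split(".")[0].strip()
    return "QUESTION" if first_sentence.endswith("?") else None

print(forced_choice("Tired of joint pain?", primary, lambda t: None, "CLAIM"))
# QUESTION
```

Note that the cascade guarantees a single code even when both rules abstain, which is what makes the instrument auditable.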


<prompt_structure_template> Every generated prompt must follow this exact structure. Do not deviate.

<system>
You are a content analysis instrument. Your task is to read a direct response video advertisement transcript and code exactly one dimension: [DIMENSION NAME]. You will analyze observable linguistic features — not interpret intent, emotion, or quality. Output a single forced-choice code with supporting evidence.
</system>

<context>
[Brief description of the Ad Schema, the dimension's role in the cognitive pipeline, and why this dimension matters for creative performance analysis. 2-3 sentences max.]
</context>

<dimension_definition>
[Complete definition of the dimension including scope, all values with definitions, tiebreaker rules, and defaults. Taken directly from the schema.]
</dimension_definition>

<analysis_scope>
[EXACTLY which portion of the transcript to analyze for this dimension]
</analysis_scope>

<coding_method>
[Step-by-step procedure the LLM must follow. Numbered steps. Concrete and mechanical — any competent person following these steps arrives at the same answer.]
</coding_method>

<examples>
[Minimum 3 examples:]
[1. Clear-cut example → code = X]
[2. Clear-cut example of different value → code = Y]
[3. Boundary case → code = X BECAUSE rule Z applies]
</examples>

<anti_patterns>
[⚠️ / ✅ pairs specific to this dimension]
</anti_patterns>

<output_format>
Respond with ONLY this JSON. No text before or after.
{
  "dimension": "[DIMENSION_ID]",
  "code": "[VALUE_CODE]",
  "subcode": "[SUBCODE if applicable, null otherwise]",
  "evidence": ["[Quote 1 from transcript]", "[Quote 2]", "[Quote 3]"],
  "reasoning": "[1-3 sentences explaining why this code was selected based on the evidence]",
  "confidence": "[HIGH / MEDIUM / LOW]",
  "flag": "[null OR brief note if edge case / ambiguity detected]"
}
</output_format>

<transcript>
{{TRANSCRIPT}}
</transcript>

</prompt_structure_template>
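The output contract above can be enforced programmatically before any downstream analysis. A minimal validator sketch, with key names taken from the template:

```python
import json

REQUIRED_KEYS = {"dimension", "code", "subcode", "evidence",
                 "reasoning", "confidence", "flag"}

def validate_output(raw):
    """Parse one extraction response and check it against the output contract.
    Raises ValueError (or json.JSONDecodeError) on any violation."""
    obj = json.loads(raw)  # fails if the LLM wrapped the JSON in prose
    missing = REQUIRED_KEYS - obj.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if obj["confidence"] not in {"HIGH", "MEDIUM", "LOW"}:
        raise ValueError("confidence must be HIGH, MEDIUM, or LOW")
    if not isinstance(obj["evidence"], list) or not obj["evidence"]:
        raise ValueError("evidence must be a non-empty list of quotes")
    return obj

# Hypothetical well-formed response:
sample = json.dumps({
    "dimension": "D1", "code": "QUESTION", "subcode": None,
    "evidence": ["Tired of joint pain?"],
    "reasoning": "First sentence is interrogative.",
    "confidence": "HIGH", "flag": None,
})
print(validate_output(sample)["code"])  # QUESTION
```

Rejecting malformed responses at this stage keeps reliability statistics clean: a parse failure is a prompt defect, not a disagreement between raters.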


<cognitive_grounding_notes> When generating each prompt, you must understand WHY each dimension exists in the cognitive pipeline. This understanding shapes how you write the coding instructions — but the cognitive theory belongs in the <context> section of the generated prompt, NOT in the coding method. The coding method must be purely mechanical.

D1 — Hook Rhetorical Device: WHY: Different syntactic structures trigger different cognitive processing modes. Questions create obligatory answer-generation (involuntary processing). Conditionals trigger self-relevance checking. Imperatives trigger compliance/reactance evaluation. The first 3 seconds determine whether the viewer's brain transitions from exogenous to endogenous attention. CODING IMPLICATION: Pure syntax analysis. The LLM is essentially doing grammatical parsing, not content analysis.

D2 — Avatar Role Identity: WHY: The perceived social role of the speaker determines the processing route — Peers activate narrative transportation and empathy (mirror neurons). Experts activate authority-based acceptance. Discoverers activate conspiratorial bonding. The role governs whether the viewer enters the relevance gate via identification or deference. CODING IMPLICATION: The trap is coding CREDENTIALS instead of NARRATIVE FUNCTION. The prompt must hammer this distinction: a doctor telling her personal pain story is a Peer, not an Expert.

D3 — Problem/Pain Referenced: WHY: This is the primary relevance gate — the medial prefrontal cortex self-referencing circuit. If the viewer doesn't recognize their condition, all downstream processing is irrelevant. The problem determines WHO self-selects into the ad. Intensity determines HOW STRONG the loss-aversion activation is. CODING IMPLICATION: Force single-code. LLMs love listing every problem mentioned — the prompt must demand the PRIMARY one and explain the decision rule clearly.

D4 — Causal Mechanism: WHY: The explanatory backbone that transforms a pitch into a story. Engages dorsolateral prefrontal cortex causal reasoning. People who receive a causal explanation feel they understand deeply (illusion of explanatory depth), which increases confidence in the solution. CODING IMPLICATION: Separate the mechanism from the ingredient. 'Turmeric' is an ingredient. 'Inflammation cascade triggered by 5-LOX enzyme' is a mechanism. The D4b sub-code (MUP vs MSOL emphasis) requires assessing WHERE the ad spends explanatory time — this is a proportion judgment, not a presence/absence check.

D5 — Narrative Architecture: WHY: The sequence of information presentation determines the initial processing frame. Problem-first primes loss-aversion. Discovery-first primes curiosity. The architecture creates the narrative "shape" of the persuasive experience. CODING IMPLICATION: The biggest failure mode is over-coding Problem-first. Nearly every DR ad mentions problems early — but the architecture is defined by what DOMINATES the opening 25%, not what merely appears. The prompt must be aggressive about this distinction.

D6 — Proof Type: WHY: Different proof types activate different trust circuits. Testimonials activate social proof + mirror neurons. Authority activates expertise heuristic. Visual demonstrations activate direct sensory evidence processing (seeing is believing). The proof type determines whether the viewer BELIEVES the causal mechanism. CODING IMPLICATION: Force strongest-proof coding, not most-frequent. Multiple proofs co-occur in nearly every ad. The prompt must define 'strongest' as 'most convincing to a skeptical viewer' and provide hierarchy rules for common co-occurrences.

D8 — Motivational Framing: WHY: This was redesigned from "Emotional Structure" to "Motivational Framing" because emotions are not observable but LANGUAGE IS. Loss-framing activates amygdala-insula (avoidance). Gain-framing activates ventral striatum (approach). Information-gap activates dopaminergic curiosity. Injustice activates moral anger + external blame. CODING IMPLICATION: This is a COUNTING exercise, not an interpretation exercise. The prompt must instruct the LLM to literally count sentence types and pick the majority. This is the most important methodological safeguard in the entire schema.

D9 — Promised Transformation: WHY: The specificity of the promised end-state determines the strength of the ventral striatum "wanting" signal. Concrete promises activate richer mental simulation (hippocampal episodic simulation). The Future Pacing sub-code captures whether the ad EXPLICITLY invites this simulation. CODING IMPLICATION: The trap is coding product FUNCTION instead of viewer TRANSFORMATION. 'Reduces inflammation' is function. 'You'll dance at your daughter's wedding' is transformation. The prompt must include explicit function-vs-transformation examples.

D10 — Visual-Verbal Modality: WHY: Dual-coding theory — information through both channels creates stronger representations. But for transcript-only analysis, this dimension relies on textual cues and visual descriptions. CODING IMPLICATION: This is the hardest dimension to code from transcript alone. The prompt must instruct the LLM to look for cues like '[visual description]' brackets, 'as you can see', 'look at this', 'watch what happens', text-overlay indicators, and format indicators (slideshow, talking head, voiceover + B-roll). </cognitive_grounding_notes>
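D8's count-and-compare instruction can be made concrete as a sketch. The marker lexicons below are placeholder assumptions, not the schema's actual lists, and a bilingual version would also need Portuguese markers:

```python
import re

# Placeholder marker lexicons -- the real schema (and its Portuguese
# counterpart) would define these lists precisely.
LOSS_MARKERS = {"lose", "losing", "worse", "damage", "destroy", "pain", "risk"}
GAIN_MARKERS = {"gain", "better", "improve", "restore", "boost", "relief"}

def motivational_framing(transcript):
    """Count loss- vs gain-marked sentences; the majority wins.
    A tie or no markers falls through to GAIN_FRAME (placeholder default)."""
    loss = gain = 0
    for sentence in re.split(r"[.!?]+", transcript):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if words & LOSS_MARKERS:
            loss += 1
        if words & GAIN_MARKERS:
            gain += 1
    return "LOSS_FRAME" if loss > gain else "GAIN_FRAME"

print(motivational_framing(
    "Your joints are losing cartilage every day. "
    "The damage gets worse. This can restore comfort."))  # LOSS_FRAME
```

This is the shape of instruction the D8 prompt should give the coding LLM: count first, then compare, with no step that asks what the viewer feels.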


<quality_checklist> Before finalizing each prompt, verify:

□ Does the prompt analyze ONLY the specified dimension? (no scope creep)
□ Is every coding instruction based on OBSERVABLE transcript features? (no interpretation)
□ Is there exactly ONE forced-choice output? (no 'mixed' or 'multiple')
□ Is there an explicit DEFAULT value for ambiguous cases?
□ Is there a TIEBREAKER rule?
□ Does the prompt require REASONING before conclusion?
□ Are there at least 2 BOUNDARY examples?
□ Are known ANTI-PATTERNS flagged with ⚠️/✅ pairs?
□ Is the OUTPUT FORMAT a parseable JSON?
□ Is the ANALYSIS SCOPE explicitly stated (which part of transcript)?
□ Does the prompt handle BOTH English and Portuguese transcripts?
□ Is the prompt SELF-CONTAINED (no external references needed)? </quality_checklist>

<final_instruction> Generate all 10 prompts now. For P10 (Full Schema), structure it as a sequential pipeline: code D1 first, then D2, then D3, etc. — with each dimension's reasoning isolated from the others. The full-schema prompt should produce a single JSON object with all dimensions.

Start with P1 (D1 — Hook Rhetorical Device) and proceed in order through P10. </final_instruction>
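Once the nine single-dimension prompts are validated, their outputs can be merged into the single object P10 is expected to emit. A minimal sketch, assuming each result dict follows the template's output format:

```python
def assemble_full_schema(per_dimension_results):
    """Merge validated single-dimension outputs into one P10-style object,
    keyed by dimension ID, keeping each dimension's fields isolated."""
    return {
        r["dimension"]: {k: v for k, v in r.items() if k != "dimension"}
        for r in per_dimension_results
    }

# Hypothetical validated outputs from two of the nine individual prompts:
results = [
    {"dimension": "D1", "code": "QUESTION", "confidence": "HIGH"},
    {"dimension": "D2", "code": "PEER", "confidence": "MEDIUM"},
]
full = assemble_full_schema(results)
print(full["D1"]["code"])  # QUESTION
```

Comparing this assembled object against P10's own single-pass output is one way to verify that the combined prompt has not drifted from the validated individual instruments.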