> there's no reason you couldn't detect it algorithmically
For any real-world classifier there is a precision/recall tradeoff. Do you care more about false positives or false negatives? If you truly want to minimize false positives, you should simply always predict negative, at the cost of never catching anything.
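To make the tradeoff concrete, here is a rough sketch in Python. The scores and labels are made up for illustration, and `precision_recall` is a hypothetical helper, not anyone's actual detector. Raising the threshold drives false positives to zero, but recall drops with it.

```python
def precision_recall(scores, labels, threshold):
    """Return (precision, recall, false_positives) when flagging score >= threshold."""
    tp = fp = fn = 0
    for score, is_positive in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_positive:
            tp += 1
        elif predicted and not is_positive:
            fp += 1
        elif not predicted and is_positive:
            fn += 1
    # Precision is undefined (None) when nothing is flagged at all.
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall, fp

# Made-up detector scores; True means the text really was machine-written.
scores = [0.95, 0.80, 0.70, 0.55, 0.40, 0.20]
labels = [True, True, False, True, False, False]

for threshold in (0.3, 0.6, 0.9, 1.1):
    print(threshold, precision_recall(scores, labels, threshold))
```

A threshold of 1.1 is the "always predict negative" case: zero false positives, and zero recall.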
For your example “it’s not just X it’s Y” I agree it’s a red flag. But the pattern originated in human writing, which is where the LLM picked it up. So some people did (and likely still do) use that construction.
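As a rough illustration of why that matters, a naive pattern match (my own hypothetical regex, not a vetted detector) fires on the construction no matter who wrote it:

```python
import re

# Hypothetical pattern for illustration only: "not just ... it's ..." within
# one sentence, accepting a straight or curly apostrophe.
NOT_JUST = re.compile(r"\bnot just\b[^.!?]*?\bit['’]s\b", re.IGNORECASE)

def flags(text):
    """Return True if the text contains the 'not just X, it's Y' construction."""
    return NOT_JUST.search(text) is not None

# The match depends only on the wording, not on the author.
print(flags("It's not just a tool, it's a platform."))
print(flags("This is just a test."))
```

So a rule like this can have decent recall on LLM output, but every human who writes that way becomes a false positive.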