I’ll start by saying I am very much an AI-sceptic and, at points, seething at the whole landscape. What I do appreciate, though, is that the genie is already out of the bottle and it’s too big to fit back in. AI isn’t going anywhere, although the current tech hype cycle looks increasingly fragile.
It’s the exploitation that makes me mad, especially. I run this publication and AI has stolen everything. And not just this publication either: it’s even stolen premium content, such as Every Layout.
To combat this theft-by-default policy, Cloudflare have come up with a “pay per crawl” system.
The way it works is that a Cloudflare-served site, just like this one, can return a 402 status code, indicating that a payment is required for a crawler to access a page. Along with that, the publisher can set a crawler-price header, indicating how much a crawler needs to pay to crawl that content.
Crawlers either have to come back, indicating a willingness to pay, or proactively set a crawler-max-price. As long as the publisher’s crawler-price doesn’t exceed crawler-max-price, all is good and fair. If the crawler doesn’t want to pay, they get a 403 access denied status code.
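To make that flow a bit more concrete, here’s a rough sketch of the decision as I understand it from the description above. It is not Cloudflare’s implementation. The `respond` function, the `CrawlRequest` class and the `refuses_to_pay` flag are all made up for illustration, and answering a too-low offer with another 402 is my assumption, because that case isn’t spelled out.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class CrawlRequest:
    # Value of the crawler-max-price header, if the crawler sent one.
    max_price: Optional[Decimal] = None
    # Hypothetical flag: how a refusal is actually signalled isn't described.
    refuses_to_pay: bool = False


def respond(crawler_price: Decimal, request: CrawlRequest) -> tuple[int, dict[str, str]]:
    """Return (status code, response headers) for a crawler request."""
    if request.refuses_to_pay:
        # The crawler doesn't want to pay: access denied.
        return 403, {}
    if request.max_price is not None and crawler_price <= request.max_price:
        # The asking price doesn't exceed what the crawler will pay.
        return 200, {}
    # No offer yet, or (assumption) an offer below the asking price:
    # payment required, with the price quoted back to the crawler.
    return 402, {"crawler-price": str(crawler_price)}
```

Calling `respond(Decimal("0.01"), CrawlRequest())` gives a 402 with the price in the header, while the same call with `max_price=Decimal("0.05")` gives a 200. I’ve used `Decimal` rather than floats because these are prices.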
Do I think this will work?
I’m sceptically hopeful. It’s not very clear how the payments are processed and how it’s all distributed, but I’m sure it’ll clear up as the product solidifies and comes out of private beta.
There needs to be a change because the way AI crawlers persistently steal content is straight-up not fair. There are billions sloshing around in the AI industry, even though leaders claim that not being able to steal content would “kill” it, and publishers are chasing scraps in a lot of cases.
We’re certainly going to experiment with this new system at Piccalilli because frankly, we deserve payment from these leech-like companies. What I don’t want to do is block access to actual people and assistive technology though, so we’ll wait a little while to ensure Cloudflare have progressed the system some more.
Let’s see how this all pans out over time…