In April I noticed a lot of activity on this domain from a single address, 159.149.133.234, using an agent called UbiCrawler. Some investigation showed that it is an experimental web crawler from an Italian university. It hit me 45,000 times during that month (12% of all hits, but only 4% of the bandwidth), so I ignored it as a one-off experiment that may have gone wrong. There is very little information about the crawler around, apart from some academic material.
But they have hit again this month: 36,000 hits (16%) so far, in only two visits, using 5% of the bandwidth. They must be trickling the requests out so that they don’t impact the load too much, while keeping them close enough together that the log analysis thinks they haven’t left.
I think I have had enough of them experimenting on my site; I can’t see them being of any benefit to me. Now to consider how to block them. Blocking by address would be the easiest way. Robert tells me that the address range 159.148/15 is in Latvia (which seems strange for an Italian university). Alternatively, I could block by User-Agent string, which would continue to work if they moved to a different address.
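
To compare the two options, here is a rough Python sketch of the decision each one makes. It is only an illustration of the logic, not what my server actually runs; in practice the rule would probably live in the web server configuration. The function names are made up for the example, and matching on the substring "ubicrawler" is an assumption, since I have not checked the exact User-Agent string they send. Note that 159.148.0.0/15 spans 159.148.0.0 through 159.149.255.255, so it does include the crawler's address.

```python
import ipaddress

# Assumption: block the whole /15 range mentioned above. To block only the
# one observed host, use "159.149.133.234/32" instead.
BLOCKED_NETWORKS = [ipaddress.ip_network("159.148.0.0/15")]

# Assumption: the crawler's User-Agent contains this text (case-insensitive).
BLOCKED_AGENT_SUBSTRINGS = ["ubicrawler"]


def blocked_by_address(remote_addr: str) -> bool:
    """Return True if the client address falls inside a blocked network."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in net for net in BLOCKED_NETWORKS)


def blocked_by_agent(user_agent: str) -> bool:
    """Return True if the User-Agent header contains a blocked substring."""
    ua = user_agent.lower()
    return any(fragment in ua for fragment in BLOCKED_AGENT_SUBSTRINGS)


def should_block(remote_addr: str, user_agent: str) -> bool:
    """Combine both checks: either one is enough to refuse the request."""
    return blocked_by_address(remote_addr) or blocked_by_agent(user_agent)
```

The address check is simple but stops working as soon as the crawler moves to another network, and a range as wide as a /15 may catch unrelated visitors. The User-Agent check follows the crawler wherever it goes, but only for as long as it keeps identifying itself honestly.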