I released a post yesterday that showed some statistics about “when” my registration spammers make their attempts. I didn’t talk much about where they come from. When I started looking at that, I found something quite interesting.
I ran a query to count failed registration attempts per IP address on my largest board. Here is the result for the top 25 IP addresses:
+----------+----------+
| log_ip   | count(*) |
+----------+----------+
| 3a41ef6a |       94 |
| 3a41efcd |       94 |
| 57766a04 |       84 |
| NULL     |       81 |
| 57766617 |       77 |
| 3a41efd2 |       74 |
| 3a41efcb |       74 |
| 3a41efd5 |       68 |
| 57635c26 |       67 |
| 3a41efc8 |       66 |
| 3a41efc6 |       64 |
| 3d992223 |       64 |
| 3a41efc9 |       64 |
| 3a41efdc |       64 |
| 3a41efd4 |       62 |
| 3a41efc7 |       62 |
| 3a41efd1 |       62 |
| 3a41efca |       60 |
| 3a41efd9 |       60 |
| 3a41efc4 |       60 |
| 5042fa09 |       60 |
| 3a41efcc |       58 |
| 3a41efd0 |       58 |
| 3a41ef32 |       58 |
| 3a41efd8 |       58 |
+----------+----------+
Notice anything? Go ahead and take a closer look. Yes, these IP addresses are encoded using the phpBB2 encode_ip() function, but that’s not what I am talking about. Did you notice how many of them have “3a41ef” as the first six characters?
Dropping An Octet
I noticed that right away, so I reran the query with the last two characters of the encoded address (that is, the last octet) dropped, and recounted. (By the way, this is why I store and work with encoded IP addresses: it makes string operations much easier.) This time I will limit the result set to the top 5:
+--------+----------+
| log_ip | count(*) |
+--------+----------+
| 3a41ef |     1944 |
| 57766a |       94 |
| 577666 |       90 |
| NULL   |       81 |
| d13e0d |       70 |
+--------+----------+
All I can say is
Nearly two thousand blocked registration attempts from one IP address range. If ever there was an argument for a server-wide IP ban, this might be it.
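The query itself is not shown above, so here is a minimal sketch of the "drop the last octet and recount" step. It uses Python's built-in sqlite3 module so that it runs anywhere. The table name reg_log and the sample rows are hypothetical; only the log_ip column name comes from the output above. On MySQL, LEFT(log_ip, 6) would play the role of substr().

```python
import sqlite3


def top_prefixes(conn, prefix_len=6, limit=5):
    """Count log rows grouped by the first prefix_len characters of log_ip."""
    sql = (
        "SELECT substr(log_ip, 1, ?) AS prefix, count(*) AS attempts "
        "FROM reg_log "              # hypothetical table name
        "GROUP BY prefix "
        "ORDER BY attempts DESC, prefix "
        "LIMIT ?"
    )
    return conn.execute(sql, (prefix_len, limit)).fetchall()


# Tiny in-memory sample; these rows are illustrative, not the real log.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reg_log (log_ip TEXT)")
sample = ["3a41ef6a", "3a41efcd", "3a41efd2", "57766a04", "57766617", None]
conn.executemany("INSERT INTO reg_log VALUES (?)", [(ip,) for ip in sample])

result = top_prefixes(conn)
print(result[0])  # ('3a41ef', 3)
```

Because each octet is exactly two hex characters, trimming two characters from the string is the same as widening the match from a single address to a whole /24.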
What About Other Boards?
When I see something like this, I tend to move on to other boards and see if they exhibit the same behavior. Here are the top five IP addresses (first six characters) for another board:
+--------+----------+
| log_ip | count(*) |
+--------+----------+
| c30272 |      115 |
| 57766a |      104 |
| 577666 |       98 |
| 5042fa |       70 |
| 577674 |       59 |
+--------+----------+
Two of the top five match between the two boards. And here are the values from a third board:
+--------+----------+
| log_ip | count(*) |
+--------+----------+
| 5042fa |      127 |
| 57766a |       70 |
| cbdf99 |       58 |
| 577676 |       45 |
| 3d9922 |       40 |
+--------+----------+
Hmm. There is 57766a again. So that’s three boards where that IP range was used.
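Comparing the three top-five lists by eye works for three boards, but it gets tedious with more. As a rough sketch, the same comparison can be done with a counter. The prefixes below are copied from the three tables above; the function name shared_prefixes and the board labels are just illustrative.

```python
from collections import Counter

# Top-five prefixes per board, copied from the three result tables.
top5 = {
    "board 1": ["3a41ef", "57766a", "577666", None, "d13e0d"],
    "board 2": ["c30272", "57766a", "577666", "5042fa", "577674"],
    "board 3": ["5042fa", "57766a", "cbdf99", "577676", "3d9922"],
}


def shared_prefixes(boards, min_boards=2):
    """Return {prefix: number of boards} for prefixes seen on min_boards or more."""
    seen = Counter()
    for prefixes in boards.values():
        # Use a set so a prefix counts once per board; skip NULL rows.
        seen.update({p for p in prefixes if p is not None})
    return {p: n for p, n in seen.items() if n >= min_boards}


shared = shared_prefixes(top5)
for prefix, boards in sorted(shared.items(), key=lambda item: -item[1]):
    print(prefix, boards)
```

Only 57766a appears on all three boards; 577666 and 5042fa each appear on two.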
Who Are These People?
At this point I have decided to do further investigation on 57766a and 3a41ef, so I need to decode their IP address information. Decoding 57766a gives me 87.118.106 and decoding 3a41ef gives me 58.65.239. Since the second address has by far the highest number of attempts, I will start there.
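For anyone who wants to check the decoding by hand: each pair of hex characters is one octet. The following is a small Python sketch of that conversion. It is not phpBB's own code, and the function names decode_ip_prefix and encode_ip are made up for this example.

```python
def decode_ip_prefix(encoded: str) -> str:
    """Turn a hex-encoded IPv4 address (or leading part of one) into dotted form."""
    if len(encoded) % 2 != 0 or not 2 <= len(encoded) <= 8:
        raise ValueError("expected 2, 4, 6 or 8 hex characters")
    # Every two hex characters represent one octet (0-255).
    octets = [int(encoded[i:i + 2], 16) for i in range(0, len(encoded), 2)]
    return ".".join(str(octet) for octet in octets)


def encode_ip(dotted: str) -> str:
    """Inverse operation: dotted decimal to two lowercase hex characters per octet."""
    return "".join(f"{int(part):02x}" for part in dotted.split("."))


print(decode_ip_prefix("57766a"))  # 87.118.106
print(decode_ip_prefix("3a41ef"))  # 58.65.239
```

Since 0x57 = 87, 0x76 = 118, and 0x6a = 106, the first prefix decodes to 87.118.106, which matches the value above.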
I first went to the ARIN “whois” database and entered the IP address. I can’t say I was surprised to find out that the IP address is registered to the Asia-Pacific region. As I mentioned in a prior post, I am seeing an upswing in spammer registrations using .cn (China) domain names.
The search results from ARIN pointed me to APNIC and so off I went. I entered the same IP address search there and got this:
inetnum: 126.96.36.199 - 188.8.131.52
netname: HOSTFRESH
descr: HostFresh
descr: Internet Service Provider
country: HK
So now I know that this IP address range belongs to a Hong Kong ISP named HostFresh. Hm. Have I seen that before? I wasn’t sure, so I did a search and found all sorts of complaints posted about them. I guess I am not unique. They have an abuse email address, but I am willing to bet that complaints sent in that direction go nowhere, so I didn’t bother.
What about the other IP address? Again, I started with ARIN, and again it pointed me somewhere else. This time, however, I was sent to the European database rather than Asia-Pacific. From there, I found that the owner of this range of IP addresses is from Germany:
inetnum: 184.108.40.206 - 220.127.116.11
netname: DE-KEYWEB-III
descr: Keyweb AG IP Network
country: DE
Given that this range of IP addresses shows up in three different logs, I am assuming that once again I am not unique in being targeted by this group. After searching the web, I found that this IP range is not only known for spamming but is also the source of a large number of web-scraping attacks. If you’re not familiar with web-scraping attacks, the scraper visits your site and captures all of your content. In some cases, they will then launch a similar site with all of your content. This dilutes the value of your content and can even hurt your ranking in search engine results. So that’s a bad thing.
I have recently taken steps to block known screen-scraper bots from my boards, but that’s probably another post.
Any Legitimate Users?
Before I take strong action against these IP ranges, I feel I should check whether there are any legitimate users from the same locations. So back to my big board I go to examine the posting history from these two IP ranges. First I checked 3a41ef, then 57766. Here are the results:
mysql> select poster_ip, count(*) from POSTS_TABLE where poster_ip like '3a41ef%' group by 1;
Empty set (0.73 sec)

mysql> select poster_ip, count(*) from POSTS_TABLE where poster_ip like '57766%' group by 1;
Empty set (0.69 sec)
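I dropped a character off of that last IP check on purpose to broaden the range somewhat, so that it also covers the 577666, 577674, and 577676 prefixes that showed up above. Neither range has a single legitimate post, so I am getting ready to do a server-wide ban on both of them.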
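It helps to know exactly how wide a prefix match like '57766%' is before banning anything. Each hex character fixes four bits of the address, so a five-character prefix pins down 20 bits. Here is a minimal sketch of that arithmetic using Python's standard ipaddress module; the function name prefix_to_network is invented for this example.

```python
import ipaddress


def prefix_to_network(hex_prefix: str) -> ipaddress.IPv4Network:
    """Map a hex-encoded IP prefix (1-8 hex characters) to the network it matches."""
    if not 1 <= len(hex_prefix) <= 8:
        raise ValueError("expected between 1 and 8 hex characters")
    fixed_bits = len(hex_prefix) * 4       # each hex character fixes 4 bits
    padded = hex_prefix.ljust(8, "0")      # fill the unfixed bits with zeros
    return ipaddress.IPv4Network((int(padded, 16), fixed_bits))


for prefix in ("3a41ef", "57766"):
    net = prefix_to_network(prefix)
    print(prefix, net, net[0], "-", net[-1])
# 3a41ef 58.65.239.0/24 58.65.239.0 - 58.65.239.255
# 57766 87.118.96.0/20 87.118.96.0 - 87.118.111.255
```

So the broadened check covers sixteen /24 blocks (87.118.96 through 87.118.111), not just the 87.118.106 block that was decoded earlier. The CIDR form is also what most firewall tools expect for a range ban.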
- ARIN (North America) IP Whois Lookup
- APNIC – Asia Pacific Network Information Center
- RIPE NCC Search Page
Note that if you start with ARIN they will point you to the proper search database for your next query. I have provided the specific links in this case just for convenience.