November 16, 2006

Two down…

Filed under: MOD Writing - Dave Rathbun @ 6:37 am

Last night I closed another BETA and declared “Custom Search Flood Controls” to be a release candidate. I also submitted it to the MOD Team for approval. The hardest part? Deciding which category to put it in.

First, a brief (for me) background on what it does. In 2.0.20 the phpBB Group introduced the concept of Search Flood Control. We already had posting flood control, which keeps people (or posting “bots”) from posting over and over again with very little delay. You may have noticed that “posting” flood control also impacts edits. The reason is simple… someone can put just as much load on the server by posting a very large post and then immediately editing and saving it over and over again as they would by posting. So it makes sense.

What did not make sense to me was the inclusion of a new flood control option for search without more options. For example, a search for unanswered posts really doesn’t impact a server very much; it simply looks for topics that have no replies. That search only hits one table, and while it could be run over and over (and over and over) it really doesn’t have the impact on your server that an extensive keyword search would have.

Here’s the simple query for doing an “unanswered” search:

$sql = "SELECT topic_id
FROM " . TOPICS_TABLE . "
WHERE topic_replies = 0
AND topic_moved_id = 0";

It gets a bit more complex if you have forum authorization rules; in fact, it gets more complex than it needs to be, since the forum_id is already on the topics table. But I digress. :-) If you search for keywords then we first have to process the “stopwords” and the “synonyms” text files, which means disk I/O. We have to determine if we’re searching the topic title and message, or just the message. Or with a MOD, just the title.

The point is, some searches are worse in terms of server load than others. One of my favorite searches is the “New posts since last visit” as I find it’s a great way to catch up on what I’ve missed. I will click the search, pick a topic, read it – sometimes quite quickly – and then return to search again. Often I will get denied because of the rapid return.

Or another example, searching for MODs at phpBB.com. I frequently see topics in phpBB Discussion that are really MOD Requests. If I have time, I like to do a MOD Search and post some options in the topic as I move it to MOD Requests. But finding the right set of keywords can be difficult, even if I know what I’m looking for. So I search, get no (or too many) results, and go back to search again.

Only to be told I have to wait. :mad:

But again I digress.

The point of this MOD is to provide the board owner / administrator with an additional level of control and flexibility as to how they want to manage the various searches performed on their board. There are keyword searches (heavy load) and author searches (not so heavy). There are “ego searches” (find my posts) and the aforementioned search for unanswered topics. This MOD will let you set different flood control limits for each and every one of them.
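
To make the idea concrete, here is a minimal sketch of per-search-type flood limits. It is not the MOD’s actual code: the function name, the search-type keys, the interval values, and the fallback for unknown types are all assumptions made up for illustration.

```php
<?php
// Hypothetical per-search-type flood intervals, in seconds.
// Keys and values are illustrative only, not the MOD's real settings.
$flood_intervals = array(
    'keyword'    => 30, // heavy: stopwords, synonyms, word matching
    'author'     => 10, // not so heavy
    'egosearch'  => 5,  // "find my posts"
    'unanswered' => 0,  // 0 means no limit for this search type
);

/**
 * Return how many seconds the user still has to wait (0 = search allowed).
 */
function search_flood_wait($search_type, $last_search_time, $now, $intervals, $flood_enabled = true)
{
    // If search flood control is turned off, every per-type limit is ignored.
    if (!$flood_enabled) {
        return 0;
    }

    // Assumption: an unknown search type falls back to the strictest limit.
    $interval = isset($intervals[$search_type]) ? $intervals[$search_type] : max($intervals);

    $elapsed = $now - $last_search_time;

    return max(0, $interval - $elapsed);
}
```

With these sample values, a keyword search repeated 10 seconds after the previous search would still have 20 seconds to wait, while an unanswered search would go through immediately.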

Of course if you turn the search flood control off, then all of these limits are ignored as well.

Finally, and here’s the part that I put in just for myself, you can include exemptions for various user levels. So I can set it up so that board administrators or moderators are automatically excluded from any search flood controls. You could even set this up so that guests have limits while registered users do not. That could be a nice perk to advertise as a way to entice some of your lurkers into registering on your board. Options are nice to have, and this MOD definitely provides a nice tweak (in my !($humble) opinion ;-) ) for a board.
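
The exemption idea could look something like the sketch below. Again, this is only an illustration under my own assumptions: the level names (`guest`, `user`, `mod`, `admin`) and the function name are hypothetical and do not come from the MOD or from phpBB itself.

```php
<?php
// Hypothetical exemption settings: which user levels skip search flood control.
// Level names are made up for this example.
$flood_exempt = array(
    'guest' => false,
    'user'  => false,
    'mod'   => true,
    'admin' => true,
);

/**
 * Return true if the given user level is exempt from search flood control.
 */
function is_flood_exempt($user_level, $exempt)
{
    // Unknown or missing levels are treated as not exempt.
    return !empty($exempt[$user_level]);
}
```

The “perk for registering” setup would simply flip `'user'` to `true` while leaving `'guest'` at `false`.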

But back to the original question: which category should I select when submitting this MOD? I ended up submitting it for inclusion in the “Security” category. The main reason is that search flood control was introduced as a security measure (most likely, in my opinion, because they wanted that option as part of phpbb.com), and since this MOD enhances that feature, it made sense to submit it to that category.

This is the second MOD I have submitted this week, and the third that I currently have in the MOD Queue for review. I have a few more that I will try to get resolved before or during the holidays.

I like this MOD, hope you will too.

The current code can be viewed here.

6 Comments »

  1. This sounds like something of use. Here’s a question: why not move those synonyms and stopwords into the database?

    Comment by damnian — November 16, 2006 @ 10:04 pm

  2. In most cases with static information (like stop words) you want to simply read from disk, rather than go through the overhead of running a query against the database. For example, I have a standard routine that I try to include in my published MODs that caches various options (like Page Permissions) so that I can simply read from disk.

    Think about it this way: when you query a database you are accessing an engine via a control language (SQL) that is essentially designed to go out to the disk and read information. Why not skip the middle man, and read the disk directly? ;-) This works best when you don’t need to filter your I/O, meaning you just want to read the entire contents in a big gulp.

    phpBB3 is supposed to make use of a similar strategy, from what I understand.

    Comment by dave.rathbun — November 16, 2006 @ 10:34 pm

  3. phpBB2 is already capable of caching templates, although the feature is hidden inside the contrib folder. You can choose between file and database.

    The problem with synonyms and stopwords, however, is that, unlike with precompiled templates, you don’t know what you’re looking for at the moment. You have to scan the file until you bump into your word (if it’s not there, all for nothing). This is exactly what indexed DB tables were created for.

    Comment by damnian — November 16, 2006 @ 10:42 pm

  4. I would say that template caching and content caching are not exactly the same thing… I’ll use the phpbb_ranks table as an example. Once you set up your board, you probably only rarely change the post count required to ascend to a new rank. But every time you view someone’s profile, or view a topic, the phpbb_ranks table is queried. And it’s queried in total, meaning the entire table is loaded into an array.

    So it would make sense (and in fact I’ve done this here) to write out the ranks data to a file, and then read it in when I need it.

    The stopwords are used in a similar fashion (the entire file is read into an array). The code doesn’t read the file line by line, looking for a word. Instead it reads the entire file into an array. Hmm, as I look at the code in includes/functions_search.php I can definitely see room for improvement. It doesn’t work the way I thought it did. I assumed they would use the word text as the index to an array, and then check the search words entered by the user against that index. But instead they loop through the entire array.

    So you’re right, in this case, especially with a large (and growing) stopwords list it might be more efficient to pull that back into the database.

    I’ve started writing a MOD (I know you’re surprised) that allows me to manage the stopwords via an admin page. I’ll have to think about whether to rewrite the search process to use database stopwords rather than a file list and include it in that MOD as well.

    Comment by dave.rathbun — November 16, 2006 @ 10:59 pm

  5. Come on Dave, you’re leaving me with no MOD ideas! :-P

    Comment by damnian — November 16, 2006 @ 11:11 pm

  6. No, you take it. :-) It will be months before I get back to the stopwords manager anyway, if at all. If you want to see what I had done so far, I opened an ALPHA topic at phpbb.com.

    Comment by dave.rathbun — November 17, 2006 @ 1:29 am
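
The keyed stopword lookup described in comment 4 can be sketched as follows. This is not the code from includes/functions_search.php; the function name and the sample words are made up, and the stopword lines are supplied inline here rather than read from a real file.

```php
<?php
/**
 * Remove stopwords from a list of search words using a keyed lookup
 * instead of scanning the whole stopword list for every word.
 */
function filter_stopwords($search_words, $stopword_lines)
{
    // Normalize entries (strip newlines, lowercase) and drop empty lines.
    $stopwords = array_filter(
        array_map('strtolower', array_map('trim', $stopword_lines)),
        'strlen'
    );

    // Flip the list so that each stopword becomes an array key.
    $lookup = array_flip($stopwords);

    $kept = array();
    foreach ($search_words as $word) {
        // isset() on a key is a direct lookup, not a walk through the array.
        if (!isset($lookup[strtolower($word)])) {
            $kept[] = $word;
        }
    }

    return $kept;
}

// Sample data standing in for the lines of a stopwords text file.
$stopword_lines = array("the\n", "and\n", "a\n", "\n");
$result = filter_stopwords(array('The', 'search', 'and', 'flood'), $stopword_lines);
```

The whole file is still read in one gulp, so the disk I/O stays the same; only the per-word check changes from a scan to a key lookup.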
