ITCooky Recipes

Lets cooky it yammy things!

Lets install a super fast search engine Sphinx in to phpBB3 forum!

дата November 10, 2019

Surely the same thing happened to you. You make a website, it grows, Apache can’t take it anymore, you put Nginx, it pulls for a couple of years, then it starts to slow down MySQL, you buy more powerful hosting, it also rolls for a couple of years, but then it starts to slow down again and now In 15 years you discover that you only had to activate a module in PHP and everything would fly even without Sphinx!!! Don’t you?

How to understand what is slow in a phpBB forum?
If you have a forum with a couple of million posts, then it is clear that it will not work normally without optimization, even the creators of phpBB will say that the forum is too large for the search to work in the integrated engiens (by the way, themselves, recommend installing Sphinx in such cases).

You have to run on the server console
top

And look load average show the values in 1, 5 and 15 minutes. In general, it is accepted that the load is normal if it is less than the number of processor cores. 0.5 for one core, 1.5 for two! But here everything is relative, according to my expirience in two cores, the server starts to slow down on 5-7 (generation of pages from 4 seconds), and works constantly for 3-5 (generation of pages 0.4-1.5 seconds)! This is the total load on the server and it is not always clear what should be optimized.

However, phpBB has a diagnostic tool, that is – debuging. In the config.php forum configuration file, uncomment or type

@define('PHPBB_DISPLAY_LOAD_TIME', true);
@define('DEBUG', true);
@define('DEBUG_EXTRA', true);

Now everyone will see at the bottom of the forum a line with information on how long the page was generated, and not how long it has been since the link was clicked and the page appeared, that depends of the connection speed.

And if you navigate the forum under the administrator, there will also be a SQL Explain link to the page where statistics are on how long MySQL took to perform its operations and also PHP

SQL Report

Page generated in 1.0217 seconds with 13 queries + 3 queries returning data from cache

Time spent on mysqli queries: 0.02616s | Time spent on PHP: 0.9955s

It turns out that PHP consumes more time, and not MySQL, as I’ve always thought!

Here begins mysticism. A couple of large forum administrators noticed and told the creators of phpBB that the mbstring module must be activated in PHP for normal forum perfomance, and of course this should be written in the instructions and in the forum requirements, but … The creators keep silent and do nothing! This is rare!
My forum stopped braking when the search load passed to Sphinx. But the search results came out very slowly (sometimes that didn’t happen before timeout), until i have activated mbstring

Lets activate the mbstring PHP module
I have CentOS 6.10 and php 7.2 I have just updated it from CentOS 6.7 and php 5.6 while searching for the reason of few forum perfomace (now of course I only blame PHP)

veiw versión php
php -v
see modules
php -m
the same in another way, show me mbstring if its present
php -m | grep mbstring

Install module
yum install php-mbstring
Restart php-fpm (or what deals with PHP on your web server, apache maybe)
/sbin/service php-fpm restart
And everything goes surprisingly fast, heaven and earth, before and after … if after that your search still slows down, you definitely need Sphinx!

Lets install Sphinx
You can download the third version from the developer site www.sphinxsearch.com but I didn’t understand how to install it!

I will install version 2, it is already in the site archive.

Download it
wget https://sphinxsearch.com/files/sphinx-2.2.11-2.rhel6.i386.rpm
I install what is necessary for Sphinx
yum install postgresql-libs unixODBC
yum install rpm

y mas
rpm -Uhv sphinx-2.2.11-2.rhel6.i386.rpm
make necessary folders
mkdir {/usr/www/sphinx/,/usr/www/sphinx/log}
chown -R sphinx:sphinx /usr/www/sphinx

To configure phpBB, almost everything that is needed is described here
wiki.phpbb.com/Sphinx_Fulltext_Search

We go to the ACP of our phpBB3 forum to GENERAL> Search settings and select Sphinx Fulltext in Serach Backend and write the route where the sphinx is /usr/www/sphinx/

After clicking Submit , a sphinx configuration file will be generated.

CAUTION: here you must look at parameter
mem_limit = 32M
if it is too big it can cause MySQL disconnection

Then you have to go to MANTAINANCE> Search index and click on Sphinx Fulltext (activate) the Create index button.

It will simply create something, it is not indexing yet. Other indexes can be deleted: they only takes up space

The configuration generated by the forum is inserted here
vi /etc/sphinx/sphinx.conf

We change the lines for your data

sql_user = dbuser
sql_pass = dbpassword

Here you must enter a password for your database and user, it can be seen in the forum folder in the config.php file

CAUTION: If there is # in the password, then the sphinx will decide that the next string is not part of the password, but a comment!!! If you password has # it should be written like this
\#

We change the morphology parameter according to your linguistic preferences

morphology = stem_ru

That allows you to search for the same words but that end with different letters. I turned it on, but sometimes it gives strange results: if a word is cut by a space, it also considers it as a letter

That code for the Russian language

charset_table = U+FF10..U+FF19->0..9, 0..9, U+FF41..U+FF5A->a..z, U+FF21..U+FF3A->a..z, A..Z->a..z, a..z, U+0149, U+017F, U+0138, U+00DF, U+00FF, U+00C0..U+00D6->U+00E0..U+00F6, U+00E0..U+00F6, U+00D8..U+00DE->U+00F8..U+00FE, U+00F8..U+00FE, U+0100->U+0101, U+0101, U+0102->U+0103, U+0103, U+0104->U+0105, U+0105, U+0106->U+0107, U+0107, U+0108->U+0109, U+0109, U+010A->U+010B, U+010B, U+010C->U+010D, U+010D, U+010E->U+010F, U+010F, U+0110->U+0111, U+0111, U+0112->U+0113, U+0113, U+0114->U+0115, U+0115, U+0116->U+0117, U+0117, U+0118->U+0119, U+0119, U+011A->U+011B, U+011B, U+011C->U+011D, U+011D, U+011E->U+011F, U+011F, U+0130->U+0131, U+0131, U+0132->U+0133, U+0133, U+0134->U+0135, U+0135, U+0136->U+0137, U+0137, U+0139->U+013A, U+013A, U+013B->U+013C, U+013C, U+013D->U+013E, U+013E, U+013F->U+0140, U+0140, U+0141->U+0142, U+0142, U+0143->U+0144, U+0144, U+0145->U+0146, U+0146, U+0147->U+0148, U+0148, U+014A->U+014B, U+014B, U+014C->U+014D, U+014D, U+014E->U+014F, U+014F, U+0150->U+0151, U+0151, U+0152->U+0153, U+0153, U+0154->U+0155, U+0155, U+0156->U+0157, U+0157, U+0158->U+0159, U+0159, U+015A->U+015B, U+015B, U+015C->U+015D, U+015D, U+015E->U+015F, U+015F, U+0160->U+0161, U+0161, U+0162->U+0163, U+0163, U+0164->U+0165, U+0165, U+0166->U+0167, U+0167, U+0168->U+0169, U+0169, U+016A->U+016B, U+016B, U+016C->U+016D, U+016D, U+016E->U+016F, U+016F, U+0170->U+0171, U+0171, U+0172->U+0173, U+0173, U+0174->U+0175, U+0175, U+0176->U+0177, U+0177, U+0178->U+00FF, U+00FF, U+0179->U+017A, U+017A, U+017B->U+017C, U+017C, U+017D->U+017E, U+017E, U+0410..U+042F->U+0430..U+044F, U+0430..U+044F, U+4E00..U+9FFF, U+0400->U+0435, U+0401->U+0435, U+0402->U+0452, U+0452, U+0403->U+0433, U+0404->U+0454, U+0454, U+0405->U+0455, U+0455, U+0406->U+0456, U+0407->U+0456, U+0457->U+0456, U+0456, U+0408..U+040B->U+0458..U+045B, U+0458..U+045B, U+040C->U+043A, U+040D->U+0438, U+040E->U+0443, U+040F->U+045F, U+045F, U+0450->U+0435, U+0451->U+0435, U+0453->U+0433, U+045C->U+043A, U+045D->U+0438, U+045E->U+0443, U+0460->U+0461, U+0461, U+0462->U+0463, U+0463, U+0464->U+0465, U+0465, U+0466->U+0467, U+0467, U+0468->U+0469, U+0469, U+046A->U+046B, U+046B, U+046C->U+046D, U+046D, U+046E->U+046F, U+046F, U+0470->U+0471, U+0471, U+0472->U+0473, U+0473, U+0474->U+0475, U+0476->U+0475, U+0477->U+0475, U+0475, U+0478->U+0479, U+0479, U+047A->U+047B, U+047B, U+047C->U+047D, U+047D, U+047E->U+047F, U+047F, U+0480->U+0481, U+0481, U+048A->U+0438, U+048B->U+0438, U+048C->U+044C, U+048D->U+044C, U+048E->U+0440, U+048F->U+0440, U+0490->U+0433, U+0491->U+0433, U+0490->U+0433, U+0491->U+0433, U+0492->U+0433, U+0493->U+0433, U+0494->U+0433, U+0495->U+0433, U+0496->U+0436, U+0497->U+0436, U+0498->U+0437, U+0499->U+0437, U+049A->U+043A, U+049B->U+043A, U+049C->U+043A, U+049D->U+043A, U+049E->U+043A, U+049F->U+043A, U+04A0->U+043A, U+04A1->U+043A, U+04A2->U+043D, U+04A3->U+043D, U+04A4->U+043D, U+04A5->U+043D, U+04A6->U+043F, U+04A7->U+043F, U+04A8->U+04A9, U+04A9, U+04AA->U+0441, U+04AB->U+0441, U+04AC->U+0442, U+04AD->U+0442, U+04AE->U+0443, U+04AF->U+0443, U+04B0->U+0443, U+04B1->U+0443, U+04B2->U+0445, U+04B3->U+0445, U+04B4->U+04B5, U+04B5, U+04B6->U+0447, U+04B7->U+0447, U+04B8->U+0447, U+04B9->U+0447, U+04BA->U+04BB, U+04BB, U+04BC->U+04BD, U+04BE->U+04BD, U+04BF->U+04BD, U+04BD, U+04C0->U+04CF, U+04CF, U+04C1->U+0436, U+04C2->U+0436, U+04C3->U+043A, U+04C4->U+043A, U+04C5->U+043B, U+04C6->U+043B, U+04C7->U+043D, U+04C8->U+043D, U+04C9->U+043D, U+04CA->U+043D, U+04CB->U+0447, U+04CC->U+0447, U+04CD->U+043C, U+04CE->U+043C, U+04D0->U+0430, U+04D1->U+0430, U+04D2->U+0430, U+04D3->U+0430, U+04D4->U+00E6, U+04D5->U+00E6, U+04D6->U+0435, U+04D7->U+0435, U+04D8->U+04D9, U+04DA->U+04D9, U+04DB->U+04D9, U+04D9, U+04DC->U+0436, U+04DD->U+0436, U+04DE->U+0437, U+04DF->U+0437, U+04E0->U+04E1, U+04E1, U+04E2->U+0438, U+04E3->U+0438, U+04E4->U+0438, U+04E5->U+0438, U+04E6->U+043E, U+04E7->U+043E, U+04E8->U+043E, U+04E9->U+043E, U+04EA->U+043E, U+04EB->U+043E, U+04EC->U+044D, U+04ED->U+044D, U+04EE->U+0443, U+04EF->U+0443, U+04F0->U+0443, U+04F1->U+0443, U+04F2->U+0443, U+04F3->U+0443, U+04F4->U+0447, U+04F5->U+0447, U+04F6->U+0433, U+04F7->U+0433, U+04F8->U+044B, U+04F9->U+044B, U+04FA->U+0433, U+04FB->U+0433, U+04FC->U+0445, U+04FD->U+0445, U+04FE->U+0445, U+04FF->U+0445, U+0410..U+0418->U+0430..U+0438, U+0419->U+0438, U+0430..U+0438, U+041A..U+042F->U+043A..U+044F, U+043A..U+044F

The forum generates a file where only Latin letters are described; I take the Cyrillic alphabet here www.sphinxsearch.com/wiki/doku.php?id=charset_tables#cyrillic

We start indexing, it happens incredibly fast. It took me 10-15 minutes, if you do a search through MySQL Fulltext, it takes hours!

you have to change in all commands the value 819ad66b94720731 for generated by your forum

indexer --config /etc/sphinx/sphinx.conf index_phpbb_819ad66b94720731_main
indexer --config /etc/sphinx/sphinx.conf index_phpbb_819ad66b94720731_delta
indexer --rotate --config /etc/sphinx/sphinx.conf index_phpbb_819ad66b94720731_delta

starting Sphinx
searchd --config /etc/sphinx/sphinx.conf

To start it in the boot, they say you must do it in CentOS like this
/sbin/chkconfig searchd on
Well, in my case it didn’t work
I add to cron
cron -e
line

@reboot searchd --config /etc/sphinx/sphinx.conf

And you must also add lines to the cron that will index the sphinx search base. Light indexing of only new messages every 5 minutes. And full once a day when everyone sleeps

you have to change in all commands the value 819ad66b94720731 for generated by your forum

*/5 * * * * indexer --rotate --config /etc/sphinx/sphinx.conf index_phpbb_819ad66b94720731_delta >> /usr/www/sphinx/log/indexer.log 2>&1 &
0 14 * * * indexer --rotate --config /etc/sphinx/sphinx.conf index_phpbb_819ad66b94720731_main >> /usr/www/sphinx/log/indexer.log 2>&1 &

And the search Sphinx already works!

By the way: You can run two or more sphinxes, you just have to repeat all the steps here, create conf and more, but change the port to another!

How fast does Sphinx work can be verified like this
tail -5 /usr/www/sphinx/log/sphinx-query.log
And sometimes empty requests appear here, but these are not empty requests, some type of i dont know what

IMPORTANT: if you have initial CentOS, then, of course, SELinux’s terrible security is enabled, sphinx will not work, the forum search will show an error:

Search failed: connection to localhost:9312 failed (errno=13, msg=Permission denied)

You can temporarily disable this security
setenforce 0
And the search will work immediately, but I would like to get to the source … I found it, thanks to Google and the unfortunate people who faced this.
You must do
setsebool -P httpd_can_network_connect 1
that permanently in SELinux disables the necessary parameter

And the Hosting companies in VPS plans in CentOS generally simply disable SELinux by complete, of course who needs the server to be protected :)!

UPD: Sometimes it turns off
Lets make script in folder /usr/local/src/
add text
vi /usr/local/src/restart_search.sh
this one

!#/usr/bin/bash

if netstat -tulpn | grep 9312
then
echo "busy"
else
searchd --config /etc/sphinx/sphinx.conf
fi

If noapp is listining port 9312, run searchd
Add to cron
crontab -e
this

*/10 * * * * /usr/bin/bash /usr/local/src/restart_search.sh/search_re.sh

10 Responses to “Lets install a super fast search engine Sphinx in to phpBB3 forum!”

  1. Tony says:

    Why not Manticore Search (https://manticoresearch.com/manticore-vs-sphinx/)? It’s at least supported and more active than Sphinx.

    • Александр says:

      Thank you, this is a realy interesting comparation – and gives more info about Sphinx/ I realy never heard about Manticore Search/ And us far there is no official support of it on PhpBB3?!

  2. AFAIK it’s indeed not supported officially (perhaps PHPBB guys also don’t know about Manticore), but it may well work fine after you rebuild the index (as the index formats are not compatible since v3). Most of the query syntax is still the same in Sphinx and Manticore. If you give it a try and encounters some issue – let us know at contact@manticoresearch.com

  3. Sergey Nikolaev says:

    oh, that’s interesting. We speak Russian too. May I ask you to provide more details on the test case so we can reproduce and debug it? According to our tests (e.g. https://manticoresearch.com/2019/01/28/manticore-2-7-5-vs-sphinx-3-1-1/ and others, the tests are always fully dockerized so anyone can reproduce them) Manticore is faster. I’ll appreciate if you come to our slack (http://slack.manticoresearch.com/) so we can discuss your test case there. We’re definitely interested in reproducing it and fixing Manticore if it’s slower.

    • Александр says:

      I have update my test, now Manticore and Sphinx have the same speed… I dont know why at the beging sphinx was soo fast – i cant reproduce this speed not in CentOS 6 32 bit VM ni en CentOS 7 64 bit VM

  4. […] a través de Sphinx y estuve satisfecho con todo, pero aquí en la versión en inglés del artículoLets install a super fast search engine Sphinx in to phpBB3 forum! me preguntaron por qué no Manticore – Why not […]

  5. […] through Sphinx and was satisfied with everything, but here in the English version of the articleLets install a super fast search engine Sphinx in to phpBB3 forum! they asked me – Why not […]

Leave a Reply

Your email address will not be published. Required fields are marked *