2.2.5. Sphinx

Attention!

The control panel section for this service is under development. Its actual appearance and functionality may differ in places from what this manual describes.

Payment

On shared (virtual) hosting, the service is billed daily from the account balance; on business hosting, it is included in the plan.

Sphinx (SQL Phrase Index) is a full-text search system with morphology support for various languages. It lets you search a database quickly and flexibly using arbitrary text.

  1. Open the "Sphinx" section.
  2. Click "To order" in the "Main data" block.
  3. Wait approximately 15 minutes for the service to activate.
  4. Configure Sphinx.

You can check whether Sphinx is working with a test script.

Sphinx Version

To find out the version of Sphinx, connect to the hosting via SSH and run the command /usr/local/sphinx/bin/searchd -v.

To use Sphinx on a site, you need to configure both Sphinx itself and the site.

After Sphinx is ordered, two tabs appear on the page of this additional service: one with the configuration and the "Indexes" tab.

The configuration tab contains the rules that describe which data Sphinx works with and how. By default, the configuration is pre-filled with an example template.

The configuration consists of two sections:

  • Data source (source) - the configuration of the data source and its name (in the example, db_source):
    • Data type (type) - the data source type (in the example, mysql).
    • Database connection data (sql_host, sql_port, sql_user, sql_pass, sql_db) - the credentials for connecting to the MySQL database that supplies the data for indexing.
    • Pre-query (sql_query_pre) - a query executed before the main query that fetches data from the database (in the example, SET NAMES utf8, which sets the UTF-8 encoding).
    • Main query (sql_query) - the query that fetches the data to be indexed from the database (in the example, select id, title, content from table, which selects the columns id, title and content from the table named table).
    • Other directives - let you define grouping, filtering and sorting (see the official documentation for details).
  • Index parameters (index) - the configuration of the index and its name (in the example, test_index):
    • Data source for indexing (source) - the name of the data source that supplies the data for indexing (in the example, db_source; see above).
    • Index storage path (path) - the absolute path to the index files (in the example, /home/example/.system/sphinx/test_index).
    • Morphology settings (morphology) - the name of the morphology processor (in the example, stem_ru) used to match different forms of a word; for example, a query for "dog" also returns results that contain other forms of the word, such as "dogs".
    • Keeping original word forms in the index (index_exact_words) - when used together with the expand_keywords directive, lets queries return more relevant results (in the example, 1, that is, enabled).
    • Minimum indexed word length (min_word_len) - the default is 1, but words that short usually carry no meaning (in the example, 3).
    • Other directives - let you define grouping, filtering and sorting (see the official documentation for details).
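
The directives listed above can be put together roughly as follows. This is only a sketch reconstructed from the values named in the list, not the actual template shown in the control panel. The connection values (localhost, 3306, example_user and so on) are placeholders, and the helpers parse_blocks and dangling_sources are hypothetical. They only check that every index refers to a defined data source, and they ignore index inheritance and multi-line values.

```python
import re

# Assumed example configuration; connection values are placeholders,
# and "table" stands for the name of your own table.
EXAMPLE_CONFIG = """\
source db_source
{
    type          = mysql
    sql_host      = localhost
    sql_port      = 3306
    sql_user      = example_user
    sql_pass      = example_password
    sql_db        = example_db
    sql_query_pre = SET NAMES utf8
    sql_query     = select id, title, content from table
}

index test_index
{
    source            = db_source
    path              = /home/example/.system/sphinx/test_index
    morphology        = stem_ru
    index_exact_words = 1
    min_word_len      = 3
}
"""

# Matches "source NAME { ... }" and "index NAME { ... }" blocks.
BLOCK_RE = re.compile(r"^(source|index)\s+(\w+)\s*\{(.*?)^\}", re.M | re.S)


def parse_blocks(text):
    """Return {(kind, name): {directive: value}} for source and index blocks."""
    blocks = {}
    for kind, name, body in BLOCK_RE.findall(text):
        directives = {}
        for line in body.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            key, _, value = line.partition("=")
            directives[key.strip()] = value.strip()
        blocks[(kind, name)] = directives
    return blocks


def dangling_sources(text):
    """List the indexes whose source directive names an undefined source."""
    blocks = parse_blocks(text)
    defined = {name for kind, name in blocks if kind == "source"}
    return [
        name
        for (kind, name), directives in blocks.items()
        if kind == "index" and directives.get("source") not in defined
    ]
```

A check like dangling_sources(EXAMPLE_CONFIG) returns an empty list when every index points at an existing source, which can catch a misspelled source name before the configuration is saved.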

To enable lemmatization, upload the required dictionary files to your hosting account and add the following section to the configuration:

common {
    lemmatizer_base = /home/example/path/to/sphinx/dicts/
}

A detailed description of how Sphinx works and a list of all its parameters can be found in the official documentation (in English).

Below the configuration, you can:

  • Select the protocol for accessing the Sphinx search daemon (searchd):
    • "MySQL" - access via SphinxQL, an SQL dialect similar to MySQL's.
    • "Sphinx" - access via SphinxAPI, Sphinx's own API.
  • Enable or disable logging of queries to Sphinx (query_log).

After clicking "Save configuration":

  • Sphinx will restart.
  • If the configuration contains sections that describe on-disk search indexes:
    • Separate cron jobs are created for them automatically on the "Indexes" tab, with updates every 15 minutes.
    • Index building starts immediately after the configuration is saved (it may take some time).

The "Indexes" tab lists the added indexes:

Columns:

  • "Index" - the name of the index, as specified in the index section of the configuration.
  • "Update frequency" - the current indexing schedule in standard cron format. By default, indexing starts every 15 minutes.
  • "Size" - the size of the index. The value is cached for 10 minutes.
  • Buttons:
    • Pause / start - pauses or resumes scheduled indexing.
    • Edit - changes the indexing schedule.
    • Delete - deletes the index. Attention! Clicking this button also removes the section with this index's parameters from the configuration.
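
The standard cron format has five fields: minute, hour, day of month, month and day of week. As an illustration, the sketch below expands the minute field of the expression */15 * * * *, a common way to write "every 15 minutes". The exact expression the control panel generates is an assumption, and the helper expand_field is hypothetical.

```python
def expand_field(field, low=0, high=59):
    """Expand one cron field (for example "*/15") into the values it matches."""
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            start, end = (int(x) for x in rng.split("-"))
        else:
            start = end = int(rng)
        values.update(range(start, end + 1, step))
    return sorted(values)


# Assumed schedule for "every 15 minutes"; the panel may display it differently.
schedule = "*/15 * * * *"
minute_field = schedule.split()[0]
print(expand_field(minute_field))  # [0, 15, 30, 45]
```

With this schedule, indexing would start at minutes 0, 15, 30 and 45 of every hour. Changing the first field to 0 would start it once an hour instead.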

Attention!

Before using plugins that work with Sphinx, read their requirements and compatibility notes carefully. Most plugins have not been maintained for a long time and work with Sphinx 2.2 or earlier, while the hosting uses version 3.

Setting up a site to work with Sphinx is the responsibility of the site developer or of third-party specialists engaged for the job.

The task comes down to the following steps:

  1. Examine the contents of the site database and determine which data needs fast search.
  2. Configure Sphinx with the required parameters: specify which database to read from, which tables and columns to index, and the indexing rules.
  3. Add code to the site that searches the created Sphinx index instead of running the standard database search. The developer can write this code from scratch or use the ready-made plugins or modules mentioned on the official website.

The site connects to Sphinx over a socket. The socket path is shown on the Sphinx page of the control panel. Whether a port must be specified depends on the plugin being connected; in general the port is not used and can be set to 0 or 9312.
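
As a rough illustration of such a connection, the sketch below queries an index over the Unix socket using SphinxQL. It assumes that the "MySQL" protocol is selected in the control panel, that the third-party PyMySQL driver is installed, and that the index is named test_index as in the example configuration. The socket path is a placeholder and the function names are hypothetical. The sketch has not been verified against the hosting environment.

```python
import re

# Placeholder: copy the real socket path from the Sphinx page of the control panel.
SPHINX_SOCKET = "/path/to/sphinx.sock"


def build_search(index_name, limit=10):
    """Return a SphinxQL statement with a placeholder for the search phrase."""
    # Index names cannot be passed as query parameters, so validate them instead.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", index_name):
        raise ValueError(f"unexpected index name: {index_name!r}")
    return f"SELECT id FROM {index_name} WHERE MATCH(%s) LIMIT {int(limit)}"


def search(phrase, index_name="test_index", socket_path=SPHINX_SOCKET):
    """Run a full-text query over the Unix socket and return the matching ids."""
    # Third-party driver (pip install pymysql), imported here so that
    # build_search stays usable even when the driver is not installed.
    import pymysql

    # No port is needed: the connection goes through the socket file.
    conn = pymysql.connect(unix_socket=socket_path)
    try:
        with conn.cursor() as cur:
            # The phrase is passed as a parameter and escaped by the driver.
            cur.execute(build_search(index_name), (phrase,))
            return [row[0] for row in cur.fetchall()]
    finally:
        conn.close()
```

A call such as search("dog") would return the ids of the matching documents, which the site can then load from its own database in the usual way.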

You can view the log in several ways:

  • Control panel: the latest log entries are displayed in real time in the "Application logs" block of the "Sphinx" section.
  • File manager: click "Open in file manager" in the "Main data" block of the "Sphinx" section, and the log will open in the file manager's built-in editor.
  • Console: connect to the hosting via SSH and run the required command:
    • View full log:
      cat ~/.system/sphinx/searchd.log
    • Log monitoring in real time:
      tail -f ~/.system/sphinx/searchd.log

      To stop monitoring, press Ctrl+C.

The log files searchd.log and query.log (the connection error log and the query log) can be safely deleted if you no longer need the information in them.