References
SQL commands
Schema management
Data management
- INSERT - Adds new documents
- REPLACE - Replaces existing documents with new ones
- UPDATE - Does in-place updates in documents
- DELETE - Deletes documents
- TRUNCATE TABLE - Deletes all documents from an index
SELECT
- SELECT - Searches
- EXPLAIN QUERY - Shows the query execution plan without running the query itself
- SHOW META - Shows extended information about the executed query
- SHOW PROFILE - Shows profiling information about the executed query
- SHOW PLAN - Shows the query execution plan after the query was executed
- SHOW WARNINGS - Shows warnings from the latest query
Flushing misc things
Real-time index optimization
Importing to a real-time index
- ATTACH INDEX - Moves data from a plain index to a real-time index
- IMPORT TABLE - Imports a previously created RT or PQ index into a server running in RT mode
Replication
Plain index rotate
Transactions
- BEGIN - Begins a transaction
- COMMIT - Finishes a transaction
- ROLLBACK - Rolls back a transaction
CALL
Plugins
Server status
HTTP endpoints
- /sql - Allows running an SQL statement over HTTP
- /insert - Inserts a document into a real-time index
- /pq/idx/doc - Inserts a PQ rule into a percolate index
- /update - Updates a document in a real-time index
- /replace - Replaces a document in a real-time index
- /pq/idx/doc/N?refresh=1 - Replaces a PQ rule in a percolate index
- /delete - Deletes a document in an index
- /bulk - Performs several insert, update or delete operations in a single call
- /search - Performs a search
- /pq/idx/search - Performs a reverse search in a percolate index
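As a rough illustration of how an endpoint from the list above might be called, the following Python sketch builds (but does not send) a POST request for /search. The host and port (127.0.0.1:9308), the index name `products`, the field `title`, and the JSON body layout are all assumptions made for illustration; none of them is stated in this reference, so check them against your server version.

```python
import json
import urllib.request


def build_search_request(base_url, index, field, text, limit=10):
    """Build a POST request for the /search endpoint without sending it.

    The body layout (index / query.match / limit) is an assumption of this
    sketch, not something defined in this reference.
    """
    body = {
        "index": index,                     # hypothetical index name
        "query": {"match": {field: text}},  # full-text match on one field
        "limit": limit,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request("http://127.0.0.1:9308", "products", "title", "red shoes")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Against a running server, the request could then be sent with `urllib.request.urlopen(req)`.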
Common things
Common index settings
Plain index settings
Distributed index settings
RT index settings
Full-text search operators
Functions
Mathematical
- ABS() - Returns the absolute value
- ATAN2() - Returns the arctangent function of two arguments
- BITDOT() - Returns the sum of the products of each bit of a mask and its weight
- CEIL() - Returns the smallest integer value greater than or equal to the argument
- COS() - Returns the cosine of the argument
- CRC32() - Returns the CRC32 value of the argument
- EXP() - Returns the exponent of the argument
- FIBONACCI() - Returns the N-th Fibonacci number, where N is the integer argument
- FLOOR() - Returns the largest integer value less than or equal to the argument
- GREATEST() - Takes a JSON/MVA array as the argument and returns the greatest value in that array
- IDIV() - Returns the result of an integer division of the first argument by the second argument
- LEAST() - Takes a JSON/MVA array as the argument and returns the least value in that array
- LN() - Returns the natural logarithm of the argument
- LOG10() - Returns the common logarithm of the argument
- LOG2() - Returns the binary logarithm of the argument
- MAX() - Returns the larger of two arguments
- MIN() - Returns the smaller of two arguments
- POW() - Returns the first argument raised to the power of the second argument
- RAND() - Returns a random float between 0 and 1
- SIN() - Returns the sine of the argument
- SQRT() - Returns the square root of the argument
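The BITDOT() description in the list above is easier to follow with a worked example. The Python sketch below emulates it under the assumption that bit 0 (the least significant bit) pairs with the first weight, bit 1 with the second, and so on. The function name `bitdot` is a local helper, not a server API.

```python
def bitdot(mask, *weights):
    """Sum the weights whose corresponding bit is set in mask.

    Assumption of this sketch: bit 0 (least significant) pairs with the
    first weight, bit 1 with the second, and so on.
    """
    total = 0
    for position, weight in enumerate(weights):
        bit = (mask >> position) & 1  # 1 if this bit is set, otherwise 0
        total += bit * weight
    return total


# Bits 0 and 2 are set in 0b101, so the result is 10 + 30.
print(bitdot(0b101, 10, 20, 30))  # prints 40
```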
Searching and ranking
- BM25F() - Returns the precise BM25F formula value
- EXIST() - Replaces non-existing columns with default values
- GROUP_CONCAT() - Produces a comma-separated list of the attribute values of all documents in the group
- HIGHLIGHT() - Highlights search results
- MIN_TOP_SORTVAL() - Returns the sort key value of the worst found element in the current top-N matches
- MIN_TOP_WEIGHT() - Returns the weight of the worst found element in the current top-N matches
- PACKEDFACTORS() - Outputs weighting factors
- REMOVE_REPEATS() - Removes repeated adjusted rows with the same ‘column’ value
- WEIGHT() - Returns the full-text match score
- ZONESPANLIST() - Returns pairs of matched zone spans
- QUERY() - Returns the current full-text query
Type casting
- BIGINT() - Forcibly promotes the integer argument to a 64-bit type
- DOUBLE() - Forcibly promotes the given argument to a floating-point type
- INTEGER() - Forcibly promotes the given argument to a 64-bit signed type
- TO_STRING() - Forcibly promotes the argument to a string type
- UINT() - Forcibly reinterprets the given argument as a 64-bit unsigned type
- SINT() - Interprets a 32-bit unsigned integer as a signed 64-bit integer
Arrays and conditions
- ALL() - Returns 1 if the condition is true for all elements in the array
- ANY() - Returns 1 if the condition is true for any element in the array
- CONTAINS() - Checks whether the (x,y) point is within the given polygon
- IF() - Checks whether the 1st argument is equal to 0.0; returns the 2nd argument if it is not zero, or the 3rd one if it is
- IN() - Returns 1 if the first argument is equal to any of the other arguments, or 0 otherwise
- INDEXOF() - Iterates through all elements in the array and returns the index of the first matching element
- INTERVAL() - Returns the index of the argument that is less than the first argument
- LENGTH() - Returns the number of elements in an MVA
- REMAP() - Allows making exceptions to expression values depending on the condition values
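The one-line descriptions of IF() and INTERVAL() above are compact, so a small emulation may help. The Python sketch below assumes that the points passed to INTERVAL() are in increasing order and that the result is 0 when the value is below the first point. Both helper names (`interval`, `sql_if`) are local to this sketch and are not server APIs.

```python
from bisect import bisect_right


def interval(value, *points):
    """Return how many of the increasing points are <= value.

    Assumption of this sketch: points are sorted in increasing order, so the
    result is 0 below the first point, 1 between the first and second, etc.
    """
    return bisect_right(points, value)


def sql_if(condition, if_nonzero, if_zero):
    """Return if_zero when condition equals 0.0, otherwise if_nonzero."""
    return if_zero if float(condition) == 0.0 else if_nonzero


print(interval(5, 1, 10, 100))   # prints 1
print(sql_if(0, "yes", "no"))    # prints no
```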
Date and time
- NOW() - Returns the current timestamp as an INTEGER
- CURTIME() - Returns the current time in the local timezone
- UTC_TIME() - Returns the current time in the UTC timezone
- UTC_TIMESTAMP() - Returns the current date/time in the UTC timezone
- SECOND() - Returns the integer second from the timestamp argument
- MINUTE() - Returns the integer minute from the timestamp argument
- HOUR() - Returns the integer hour from the timestamp argument
- DAY() - Returns the integer day from the timestamp argument
- MONTH() - Returns the integer month from the timestamp argument
- YEAR() - Returns the integer year from the timestamp argument
- YEARMONTH() - Returns the integer year and month code from the timestamp argument
- YEARMONTHDAY() - Returns the integer year, month and day code from the timestamp argument
- TIMEDIFF() - Returns the difference between the timestamps
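The "year and month code" and "year, month and day code" mentioned above can be pictured as integers such as 200109 and 20010909. The Python sketch below shows one way such codes could be derived from a Unix timestamp. The exact encoding is an assumption of this sketch, and UTC is used only to keep the output deterministic; which timezone the server applies is not covered here.

```python
from datetime import datetime, timezone


def yearmonth(timestamp):
    """Encode a Unix timestamp as an integer YYYYMM code (UTC, by assumption)."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.year * 100 + moment.month


def yearmonthday(timestamp):
    """Encode a Unix timestamp as an integer YYYYMMDD code (UTC, by assumption)."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.year * 10000 + moment.month * 100 + moment.day


# 1000000000 is 2001-09-09 01:46:40 UTC.
print(yearmonth(1000000000))     # prints 200109
print(yearmonthday(1000000000))  # prints 20010909
```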
Geo-spatial
- GEODIST() - Computes the geosphere distance between two given points
- GEOPOLY2D() - Creates a polygon that takes into account the Earth’s curvature
- POLY2D() - Creates a simple polygon in plain space
String
- CONCAT() - Concatenates two or more strings
- REGEX() - Returns 1 if the regular expression matches the attribute’s string, and 0 otherwise
- SNIPPET() - Highlights search results
- SUBSTRING_INDEX() - Returns the substring of the string that precedes the specified number of occurrences of the delimiter
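To make the SUBSTRING_INDEX() description above concrete, the Python sketch below emulates the behaviour as it is commonly defined in SQL dialects: a positive count keeps everything to the left of that many delimiter occurrences, and a negative count keeps everything to the right, counting from the end. The negative-count behaviour is an assumption here, and a count of zero is deliberately not modelled.

```python
def substring_index(text, delimiter, count):
    """Emulate SUBSTRING_INDEX for a non-zero count.

    Assumption of this sketch: positive counts are taken from the left,
    negative counts from the right. A zero count is not modelled.
    """
    if count == 0:
        raise ValueError("count of 0 is not modelled in this sketch")
    parts = text.split(delimiter)
    if count > 0:
        return delimiter.join(parts[:count])
    return delimiter.join(parts[count:])


print(substring_index("www.example.com", ".", 2))   # prints www.example
print(substring_index("www.example.com", ".", -1))  # prints com
```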
Other
- LAST_INSERT_ID() - Returns the ids of documents inserted or replaced by the last statement in the current session
Common settings in configuration file
To be put in the common {} section of the configuration file:
- lemmatizer_base - Lemmatizer dictionaries base path
- progressive_merge - Defines the order of merging disk chunks in a real-time index
- json_autoconv_keynames - Whether and how to auto-convert key names within JSON attributes
- json_autoconv_numbers - Automatically detects and converts possible JSON strings that represent numbers into numeric attributes
- on_json_attr_error - What to do if JSON format errors are found
- plugin_dir - Location for the dynamic libraries and UDFs
indexer is a tool to create plain indexes
Indexer settings in configuration file
To be put in the indexer {} section of the configuration file:
- lemmatizer_cache - Lemmatizer cache size
- max_file_field_buffer - Maximum file field adaptive buffer size
- max_iops - Maximum indexation I/O operations per second
- max_iosize - Maximum allowed I/O operation size
- max_xmlpipe2_field - Maximum allowed field size for XMLpipe2 source type
- mem_limit - Indexing RAM usage limit
- on_file_field_error - How to handle IO errors in file fields
- write_buffer - Write buffer size
- ignore_non_plain - To ignore warnings about non-plain indexes
Indexer start parameters
indexer [OPTIONS] [indexname1 [indexname2 [...]]]
- --all - Rebuilds all indexes from the config
- --buildstops - Reviews the index source, as if it were indexing the data, and produces a list of the terms that are being indexed
- --buildfreqs - Adds the quantity present in the index for --buildstops
- --config, -c - Path to the configuration file
- --dump-rows - Dumps rows fetched by SQL source(s) into the specified file
- --help - Lists all the parameters
- --keep-attrs - Allows reusing existing attributes on reindexing
- --keep-attrs-names - Allows specifying attributes to reuse from the existing index
- --merge-dst-range - Applies the given filter range upon merging
- --merge-killlists - Changes the way kill lists are processed when merging indexes
- --merge - Merges two plain indexes into one
- --nohup - Indexer won’t send SIGHUP if this option is on
- --noprogress - Prevents displaying progress details
- --print-queries - Prints out SQL queries that indexer sends to the database
- --print-rt - Outputs data fetched from SQL source(s) as INSERTs to a real-time index
- --quiet - Prevents displaying anything
- --rotate - Forces index rotation after all the indexes are built
- --sighup-each - Forces rotation of each index after it’s built
- -v - Shows indexer version
Index converter from Manticore v2 / Sphinx v2
index_converter is a tool for converting indexes created with Sphinx/Manticore Search 2.x to the Manticore Search 3.x index format.
index_converter {--config /path/to/config|--path}
Index converter start parameters
- --config, -c - Path to the indexes configuration file
- --index - Specifies which index should be converted
- --path - Defines the path containing index(es) instead of the configuration file
- --strip-path - Strips the path from filenames referenced by the index
- --large-docid - Allows converting documents with ids larger than 2^63
- --output-dir - Writes the new files to a chosen folder
- --all - Converts all indexes from the configuration file / path
- --killlist-target - Sets the target indexes to which kill-lists will be applied
searchd is a Manticore server.
Searchd settings in a configuration file
To be put in the searchd {} section of the configuration file:
- access_blob_attrs - Specifies how the index’s blob attributes file is accessed
- access_doclists - Specifies how the index’s doclists file is accessed
- access_hitlists - Specifies how the index’s hitlists file is accessed
- access_plain_attrs - Specifies how the search server will access the index’s plain attributes
- agent_connect_timeout - Remote agent connection timeout
- agent_query_timeout - Remote agent query timeout
- agent_retry_count - Specifies how many times Manticore will try to connect and query remote agents
- agent_retry_delay - Specifies the delay before retrying to query a remote agent in case it fails
- attr_flush_period - Defines the time period between flushing updated attributes to disk
- binlog_flush - Binary log transaction flush/sync mode
- binlog_max_log_size - Maximum binary log file size
- binlog_path - Binary log files path
- client_timeout - Maximum time to wait between requests when using persistent connections
- collation_libc_locale - Server libc locale
- collation_server - Default server collation
- data_dir - Path to the data directory where Manticore stores everything (RT mode)
- docstore_cache_size - Maximum size of document blocks from document storage that are held in memory
- expansion_limit - Maximum number of expanded keywords for a single wildcard
- grouping_in_utc - Turns on using the UTC timezone when grouping by time fields
- ha_period_karma - Agent mirror statistics window size
- ha_ping_interval - Interval between agent mirror pings
- hostname_lookup - Hostname renewal strategy
- jobs_queue_size - Defines how many “jobs” can be in the queue at the same time
- listen - Specifies the IP address and port, or the Unix-domain socket path, that searchd will listen on
- listen_backlog - TCP listen backlog
- listen_tfo - Allows the TCP_FASTOPEN flag for all listeners
- log - Path to the Manticore server log file
- max_batch_queries - Limits the number of queries per batch
- max_connections - Maximum number of active connections
- max_filters - Maximum allowed per-query filter count
- max_filter_values - Maximum allowed per-filter values count
- max_open_files - Maximum number of files the server is allowed to open
- max_packet_size - Maximum allowed network packet size
- mysql_version_string - Server version string to return via the MySQL protocol
- net_throttle_accept - Defines how many clients are accepted on each iteration of the network loop
- net_throttle_action - Defines how many requests are processed on each iteration of the network loop
- net_wait_tm - Controls the busy loop interval of a network thread
- net_workers - Number of network threads
- network_timeout - Network timeout for requests from clients
- node_address - Specifies the network address of the node
- persistent_connections_limit - Maximum number of simultaneous persistent connections to remote persistent agents
- pid_file - Path to the Manticore server pid file
- predicted_time_costs - Costs for the query time prediction model
- preopen_indexes - Whether to forcibly preopen all indexes on startup
- qcache_max_bytes - Maximum RAM allocated for cached result sets
- qcache_thresh_msec - Minimum wall time threshold for a query result to be cached
- qcache_ttl_sec - Expiration period for a cached result set
- query_log - Path to the query log file
- query_log_format - Query log format
- query_log_min_msec - Prevents logging of queries that are too fast
- query_log_mode - Query log file permissions mode
- read_buffer_docs - Per-keyword read buffer size for document lists
- read_buffer_hits - Per-keyword read buffer size for hit lists
- read_unhinted - Unhinted read size
- rt_flush_period - How often Manticore flushes real-time indexes’ RAM chunks to disk
- rt_merge_iops - Maximum number of I/O operations (per second) that the real-time chunks merging thread is allowed to do
- rt_merge_maxiosize - Maximum size of an I/O operation that the real-time chunks merging thread is allowed to do
- seamless_rotate - Prevents searchd stalls while rotating indexes with huge amounts of data to precache
- server_id - Server identifier used as a seed to generate a unique document ID
- shutdown_timeout - Searchd --stopwait timeout
- shutdown_token - SHA1 hash of the password required to invoke the shutdown command from a VIP SQL connection
- snippets_file_prefix - Prefix to prepend to the local file names when generating snippets in load_files mode
- sphinxql_state - Path to the file where the current SQL state will be serialized
- sphinxql_timeout - Maximum time to wait between requests from a mysql client
- ssl_ca - Path to the SSL Certificate Authority certificate file
- ssl_cert - Path to the server’s SSL certificate
- ssl_key - Path to the SSL certificate key of the server
- subtree_docs_cache - Maximum common subtree document cache size
- subtree_hits_cache - Maximum common subtree hit cache size, per-query
- thread_stack - Maximum stack size for a job
- unlink_old - Whether to unlink .old index copies on successful rotation
- watchdog - Whether to enable or disable the Manticore server watchdog
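Since shutdown_token is described above as a SHA1 hash of a password, a value for it could be prepared along the following lines. This Python sketch assumes the token is the lowercase hexadecimal SHA1 digest of the UTF-8 encoded password with no salt; that encoding is an assumption rather than something this reference states, and the password shown is a placeholder.

```python
import hashlib


def shutdown_token(password):
    """Return the hex SHA1 digest of a password.

    Assumption of this sketch: UTF-8 encoding, no salt, lowercase hex output.
    """
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


token = shutdown_token("my-secret")  # hypothetical password
print(f"shutdown_token = {token}")
```

The printed line has the shape of a setting that could be placed in the searchd {} section.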
Searchd start parameters
- --config, -c - Path to the configuration file
- --console - Forces running in console mode
- --coredump - Enables saving a core dump on crash
- --cpustats - Enables CPU time reporting
- --delete - Removes the Manticore service from Microsoft Management Console and other places where the services are registered
- --force-preread - Forbids the server to serve any incoming connection until pre-reading of the index files completes
- --help, -h - Lists all the parameters
- --index - Forces serving only the specified index
- --install - Installs searchd as a service into Microsoft Management Console
- --iostats - Enables input/output reporting
- --listen, -l - Overrides listen from the configuration file
- --logdebug, --logdebugv, --logdebugvv - Enables additional debug output in the server log
- --logreplication - Enables additional replication debug output in the server log
- --new-cluster - Bootstraps a replication cluster and makes the server a reference node with cluster restart protection
- --new-cluster-force - Bootstraps a replication cluster and makes the server a reference node, bypassing cluster restart protection
- --nodetach - Leaves searchd in the foreground
- --ntservice - Passed by Microsoft Management Console to searchd to invoke it as a service on Windows platforms
- --pidfile - Overrides pid_file from the configuration file
- --port, -p - Specifies the port searchd should listen on, disregarding the port specified in the configuration file
- --replay-flags - Specifies extra binary log replay options
- --servicename - Applies the given name to searchd when installing or deleting the service, as it would appear in Microsoft Management Console
- --status - Queries the running search server to return its status
- --stop - Stops the Manticore server
- --stopwait - Stops the Manticore server gracefully
- --strip-path - Strips path names from all the file names referenced from the index
- -v - Shows version information
Miscellaneous index maintenance functionality useful for
troubleshooting.
indextool <command> [options]
Used to dump miscellaneous debug information about the physical
index
- --config, -c - Path to the configuration file
- --quiet, -q - Keeps indextool quiet: it will not output the banner, etc.
- --help, -h - Lists all the parameters
- -v - Shows version information
- --checkconfig - Verifies the configuration file
- --buildidf - Builds an IDF file from one or several dictionary dumps
- --build-infixes - Builds infixes for an existing dict=keywords index
- --dumpheader - Quickly dumps the provided index header file
- --dumpconfig - Dumps the index definition from the given index header file in an almost compliant manticore.conf file format
- --dumpheader - Dumps the index header by index name, looking up the header path in the configuration file
- --dumpdict - Dumps the index dictionary
- --dumpdocids - Dumps document IDs by index name
- --dumphitlist - Dumps all occurrences of the given keyword/id in the given index
- --fold - Tests tokenization based on the index’s settings
- --htmlstrip - Filters STDIN using the HTML stripper settings for the given index
- --mergeidf - Merges several .idf files into a single one
- --morph - Applies morphology to the given STDIN and prints the result to stdout
- --check - Checks the index data files for consistency
- --check-disk-chunk - Checks one disk chunk of an RT index
- --strip-path - Strips path names from all the file names referenced from the index
- --rotate - Defines whether to check an index waiting for rotation in --check
- --apply-killlists - Applies kill-lists for all indexes listed in the configuration file
Splits compound words into components.
wordbreaker [-dict path/to/dictionary_file] {split|test|bench}
Wordbreaker start
parameters.
Used to extract contents of a dictionary file that uses ispell or
MySpell format.
spelldump [options] <dictionary> <affix> [result] [locale-name]
- dictionary - Dictionary’s main file
- affix - Dictionary’s affix file
- result - Specifies where the dictionary data should be output to
- locale-name - Specifies the locale details to use
List of reserved keywords
A complete alphabetical list of keywords that are currently reserved in Manticore SQL syntax (and therefore cannot be used as identifiers).
AND, AS, BY, DISTINCT, DIV, EXPLAIN, FACET, FALSE, FORCE, FROM, IGNORE, IN, INDEXES, IS, LIMIT, LOGS, MOD, NOT, NULL, OFFSET, OR, ORDER, REGEX, RELOAD, SELECT, SYSFILTERS, TRUE, USE