Introduction
Elasticsearch provides powerful full-text search capabilities that go far beyond simple database LIKE queries. Built on Apache Lucene, it offers relevance scoring, fuzzy matching, aggregations, and near real-time search. This guide covers practical Elasticsearch implementation for common search requirements.
Setup
Docker Installation
# docker-compose.yml
version: '3.8'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
    environment:
      - discovery.type=single-node
      # Development only: this turns off authentication and TLS
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ports:
      - "9200:9200"
    volumes:
      - esdata:/usr/share/elasticsearch/data
  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.0
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
volumes:
  esdata:
Verify Installation
curl http://localhost:9200
Index Management
Create Index with Mapping
curl -X PUT "localhost:9200/products" -H 'Content-Type: application/json' -d'
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"analyzer": {
"custom_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["lowercase", "asciifolding", "snowball"]
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "text",
"analyzer": "custom_analyzer",
"fields": {
"keyword": { "type": "keyword" }
}
},
"description": {
"type": "text",
"analyzer": "custom_analyzer"
},
"category": {
"type": "keyword"
},
"price": {
"type": "float"
},
"in_stock": {
"type": "boolean"
},
"created_at": {
"type": "date"
},
"tags": {
"type": "keyword"
}
}
}
}'
Field Types
| Type | Use Case |
|---|---|
| text | Full-text search (analyzed) |
| keyword | Exact match, sorting, aggregations |
| integer/long/float | Numbers |
| date | Dates |
| boolean | True/false |
| nested | Arrays of objects |
Document Operations
Index Documents
# Single document
curl -X POST "localhost:9200/products/_doc/1" -H 'Content-Type: application/json' -d'
{
"name": "Wireless Bluetooth Headphones",
"description": "Premium noise-canceling headphones with 30-hour battery",
"category": "electronics",
"price": 199.99,
"in_stock": true,
"tags": ["audio", "wireless", "premium"],
"created_at": "2024-01-15"
}'
# Bulk indexing
curl -X POST "localhost:9200/_bulk" -H 'Content-Type: application/json' -d'
{"index": {"_index": "products", "_id": "2"}}
{"name": "USB-C Charging Cable", "category": "accessories", "price": 19.99}
{"index": {"_index": "products", "_id": "3"}}
{"name": "Laptop Stand", "category": "accessories", "price": 49.99}
'
Update Documents
# Partial update
curl -X POST "localhost:9200/products/_update/1" -H 'Content-Type: application/json' -d'
{
"doc": {
"price": 179.99,
"in_stock": false
}
}'
# Update with script
curl -X POST "localhost:9200/products/_update/1" -H 'Content-Type: application/json' -d'
{
"script": {
"source": "ctx._source.price -= params.discount",
"params": { "discount": 20 }
}
}'
Search Queries
Basic Search
# Match query (analyzed)
curl -X GET "localhost:9200/products/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"name": "wireless headphones"
}
}
}'
# Multi-match (search multiple fields)
curl -X GET "localhost:9200/products/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"multi_match": {
"query": "wireless audio",
"fields": ["name^2", "description", "tags"],
"type": "best_fields"
}
}
}'
Boolean Queries
curl -X GET "localhost:9200/products/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must": [
{ "match": { "name": "headphones" } }
],
"filter": [
{ "term": { "category": "electronics" } },
{ "range": { "price": { "lte": 300 } } },
{ "term": { "in_stock": true } }
],
"should": [
{ "term": { "tags": "premium" } }
],
"must_not": [
{ "term": { "tags": "refurbished" } }
]
}
}
}'
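In application code the clauses above usually depend on which inputs the user supplied. The following is a minimal sketch of assembling the same bool query conditionally. The function name and option names are illustrative, not part of any client library, and the field names follow the products mapping created earlier.

```javascript
// Build a bool query body from optional search inputs.
// buildProductQuery and its option names are hypothetical helpers for this guide.
function buildProductQuery({ text, category, maxPrice, inStockOnly } = {}) {
  const bool = { filter: [] };

  // Scored clause: contributes to _score. Fall back to match_all without text.
  bool.must = text ? [{ match: { name: text } }] : [{ match_all: {} }];

  // Filter clauses: yes/no conditions that are not scored and can be cached.
  if (category) bool.filter.push({ term: { category } });
  if (maxPrice !== undefined) bool.filter.push({ range: { price: { lte: maxPrice } } });
  if (inStockOnly) bool.filter.push({ term: { in_stock: true } });

  return { query: { bool } };
}

const body = buildProductQuery({ text: 'headphones', category: 'electronics', maxPrice: 300 });
console.log(JSON.stringify(body, null, 2));
```

The returned object can be sent as the request body of the _search call shown above.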
Fuzzy Search
# Handles typos (term-level query: the value is not analyzed before matching)
curl -X GET "localhost:9200/products/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"fuzzy": {
"name": {
"value": "headpohnes",
"fuzziness": "AUTO"
}
}
}
}'
# Match with fuzziness
curl -X GET "localhost:9200/products/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"name": {
"query": "wireles headpohnes",
"fuzziness": "AUTO"
}
}
}
}'
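With "fuzziness": "AUTO", the number of allowed edits depends on the length of each term. The sketch below mirrors the default thresholds (the same as "AUTO:3,6") to show which terms tolerate typos. It only illustrates the rule and is not code from Elasticsearch itself; measuring length in code points is an assumption made here.

```javascript
// Maximum edit distance allowed for one term under "fuzziness": "AUTO"
// with the default thresholds ("AUTO:3,6").
function autoFuzziness(term) {
  const length = [...term].length; // assumption: length counted in code points
  if (length < 3) return 0; // 0-2 characters: must match exactly
  if (length < 6) return 1; // 3-5 characters: one edit allowed
  return 2;                 // 6 or more characters: two edits allowed
}

for (const term of ['tv', 'cable', 'headpohnes']) {
  console.log(term, autoFuzziness(term));
}
```

Short terms such as "tv" get no typo tolerance at all, which is worth remembering when testing fuzzy search with short product names.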
Autocomplete with Prefix
curl -X GET "localhost:9200/products/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"prefix": {
"name.keyword": "Wire"
}
}
}'
# Better: Use the completion suggester for autocomplete.
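A rough sketch of the completion suggester approach follows. It assumes an extra field of type completion is added to the index; the field name name_suggest and the suggestion name product_suggest are hypothetical and do not appear in the mapping created earlier.

```javascript
// Hypothetical field name: "name_suggest" is not part of the mapping shown earlier.
const SUGGEST_FIELD = 'name_suggest';

// Mapping fragment that would have to be added to the products index.
const suggestMapping = {
  properties: { [SUGGEST_FIELD]: { type: 'completion' } },
};

// Request body for the _search endpoint using the completion suggester.
function buildSuggestBody(prefix, size = 5) {
  return {
    _source: false, // only the suggestions are needed, not the documents
    suggest: {
      product_suggest: {
        prefix,
        completion: { field: SUGGEST_FIELD, size, skip_duplicates: true },
      },
    },
  };
}

// Pull the suggested strings out of a search response.
function extractSuggestions(response) {
  const entries = response.suggest?.product_suggest ?? [];
  return entries.flatMap((entry) => entry.options.map((option) => option.text));
}

console.log(JSON.stringify(suggestMapping));
console.log(JSON.stringify(buildSuggestBody('wire')));
```

Each document would also need a value in name_suggest at index time, for example a copy of the product name.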
Aggregations
Terms Aggregation
curl -X GET "localhost:9200/products/_search" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"categories": {
"terms": {
"field": "category",
"size": 10
}
}
}
}'
Range Aggregation
curl -X GET "localhost:9200/products/_search" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"price_ranges": {
"range": {
"field": "price",
"ranges": [
{ "to": 50 },
{ "from": 50, "to": 100 },
{ "from": 100, "to": 200 },
{ "from": 200 }
]
}
}
}
}'
Nested Aggregations
curl -X GET "localhost:9200/products/_search" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"categories": {
"terms": { "field": "category" },
"aggs": {
"avg_price": { "avg": { "field": "price" } },
"price_stats": { "stats": { "field": "price" } }
}
}
}
}'
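Aggregation results come back under the aggregations key, one bucket per term. As a sketch, the helper below flattens the response of the nested aggregation above into a list suitable for a facet sidebar. The helper name and output shape are illustrative, and the sample response is hand-written to show the structure, not real output.

```javascript
// Convert the "categories" terms aggregation (with avg_price and price_stats
// sub-aggregations) into a flat list. toCategoryFacets is a hypothetical helper.
function toCategoryFacets(response) {
  const buckets = response.aggregations?.categories?.buckets ?? [];
  return buckets.map((bucket) => ({
    category: bucket.key,
    count: bucket.doc_count,
    avgPrice: bucket.avg_price?.value ?? null,
    minPrice: bucket.price_stats?.min ?? null,
    maxPrice: bucket.price_stats?.max ?? null,
  }));
}

// Hand-written example of the response shape (values are made up).
const sampleResponse = {
  aggregations: {
    categories: {
      buckets: [
        {
          key: 'accessories',
          doc_count: 2,
          avg_price: { value: 35 },
          price_stats: { count: 2, min: 20, max: 50, avg: 35, sum: 70 },
        },
      ],
    },
  },
};

console.log(toCategoryFacets(sampleResponse));
```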
PHP Integration
composer require elasticsearch/elasticsearch
<?php
require 'vendor/autoload.php';

// Namespace for client 8.x; the 7.x client uses Elasticsearch\ClientBuilder instead
use Elastic\Elasticsearch\ClientBuilder;

$client = ClientBuilder::create()
    ->setHosts(['http://localhost:9200'])
    ->build();

// Index document
$params = [
    'index' => 'products',
    'id' => '1',
    'body' => [
        'name' => 'Wireless Headphones',
        'price' => 199.99,
        'category' => 'electronics',
    ],
];
$client->index($params);

// Example search inputs (normally taken from the request)
$searchQuery = 'wireless headphones';
$category = 'electronics';
$maxPrice = 300;
$offset = 0;
$limit = 10;

// Search
$params = [
    'index' => 'products',
    'body' => [
        'query' => [
            'bool' => [
                'must' => [
                    ['match' => ['name' => $searchQuery]],
                ],
                'filter' => [
                    ['term' => ['category' => $category]],
                    ['range' => ['price' => ['lte' => $maxPrice]]],
                ],
            ],
        ],
        'from' => $offset,
        'size' => $limit,
        'sort' => [
            ['_score' => 'desc'],
            ['created_at' => 'desc'],
        ],
    ],
];
$response = $client->search($params);

foreach ($response['hits']['hits'] as $hit) {
    $product = $hit['_source'];
    $score = $hit['_score'];
    echo "{$product['name']} (score: {$score})\n";
}

// Total results
$total = $response['hits']['total']['value'];
Node.js Integration
const { Client } = require('@elastic/elasticsearch');

const client = new Client({ node: 'http://localhost:9200' });

// CommonJS modules do not support top-level await, so the calls run inside an async function
async function main() {
  const searchQuery = 'wireless headphones'; // example input

  // Index document
  await client.index({
    index: 'products',
    id: '1',
    document: {
      name: 'Wireless Headphones',
      price: 199.99,
      category: 'electronics',
    },
    refresh: 'wait_for', // wait until the document is searchable before querying
  });

  // Search
  const result = await client.search({
    index: 'products',
    query: {
      bool: {
        must: [
          { match: { name: searchQuery } },
        ],
        filter: [
          { term: { category: 'electronics' } },
          { range: { price: { lte: 300 } } },
        ],
      },
    },
    from: 0,
    size: 10,
    sort: [
      { _score: 'desc' },
      { created_at: 'desc' },
    ],
  });

  result.hits.hits.forEach((hit) => {
    console.log(hit._source.name, hit._score);
  });
}

main().catch(console.error);
Syncing with Database
Event-Driven Sync
// After database insert/update
use Elastic\Elasticsearch\Client; // 8.x namespace; 7.x uses Elasticsearch\Client

class ProductObserver
{
    private Client $elasticsearch;

    // Assumes a Client instance is registered in the service container
    public function __construct(Client $elasticsearch)
    {
        $this->elasticsearch = $elasticsearch;
    }

    public function saved(Product $product)
    {
        $this->elasticsearch->index([
            'index' => 'products',
            'id' => $product->id,
            'body' => [
                'name' => $product->name,
                'description' => $product->description,
                'price' => $product->price,
                'category' => $product->category->name,
                'updated_at' => $product->updated_at->toIso8601String(),
            ],
        ]);
    }

    public function deleted(Product $product)
    {
        $this->elasticsearch->delete([
            'index' => 'products',
            'id' => $product->id,
        ]);
    }
}
Bulk Reindex Command
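// Artisan command
use Elastic\Elasticsearch\Client; // 8.x namespace; 7.x uses Elasticsearch\Client
use Illuminate\Console\Command;

class ReindexProducts extends Command
{
    protected $signature = 'search:reindex';

    private Client $elasticsearch;

    public function __construct(Client $elasticsearch)
    {
        parent::__construct();
        $this->elasticsearch = $elasticsearch;
    }

    // createIndex() and formatProduct() are helper methods not shown here
    public function handle()
    {
        // Delete and recreate index. Search is unavailable until the reindex finishes.
        $this->elasticsearch->indices()->delete([
            'index' => 'products',
            'ignore_unavailable' => true, // do not fail if the index does not exist yet
        ]);
        $this->createIndex();

        // Bulk index. lazy() supports eager loading; cursor() cannot eager load relations.
        $products = Product::with('category')->lazy();
        $batch = [];
        foreach ($products as $product) {
            $batch[] = ['index' => ['_index' => 'products', '_id' => $product->id]];
            $batch[] = $this->formatProduct($product);
            // Each document adds two entries, so 1000 entries = 500 documents
            if (count($batch) >= 1000) {
                $this->elasticsearch->bulk(['body' => $batch]);
                $batch = [];
            }
        }
        if (!empty($batch)) {
            $this->elasticsearch->bulk(['body' => $batch]);
        }
    }
}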
Performance Tips
Index Settings
{
"settings": {
"index": {
"refresh_interval": "30s",
"number_of_replicas": 1
}
}
}
Bulk Operations
// Use the bulk API for multiple documents
$params = ['body' => []];
foreach ($products as $product) {
    $params['body'][] = ['index' => ['_index' => 'products', '_id' => $product->id]];
    $params['body'][] = $product->toSearchArray();
    // Two entries per document (action + source), so 2000 entries = 1000 documents
    if (count($params['body']) >= 2000) {
        $client->bulk($params);
        $params['body'] = [];
    }
}
if (!empty($params['body'])) {
    $client->bulk($params);
}
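A bulk request can succeed as a whole while individual items fail, so it helps to inspect the response. The sketch below, written in JavaScript to match the Node.js section, collects failed items from a bulk response. The helper name is hypothetical and the sample response is hand-written to show the shape (errors flag plus one entry per action in items).

```javascript
// Return the failed items of a bulk response as { id, status, reason } objects.
// collectBulkFailures is an illustrative helper, not a client library function.
function collectBulkFailures(bulkResponse) {
  if (!bulkResponse.errors) return []; // errors is false when every item succeeded
  return bulkResponse.items
    .map((item) => Object.values(item)[0]) // unwrap { index: {...} }, { update: {...} }, ...
    .filter((result) => result.error)
    .map((result) => ({
      id: result._id,
      status: result.status,
      reason: result.error.reason,
    }));
}

// Hand-written example of a partially failed response (values are made up).
const sampleBulkResponse = {
  errors: true,
  items: [
    { index: { _id: '2', status: 201 } },
    { index: { _id: '3', status: 400, error: { type: 'mapper_parsing_exception', reason: 'failed to parse field [price]' } } },
  ],
};

console.log(collectBulkFailures(sampleBulkResponse));
```

Failed items can then be logged or retried instead of being dropped silently.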
Query Optimization
- Use filters for exact matches (cacheable)
- Limit returned fields with _source
- Use pagination with from/size
- Avoid deep pagination - use search_after instead
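To illustrate the last two points, here is a minimal sketch of a search_after request builder. It passes the sort values of the last hit of the previous page instead of a growing from offset. The function name is hypothetical, and using name.keyword as the tiebreaker assumes product names are unique; otherwise a unique keyword field would be needed.

```javascript
// Build a request body for one page of results using search_after.
// buildPageBody is an illustrative helper for this guide.
function buildPageBody(pageSize, lastHit) {
  const body = {
    size: pageSize,
    _source: ['name', 'price'], // return only the fields the listing needs
    query: { match_all: {} },
    // A stable sort needs a tiebreaker; name.keyword is assumed unique here.
    sort: [{ created_at: 'desc' }, { 'name.keyword': 'asc' }],
  };
  // Every hit of a sorted search carries a "sort" array; pass it back as-is.
  if (lastHit) body.search_after = lastHit.sort;
  return body;
}

const firstPage = buildPageBody(2);
// Hand-written stand-in for the last hit of the first page (values are made up).
const previousLastHit = { _id: '1', sort: [1705276800000, 'Wireless Bluetooth Headphones'] };
const secondPage = buildPageBody(2, previousLastHit);

console.log(JSON.stringify(firstPage));
console.log(JSON.stringify(secondPage.search_after));
```

Note that from is omitted entirely: with search_after the position is carried by the sort values.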
Best Practices
- Design mappings carefully - changing later is expensive
- Use appropriate analyzers for your language
- Keep data in sync with your primary database
- Monitor cluster health and disk space
- Use aliases for zero-downtime reindexing
- Test queries with realistic data volumes
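The alias approach can be sketched as follows: the application always queries an alias, a new index is built in the background, and the alias is then moved in a single request to the _aliases endpoint. The index names products_v1 and products_v2 and the helper name are hypothetical.

```javascript
// Build the body for POST /_aliases that moves an alias from one index to another.
// Both actions are applied atomically, so searches never see a missing alias.
function buildAliasSwap(alias, oldIndex, newIndex) {
  const actions = [];
  if (oldIndex) actions.push({ remove: { index: oldIndex, alias } });
  actions.push({ add: { index: newIndex, alias } });
  return { actions };
}

// Hypothetical index names: products_v1 is live, products_v2 was just reindexed.
const swap = buildAliasSwap('products', 'products_v1', 'products_v2');
console.log(JSON.stringify(swap, null, 2));
```

This only works if products is an alias from the start and not a concrete index, since an alias and an index cannot share a name.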
Conclusion
Elasticsearch provides powerful search capabilities that enhance user experience. Design your mappings thoughtfully, use appropriate analyzers, and keep data synchronized with your primary database. The patterns in this guide help you implement effective full-text search in your applications.