OpenSearch Tutorial

OpenSearch is an open source fork of Elasticsearch, created after Elasticsearch changed its license. OpenSearch continues development of the project and keeps the original license.

In this tutorial I will show you how to install OpenSearch and run some queries against it, using Docker and curl. If this post grows, we might also write some Python scripts. So let's get started.

This post assumes that you already have Docker installed and that you are using Linux or macOS.

It will also be helpful to have the jq tool installed.

Get the Environment Ready

docker pull opensearchproject/opensearch:1.0.1
docker network create network1
mkdir ~/opensearch_data
chown -R 1000:1000 ~/opensearch_data

With these commands, we pulled the OpenSearch Docker image and created a Docker network. The network is not strictly needed yet, but if we run other containers later, they will need it to reach the OpenSearch container.

Then we created a directory on the host to hold all the OpenSearch server data, and gave ownership of it to UID 1000, the user OpenSearch runs as inside the container. This way we don't lose our data if we remove the container.
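Run a second time, the setup commands above stop with errors, because the network and the directory already exist. Here is a minimal sketch of a rerunnable version. The function name prepare_environment is made up for this post, it assumes the same network name and data directory as above, and it leaves out the chown step, which you still need to run once.

```shell
# Rerunnable version of the setup steps.
# prepare_environment is an illustrative name, not a Docker or OpenSearch command.
prepare_environment() {
  # Optional first argument: the data directory (defaults to ~/opensearch_data).
  data_dir="${1:-$HOME/opensearch_data}"

  docker pull opensearchproject/opensearch:1.0.1 || return 1

  # Create the network only if "docker network inspect" cannot find it.
  if ! docker network inspect network1 >/dev/null 2>&1; then
    docker network create network1 || return 1
  fi

  # -p makes mkdir succeed even when the directory already exists.
  mkdir -p "$data_dir"
}
```

Calling prepare_environment without arguments uses ~/opensearch_data.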

Running the OpenSearch Project

Now, let's go ahead and start the server.

docker run -d --name opensearch --net network1 -v ~/opensearch_data:/usr/share/opensearch/data -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" -e "plugins.security.disabled=true" opensearchproject/opensearch:1.0.1

Your server will be listening on port 9200 on your host machine. To test the setup, run:

curl http://localhost:9200

You should get a response that contains the server information.
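The server can take a little while to start, so the first curl call may fail with a connection error. Below is a small sketch of a retry helper that keeps trying until a command succeeds. The name retry and its arguments are my own choice here, not something that ships with curl or OpenSearch.

```shell
# retry <max_attempts> <delay_seconds> <command> [args...]
# Runs the command until it succeeds or max_attempts is reached.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example (needs the container from above to be running):
# retry 30 2 curl -s -f -o /dev/null http://localhost:9200 && echo "OpenSearch is up"
```

The -f flag makes curl return a non-zero exit code on HTTP errors, so the loop also waits while the server answers with an error status.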

Creating an index and adding some documents

Now let's use curl to add some documents to an index called documents.

curl -X POST http://localhost:9200/documents/document/1 -H 'Content-Type: application/json' -d '{"title":"Some country in the Middle East", "text": "Amman is the capital of Jordan"}'
curl -X POST http://localhost:9200/documents/document/2 -H 'Content-Type: application/json' -d '{"title":"Europe", "text": "Berlin is the capital of Germany"}'

Running these curl commands against the OpenSearch server adds two documents to the documents index, which is created automatically on the first request. Here, document is just a document type. At the moment, OpenSearch supports one document type per index.
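If you want to add more documents, repeating the whole curl command gets tedious. Here is a small sketch of a wrapper. The function index_document and the variable OPENSEARCH_URL are names I made up for this post, and the wrapper assumes the same index and document type as above.

```shell
# Base URL of the server started above; override it if your port mapping differs.
OPENSEARCH_URL="${OPENSEARCH_URL:-http://localhost:9200}"

# index_document <id> <json>: store <json> under <id> in the "documents" index.
# (index_document is an illustrative helper, not part of OpenSearch.)
index_document() {
  curl -s -X POST "$OPENSEARCH_URL/documents/document/$1" \
    -H 'Content-Type: application/json' \
    -d "$2"
}

# Example:
# index_document 3 '{"title":"Europe", "text": "Paris is the capital of France"}'
```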

Let's query the index

First, let's query for documents that contain a given word, for example Amman.

curl -s -X POST http://localhost:9200/documents/document/_search -H 'Content-Type: application/json' -d '{"query": {"query_string": { "query": "amman" }}}' | jq .hits.hits

As you can see, we pipe the response to jq and look at the JSON path .hits.hits of the response:

[
  {
    "_index": "documents",
    "_type": "document",
    "_id": "1",
    "_score": 0.6931471,
    "_source": {
      "title": "Some country in the Middle East",
      "text": "Amman is the capital of Jordan"
    }
  }
]

But let's say we want to query only one field of the document, for example the title. In this case, we can use a match query instead.

curl -s -X POST http://localhost:9200/documents/document/_search -H 'Content-Type: application/json' -d '{"query": {"match": { "title": "middle east" }}}' | jq .hits.hits
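To try the match query on other fields or with other words, it helps to build the request body from two arguments. The function match_query below is only a sketch, with a name I chose for this post. It does no JSON escaping, so it assumes the field name and the search text contain no double quotes or backslashes.

```shell
# match_query <field> <text>: print the JSON body of a match query.
# Assumes neither argument contains double quotes or backslashes,
# because no JSON escaping is done here.
match_query() {
  printf '{"query": {"match": {"%s": "%s"}}}\n' "$1" "$2"
}

# Example (needs the server from above):
# curl -s -X POST http://localhost:9200/documents/document/_search \
#   -H 'Content-Type: application/json' \
#   -d "$(match_query title 'middle east')" | jq .hits.hits
```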

Examining the data folder

All the data of your index is saved to the folder ~/opensearch_data on the host. You can check it out. If your container is removed or crashes, you can start it again with the same docker run command as before, without losing your index data.
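To see this for yourself, you can remove the container and start it again, then rerun one of the search queries. Here is a rough sketch. The function names start_opensearch and recreate_opensearch are made up for this post, and the docker run arguments are the same ones used earlier.

```shell
# start_opensearch: the same "docker run" invocation as earlier in this post.
start_opensearch() {
  docker run -d --name opensearch --net network1 \
    -v "$HOME/opensearch_data:/usr/share/opensearch/data" \
    -p 9200:9200 -p 9600:9600 \
    -e "discovery.type=single-node" \
    -e "plugins.security.disabled=true" \
    opensearchproject/opensearch:1.0.1
}

# recreate_opensearch: remove the container (the host directory is left alone),
# then start a fresh container on top of the same data directory.
recreate_opensearch() {
  docker rm -f opensearch || return 1
  start_opensearch
}
```

After recreate_opensearch, the documents index should still contain the two documents we added.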

Future Work

I hope you found this small post useful. I will expand this post with more examples in the near future.

  • Add Multi-Match, Boolean and Term queries
  • Add basic Python web server code using Flask and Docker that connects to the OpenSearch server and runs some queries on behalf of the user
