Elasticsearch Installation and a Few Core Concepts

25 September 2014

So, I have been having my way with Elasticsearch for a few weeks now and it's time for me to blog about it :) I hear what you say :) This is yet another 101 blog post, but settle down :) I have a few selfish reasons here. Blogging about a new technology is a way for me to grasp it better. During this time, I have also been looking into Azure Search, a hosted search service by Microsoft. You should check that service out too, if what you want is an easily scalable hosted search solution.

OK, what were we talking about? Oh yes, Elasticsearch :) It's awesome! I mean it, seriously. If you haven't looked at it, you should check it out. It's a search product which makes data search and analytics very easy. It's built on top of Apache Lucene, the famous search engine library. Elasticsearch also has great documentation. So, I will only highlight a few things here which were useful to me at the beginning.

Setting it Up

To get started with Elasticsearch, you can download it from here. At the time of this writing, 1.3.2 was the latest version. What you need to do is fairly simple: download the zip file, extract it, navigate to the bin directory and run Elasticsearch from the command line.

1-Screenshot 2014-09-25 13.20.48

As we didn't specify any configuration values, Elasticsearch started with the configuration defined inside the /config/elasticsearch.yml file. If you haven't touched that file either, the server will be exposed through localhost:9200. The Elasticsearch server is exposed to the world through its HTTP API, and when you send a GET request to the root level, you will get the server info:

2-Screenshot 2014-09-25 13.23.47
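
If you would rather hit the root endpoint from code than from a browser, here is a minimal Python sketch using only the standard library. The helper names are my own, and it assumes the node is listening on the default localhost:9200 mentioned above.

```python
import json
import urllib.request

def root_url(host="localhost", port=9200):
    """Build the URL of the node's root endpoint."""
    return f"http://{host}:{port}/"

def fetch_server_info(url):
    """Send a GET request to the node and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a node running, `fetch_server_info(root_url())` should hand back the same server info as a dictionary, including the node name the server assigned itself.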

You can also see inside the response that the node even assigned itself a random name: Mole Man in this case. You can start working with this Elasticsearch node using the HTTP client of your choice, but I really recommend installing Marvel, which ships with a developer console that allows you to easily issue calls to Elasticsearch’s HTTP API. To install Marvel, you need to run the following command while you are under the elasticsearch/bin path:

plugin -i elasticsearch/marvel/latest

After you are done with the installation, you should restart the server. From there, you can navigate to localhost:9200/_plugin/marvel/sense/index.html in your favorite browser:

3-Screenshot 2014-09-25 13.43.02

Now, you can start issuing requests to your Elasticsearch server right from your browser:

4-Screenshot 2014-09-25 13.47.13

The best part of this plugin is its autocomplete support while you are constructing queries:

5-Screenshot 2014-09-25 13.49.29

It can even understand and provide you the options for the fields of your types, which is pretty useful. Before going deeper, let's learn a few fundamentals about Elasticsearch.

Basics 101

When working with Elasticsearch, you will come across a few core concepts very often. I think knowing what they are beforehand will help you along the way.

An index is the container for all your documents. Each document inside your index will have a few shared properties. One of those properties is the type. Each document inside your index will have a type and each type will have a defined mapping (sort of a schema). The type and its mapping are not required to be defined upfront; Elasticsearch can create them dynamically. However, there are lots of useful things you can achieve by altering the default mapping, such as changing the analyzer for a field.
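
As a rough sketch of what altering the default mapping could look like, the Python snippet below prepares an index-creation request that gives one field a non-default analyzer. The movies-test index and the movie type are the ones I use in this post; the title field, the choice of the english analyzer and the helper names are assumptions for illustration only, and the body follows the 1.x mapping syntax (where text fields are of type string).

```python
import json
import urllib.request

def build_mapping_body(doc_type, field, analyzer):
    """Build an index-creation body that sets the analyzer for one field."""
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    # "string" is the text field type in Elasticsearch 1.x.
                    field: {"type": "string", "analyzer": analyzer}
                }
            }
        }
    }

def build_create_index_request(index, body, base="http://localhost:9200"):
    """Prepare (but do not send) a PUT request that creates the index."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        f"{base}/{index}",
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

# Hypothetical field and analyzer choice, for illustration only.
body = build_mapping_body("movie", "title", "english")
request = build_create_index_request("movies-test", body)
```

Nothing is sent until you pass `request` to `urllib.request.urlopen`, so it is safe to inspect the body first. Keep in mind that creating an index this way only works if the index doesn't exist yet.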

At first glance, it seems that we can only search on types, but that's actually not the case. You can even search the entire Elasticsearch node if you want:

6-Screenshot 2014-09-25 14.08.02

I have a few indexes created here and those indexes have documents of one or more types. When I searched for the word "madrid" here, I got 568 hits throughout the node. I have some hits of the movie type from the movies-test index and some hits of the status type from the my_twitter_river index. This could be pretty powerful depending on your scenario if you embrace Elasticsearch’s schema-free nature. We can also search on a single index, which lets us search across all the types under that particular index.
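
To make the three search scopes (whole node, one index, one type) a bit more concrete, here is a small Python sketch that builds the matching URI search URLs and reads the hit count out of a response. The helper names are mine, the node address is assumed to be the default localhost:9200, and the sample response at the bottom is a trimmed, hand-written stand-in rather than real output.

```python
import urllib.parse

def search_url(query, index=None, doc_type=None, base="http://localhost:9200"):
    """Build a URI search URL scoped to the node, an index, or a type."""
    if doc_type and not index:
        # A type on its own needs the special _all index in front of it.
        index = "_all"
    parts = [p for p in (index, doc_type) if p] + ["_search"]
    path = "/".join(urllib.parse.quote(p, safe=",") for p in parts)
    return f"{base}/{path}?{urllib.parse.urlencode({'q': query})}"

def total_hits(search_response):
    """Read the overall hit count from a decoded search response."""
    return search_response["hits"]["total"]

node_wide = search_url("madrid")
one_index = search_url("madrid", index="movies-test")
one_type = search_url("madrid", index="movies-test", doc_type="movie")

# Hand-written stand-in for a decoded response, trimmed to the part we read.
sample_response = {"hits": {"total": 568, "hits": []}}
count = total_hits(sample_response)
```

Sending a GET request to any of these URLs and decoding the JSON body gives you a dictionary you can pass to `total_hits`.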

I mentioned that each type has its own mapping. You can see the mapping of a type by sending a GET request to /{index}/{type}/_mapping:

7-Screenshot 2014-09-25 17.23.17
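
Here is a small Python sketch for working with that endpoint from code: one helper builds the mapping URL using the path above, and another lists the field names found in a decoded response. The helper names are my own, and I'm assuming the response is nested as index, then "mappings", then type, so adjust the lookup if your node answers differently. The sample response and its fields are made up for illustration.

```python
def mapping_url(index, doc_type, base="http://localhost:9200"):
    """Build the URL that returns the mapping of one type in one index."""
    return f"{base}/{index}/{doc_type}/_mapping"

def field_names(mapping_response, index, doc_type):
    """List the mapped field names for a type, in alphabetical order."""
    # Assumed response shape: {index: {"mappings": {type: {"properties": {...}}}}}
    type_mapping = mapping_response[index]["mappings"][doc_type]
    return sorted(type_mapping.get("properties", {}))

url = mapping_url("movies-test", "movie")

# Made-up response with hypothetical fields, for illustration only.
sample_response = {
    "movies-test": {
        "mappings": {
            "movie": {
                "properties": {
                    "title": {"type": "string"},
                    "year": {"type": "long"},
                }
            }
        }
    }
}
fields = field_names(sample_response, "movies-test", "movie")
```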

Twitter River Plugin for Elasticsearch

If you are just getting started with Elasticsearch, what I recommend is to get some data in using the Twitter River Plugin for Elasticsearch. After you configure and run it for a few hours, you will have an insane amount of data ready to play with.

8-Screenshot 2014-09-25 17.28.10