In this post, I would like to talk about Elasticsearch and Postgresql search and their differences.
As you know, we have Elasticsearch implemented on Hive which is being rebranded as Hivesearcher (previously Esteem Search). System has been running for almost 2 years now without a glitch. As of this writing, there is no Hivemind or Postgresql based search engine on Hive yet, so this comparison is theoretical based on our limited tests/knowledge.
Elasticsearch is based on Apache Lucene as a core, adds REST API and few other improvements. Comparison should really be Lucene vs Postgresql.
Lucene is built by Apache just for one purpose, to make scalable, high-performance indexing and full-featured text search engine.
Postgresql is robust, relational database with SQL compliance and focuses on extensibility.
You can also check this page to see their differences. As you can see, they both built for different reasons and for different goals.
When choose Postgresql
We have been using Postgresql on our backend way before Hivemind came to live. And were glad to see Hivemind being written in Python and Postgres which is exactly what we use on our backend stuff for Esteem/Ecency. So we use Postgres for almost everything but not for search. We considered Postgres when building search engine because our backend was in Postgres and Hivemind was also using Postgres, so you would think it would make sense to just use same technology. After careful consideration and long term goals, we choose to go with the best search engine tech which is Elasticsearch. However, if you are web or mobile developer, consider using Postgres, it is one of the best database engine out there for relational data. You won't regret it because of opensource, community size and tools, extensions, it is growing.
When choose Elasticsearch
Elasticsearch can also be used as datasource/database but it is not designed for that purpose and you will fail miserably if you use it as datasource. What Elasticsearch does best is building search engine and index large dataset, analyze queries. So perfect for large and long living content Hive produces.
It is #1 search engine for a reason.
Postgres Full-Text Search bring quite good improvements and can offer good alternative search. But it is very hard to scale and also I don't think, one can perform complex queries like
"esteem surfer" -"monthly digest" -giveaway author:good-karma tag:esteem,wallet type:post. On the other hand, elasticsearch is easy to scale and you can perform any type of complex queries.
Query above searches for posts from @good-karma with two tags together #esteem and #wallet having exact match of "Esteem Surfer" phrase but excluding posts that include "monthly digests" or "giveaways". You cannot simply do this with Postgresql, you will end up creating indexes for very different use cases and end up with large and clunky performant data.
Creating search based on Postgres would require separated tables and indexes, clean body field because it includes html + markdown combine cleaned post body + post title to search a term in multiple fields at the same time.
Adding search into Hivemind is another story altogether because then we might be affecting performance of regular API queries. Postgres uses shared_buffers which means it tries to complete each task in shared buffers. We all know hivemind servers are quite busy with reorganizing active blockchain data and if search is added into it, those search results will take some place in shared buffers. In result, it will reduce hivemind's performance and rpc nodes.
So Postgresql doesn't offer a magic wand nor does Elasticsearch. But tools are used best for their designed purpose.