Elastic Search disappearing indexes

I've been using Elastic Search for a little while now, and when I indexed a large quantity of documents (3 million or more), an index would occasionally seem to disappear. Very frustrating. The cause was surprising.



When using large indexes, the index would sometimes just seem to disappear or no longer be available. I'd begun to make a paranoid amount of backups just in case I had to restore one that performed a vanishing act. It turns out to be an open files problem. The operating system limits the number of files a process or a session can have open at once for security reasons. Search engines based on Lucene tend to have many small files open at the same time, because the index is split into many small segment files.

On Solr, I'd encountered the open files problem before: when it reached the limit, it would stop performing searches until some files were closed. For Elastic Search, comparing the index before and after a disappearance, it seemed that the open file limit was reached and some files were simply removed, perhaps in an attempt to self-repair a failure condition.

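Anyway, since increasing the file limits, my indexes have stopped disappearing. There are two types of limits: soft file limits and hard file limits. The soft limit is the one actually enforced; a process can raise its own soft limit up to the hard limit, which only root can increase.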

On Linux (and macOS), you can check the hard open file limit on the command line like this:
    
ulimit -Hn

And the soft limit like this:
    
ulimit -Sn
    
You can see them all like this:

ulimit -a
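
The ulimit commands only report the limits for the current shell. To see how close a running process actually is to its limit, something like the sketch below may help. It is Linux-only because it reads /proc, and the function name open_files_report is made up for this example.

```shell
# Hypothetical helper: report how many files a process has open
# next to its soft "Max open files" limit. Linux only (uses /proc).
open_files_report() {
    pid="$1"
    if [ ! -d "/proc/$pid/fd" ]; then
        echo "no /proc entry for pid $pid" >&2
        return 1
    fi
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    count=$(ls "/proc/$pid/fd" | wc -l)
    # In /proc/<pid>/limits the 4th field of this row is the soft limit.
    limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "pid=$pid open=$count soft_limit=$limit"
}

# Example: inspect the current shell itself.
open_files_report "$$"
```

To point it at the search engine, pass that process's ID instead of $$. Inspecting a process owned by another user will probably need sudo.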

On Linux, you need sudo privileges to edit /etc/security/limits.conf and increase the limits. I set my soft limit at 8192 files and my hard limit at 65536. I don't know if that is too little or too much, but I haven't seen the problem again.
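
As a rough sketch, the entries in the pam_limits configuration file (/etc/security/limits.conf on most distributions) could look like the fragment below. The user name elasticsearch is an assumption; use whichever account actually runs the search engine. The values are the ones mentioned above.

```text
# <domain>       <type>  <item>   <value>
# "elasticsearch" is an assumed user name; nofile = max open files
elasticsearch    soft    nofile   8192
elasticsearch    hard    nofile   65536
```

These limits are applied at login, so the change should only show up in new sessions. Running ulimit -Sn and ulimit -Hn again as that user after logging back in is a quick way to confirm it took effect.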
 