When working with large indexes, an index would sometimes just seem to disappear or become unavailable. I'd begun making a paranoid number of backups in case I had to restore one that performed a vanishing act. It turns out to be an open files problem. The operating system limits the number of files a process or session can have open at once, for security and resource-protection reasons. Search engines based on Lucene tend to have many small files open at the same time, because the index is split into many segments and each segment is stored as several files.
On Solr, I'd encountered the open files problem before: once it reached the limit, it stopped performing searches until some files were closed. For Elasticsearch, comparing the state before and after an index disappeared, it seemed that the open file limit had been reached and some files were simply removed, perhaps in an attempt to self-repair a failure condition.
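Anyway, since increasing the file limits, my indexes have stopped disappearing. There are two types of limits: soft file limits and hard file limits. The soft limit is the one actually enforced; a process can raise its own soft limit up to the hard limit, and only root can raise the hard limit.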
On Linux (and macOS), you can check the open file limits on the command line. This shows the hard limit:
ulimit -Hn
And this shows the soft limit:
ulimit -Sn
You can see all the resource limits (not just open files) like this:
ulimit -a
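To see how close a process actually is to the limit, it helps to put the limits next to a count of the files it has open. Here's a rough sketch of that for bash on Linux. The function name report_nofile_limits is just something I made up, and the open-file count relies on /proc, so it won't print that line on macOS. Note that the soft and hard values are those of the current shell, which may differ from the limits of another process you pass in.

```shell
#!/usr/bin/env bash
# report_nofile_limits is a hypothetical helper name, not a standard command.
# Usage: report_nofile_limits [pid]   (defaults to the current shell)
report_nofile_limits() {
    local pid="${1:-$$}"

    # Limits of the current shell session.
    echo "soft=$(ulimit -Sn)"
    echo "hard=$(ulimit -Hn)"

    # /proc/<pid>/fd holds one entry per open file descriptor (Linux only).
    if [ -d "/proc/$pid/fd" ]; then
        echo "open=$(ls "/proc/$pid/fd" | wc -l)"
    fi
}

report_nofile_limits
```

If the open count for your search engine's process creeps toward the soft value, that is a hint the limit is about to bite.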
On Linux, you need sudo privileges to edit /etc/security/limits.conf and increase the limits. I set my soft limit at 8192 files and my hard limit at 65536. I don't know if that is too little or too much, but I haven't seen the problem again.
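For reference, entries in that file take the form domain, type, item, value, and the item for open files is nofile. A minimal sketch with the values I used might look like this. The user name elasticsearch is an assumption; put in whichever account runs your search engine.

```
# /etc/security/limits.conf
# <domain>       <type>  <item>   <value>
# "elasticsearch" is an assumed user name; replace it with your own.
elasticsearch    soft    nofile   8192
elasticsearch    hard    nofile   65536
```

These limits are normally applied at login, so you will probably need to log out and back in (or restart the service) and then check again with ulimit -Sn and ulimit -Hn.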





