Channel: Adrien Grand
Browsing latest articles
Browse All 10 View Live

How fast is bit packing?

One of the most anticipated changes in Lucene/Solr 4.0 is its improved memory efficiency. Indeed, according to several benchmarks, you could expect a 2/3 reduction in memory use for a Lucene-based...

View Article



What is the theory behind Apache Lucene?

There is a recurring request from users to have more insight into Lucene internals. For example, see: Lucene user mailing-list - lucene algorithm?, StackOverflow - How does Lucene index documents?, Quora...

View Article

Wow, LZ4 is fast!

I’ve been doing some experiments with LZ4 recently and I must admit that I am truly impressed. For those not familiar with LZ4, it is a compression format from the LZ77 family. Compared to other...

View Article

Efficient compressed stored fields with Lucene

Whatever you are storing on disk, everything usually goes perfectly well until your data becomes too large for your I/O cache. Until then, most disk accesses actually never touch disk and are almost as...

View Article

Stored fields compression in Lucene 4.1

Last time, I tried to explain how efficient stored fields compression can help when your index grows larger than your I/O cache. Indeed, magnetic disks are so slow that it is usually worth spending a...

View Article


lz4-java 1.0.0 released

I am happy to announce that I released the first version of lz4-java, version 1.0.0. lz4-java is a Java port of the lz4 compression library and the xxhash hashing library, which are both known for being...

View Article

Putting term vectors on a diet

What are term vectors? Term vectors are an interesting Lucene feature, which allows for retrieving a single-document inverted index for any document ID of your index. This means that given any document...

View Article

lz4-java 1.1.0 is out

I’m happy to announce the release of lz4-java 1.1.0. Artifacts can be downloaded from Maven Central and javadocs can be found at jpountz.github.com/lz4-java/1.1.0/docs/. Release highlights: lz4 has been...

View Article


Versatile sorting

Sorted data sets are very useful since they make a lot of things easier: checking for duplicates, computing frequencies (the number of times each unique element appears), compression (thanks to delta...

View Article


lz4-java 1.3.0 is out

A new release of lz4-java is out and already available on Maven Central. As usual, documentation and benchmarks (compression, decompression and hashing) are published at...

View Article



