A simple CLI tool for efficient sampling of lines from large files.
Lines are selected randomly, and by default the output may contain the same line more than once (i.e. sampling with replacement).
Usage:
quicklines -c HOW_MANY_LINES my-huge-file.txt
quicklines will return the requested number of lines from random positions in the input file.
Optionally, you can sample without replacement by using --no-duplicates.
Be careful with this option: if you ask for a sample that is too large for the input file, the program
may run forever (or for a very long time).
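The README does not say why a large sample can stall. One plausible explanation, assuming the tool simply redraws whenever it hits a line it has already returned, is sketched below in Python. The names `sample_distinct`, `draw`, and `max_attempts` are hypothetical and are not part of quicklines; the attempt limit is an illustrative safeguard, not a documented feature.

```python
import random


def sample_distinct(draw, count, max_attempts=None):
    """Collect `count` distinct values by redrawing on repeats.

    `draw` is any zero-argument callable returning a hashable value.
    Without `max_attempts`, the loop never ends if fewer than `count`
    distinct values exist (assumed behaviour, for illustration only).
    """
    seen = set()
    result = []
    attempts = 0
    while len(result) < count:
        if max_attempts is not None and attempts >= max_attempts:
            raise RuntimeError("gave up: sample is too large for the input")
        attempts += 1
        value = draw()
        if value not in seen:  # reject repeats and draw again
            seen.add(value)
            result.append(value)
    return result


if __name__ == "__main__":
    rng = random.Random()
    # Three possible values: asking for three works, asking for four cannot.
    print(sample_distinct(lambda: rng.randrange(3), 3))
    try:
        sample_distinct(lambda: rng.randrange(3), 4, max_attempts=1000)
    except RuntimeError as exc:
        print(exc)
```

Under this assumption, the closer the requested count gets to the number of distinct lines, the more draws are wasted on repeats, and a count above that number can never be satisfied.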
The implementation relies on mmap to work efficiently.
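As a rough illustration of why mmap suits this job, the Python sketch below picks random byte offsets in a memory-mapped file and expands each one to the enclosing line, so only the pages that are actually touched get read. This is not the quicklines implementation; `sample_lines` is a hypothetical name, and choosing uniform byte offsets is an assumption that makes longer lines more likely to be picked.

```python
import mmap
import random


def sample_lines(path, count, rng=None):
    """Return `count` lines (as bytes) found at random byte offsets.

    Sampling is with replacement. The file must be non-empty, because
    an empty file cannot be memory-mapped.
    """
    rng = rng or random.Random()
    lines = []
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        size = len(mm)
        for _ in range(count):
            pos = rng.randrange(size)
            # Start of line: just after the previous newline (or offset 0).
            start = mm.rfind(b"\n", 0, pos) + 1
            # End of line: the next newline at or after pos (or end of file).
            end = mm.find(b"\n", pos)
            if end == -1:
                end = size
            lines.append(mm[start:end])
    return lines


if __name__ == "__main__":
    import sys
    for line in sample_lines(sys.argv[1], 5):
        print(line.decode("utf-8", errors="replace"))
```

Each sample costs two short scans around one offset, so the running time depends on the number of lines requested rather than on the size of the file.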