execute SQL-like queries on CSV and TSV files
What is CSV and TSV ? ๐
CSV stands for Comma-Separated Values. It is a plain-text file format.
TSV stands for Tab-Separated Values. It is a plain-text file format.
CSV is more common than TSV as a plain-text dataset format. Both are usually used for light weight data exchange. They can be used as a spreadsheet formats; as they are compatible with Microsoft Excel format (.xls, .xlsx) and LibreOffice Calc format (.ods).
What is q ? ๐
q is a tool to execute SQL-like queries on CSV and TSV files.
More information: https://harelba.github.io/q .
Usage ๐
Query a CSV file by specifying the delimiter as ‘,’:
q [-d|--delimiter] ',' "SELECT * from path/to/file"
for example:
q -d ',' "SELECT * from path/to/file"
Query a TSV file:
q [-t|--tab-delimited] "SELECT * from path/to/file"
for example:
q -t "SELECT * from path/to/file"
Query file with header row:
q [-d|--delimiter] delimiter [-H|--skip-header] "SELECT * from path/to/file"
for example:
q -d ',' -H "SELECT * from path/to/file"
Read data from stdin ; ‘-’ in the query represents the data from stdin:
output | q "select * from -"
Join two files (aliased as f1
and f2
in the example) on column c1
, a common column:
q "SELECT * FROM path/to/file f1 JOIN path/to/other_file f2 ON (f1.c1 = f2.c1)"
Format output using an output delimiter with an output header line (Note: command will output column names based on the input file header or the column aliases overridden in the query):
q [-D|--output-delimiter] delimiter [-O|--output-header] "SELECT column as alias from path/to/file"
example:
q -D ',' -O "SELECT column as alias from path/to/file"
I hope you enjoyed reading this post as much as I enjoyed writing it. If you know a person who can benefit from this information, send them a link of this post. If you want to get notified about new posts, follow me on YouTube , Twitter (x) , LinkedIn , and GitHub .