author Bryan Newbold <bnewbold@archive.org> 2022-06-24 19:53:18 -0700
committer Bryan Newbold <bnewbold@archive.org> 2022-06-24 19:53:18 -0700
commit 0ac571f41a2a0be84dbd360805e3c4e51686681d (patch)
tree 4fcefec133205f12b01141bc4e2fced9d2bf6173
parent 31f1d3d18566d1ae97daaedbd17132a11a7ab5aa (diff)
commit old notes
-rw-r--r-- notes/background.txt 10
-rw-r--r-- notes/plan.txt 58
2 files changed, 68 insertions, 0 deletions
diff --git a/notes/background.txt b/notes/background.txt
new file mode 100644
index 0000000..9449861
--- /dev/null
+++ b/notes/background.txt
@@ -0,0 +1,10 @@
+
+## Libraries
+
+- [tablib](http://docs.python-tablib.org/en/latest/)
+- [records](https://github.com/kennethreitz-archive/records)
+
+## File Formats
+
+- column stores like parquet, arrow
+- data packages
diff --git a/notes/plan.txt b/notes/plan.txt
new file mode 100644
index 0000000..8f09fc0
--- /dev/null
+++ b/notes/plan.txt
@@ -0,0 +1,58 @@
+
+x write basic example from TSV file
+x pass-through basic thing in pipeline
+x pretty printer (using column writing, term color)
+- convert
+ to/from TSV
+ to/from JSON
+- example apps
+ ls
+ stat
+ df, mount, something like that
+ trivial web server (or other thing that logs)
+- example datasets (eg, for benchmarking): compare TSV, AFT, JSON
+ million-line CDX
+ log file
+- manpage (?)
+- build .deb and installable
+- helper library
+ => R/W traits/wrappers
+ => header struct
+ => iterate rows (from input)
+ => pretty-print output based on tty status
+ => validation/check modes
+ => stream mode/helper for subprocesses
+- tests
+- compare with xsv command (?)
+- reimplement basic commands
+ cut (accept field names)
+ cat (combining files with compatible headers)
+ head, tail
+ wc (count rows, records)
+ format (accept field names)
+ grep/match/filter by column value?
+ paste
+ uniq (by column)
+ sort (by column)
+ join
+
+- extended commands
+ parallel (with column names)
+ shuf
+ comm
+ expand/unexpand
+ nl
+ seq
+
+ideas:
+- python stuff
+- C stuff
+- log log format integration
+- rust serde integration
+- aft-header: pretty-prints header as rows
+- aft-single: pretty-print single row (first?) as rows
+- aft-format (or printf?) "this {col1} to that {col2}" "some other column"
+- aft2json, json2aft
+- aft2html
+- aft-stats: sum, mean, stddev, min, max
+- aft-sql
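
The plan's "cut (accept field names)" item can be illustrated with a minimal sketch. It is not code from this repository: it works on plain TSV with a header row rather than the AFT format itself (which these notes do not specify), and the function name `cut_by_name` is hypothetical.

```python
import io


def cut_by_name(lines, names, delimiter="\t"):
    """Yield rows (header first) keeping only the named columns, in the order requested.

    Assumes the first line of the input is a header row of field names.
    """
    rows = (line.rstrip("\n").split(delimiter) for line in lines)
    header = next(rows, None)
    if header is None:
        return  # empty input: nothing to emit
    missing = [name for name in names if name not in header]
    if missing:
        raise KeyError("unknown field(s): " + ", ".join(missing))
    # Resolve each requested name to its column position once, up front.
    indexes = [header.index(name) for name in names]
    yield list(names)
    for row in rows:
        yield [row[i] for i in indexes]


if __name__ == "__main__":
    sample = io.StringIO(
        "name\tsize\tmode\n"
        "plan.txt\t58\t0644\n"
        "background.txt\t10\t0644\n"
    )
    for row in cut_by_name(sample, ["mode", "name"]):
        print("\t".join(row))
```

Selecting by name rather than by position means a pipeline keeps working when an upstream tool adds or reorders columns, which is presumably the motivation for the "(accept field names)" variants listed in the plan.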