summaryrefslogtreecommitdiffstats
path: root/proposals/2020_py37_refactors.md
diff options
context:
space:
mode:
authorBryan Newbold <bnewbold@robocracy.org>2020-01-03 16:05:07 -0800
committerBryan Newbold <bnewbold@robocracy.org>2020-01-03 16:05:07 -0800
commit3a57c35ddcf794d7211d1649e74a9917bd1c9495 (patch)
tree48f0c420d2048eeeec4add0b4523cb6a0a14dcfe /proposals/2020_py37_refactors.md
parentd5e2af24cb6563eca91407425cea9b808a7d691c (diff)
downloadfatcat-3a57c35ddcf794d7211d1649e74a9917bd1c9495.tar.gz
fatcat-3a57c35ddcf794d7211d1649e74a9917bd1c9495.zip
proposals: standardize a bit
Diffstat (limited to 'proposals/2020_py37_refactors.md')
-rw-r--r--proposals/2020_py37_refactors.md101
1 files changed, 0 insertions, 101 deletions
diff --git a/proposals/2020_py37_refactors.md b/proposals/2020_py37_refactors.md
deleted file mode 100644
index f0321b33..00000000
--- a/proposals/2020_py37_refactors.md
+++ /dev/null
@@ -1,101 +0,0 @@
-
-status: planning
-
-If we update fatcat python code to python3.7, what code refactoring changes can
-we make? We currently use/require python3.5.
-
-Nice features in python3 I know of are:
-
-- dataclasses (python3.7)
-- async/await (mature in python3.7?)
-- type annotations (python3.5)
-- format strings (python3.6)
-- walrus assignment (python3.8)
-
-Not sure if the walrus operator is worth jumping all the way to python3.8.
-
-While we might be at it, what other superficial factorings might we want to do?
-
-- strict lint style (eg, maximum column width) with `black` (python3.6)
-- logging/debugging/verbose
-- type annotations and checking
-- use named dicts or structs in place of dicts
-
-## Linux Distro Support
-
-The default python version shipped by current and planned linux releases are:
-
-- ubuntu xenial 16.04 LTS: python3.5
-- ubuntu bionic 18.04 LTS: python3.6
-- ubuntu focal 20.04 LTS: python3.8 (planned)
-- debian buster 10 2019: python3.7
-
-Python 3.7 is the default in debian buster (10).
-
-There are apt PPA package repositories that allow backporting newer pythons to
-older releases. As far as I know this is safe and doesn't override any system
-usage if we are careful not to set the defaults (aka, `python3` command should
-be the older version unless inside a virtualenv).
-
-It would also be possible to use `pyenv` to have `virtualenv`s with custom
-python versions. We should probably do that for OS X and/or windows support if
-we wanted those. But having a system package is probably a lot faster to
-install.
-
-## Dataclasses
-
-`dataclasses` are a user-friendly way to create struct-like objects. They are
-pretty similar to the existing `namedtuple`, but can be mutable and have
-methods attached to them (they are just classes), plus several other usability
-improvements.
-
-Most places we are throwing around dicts with structure we could be using
-dataclasses instead. There are some instances of this in fatcat, but many more
-in sandcrawler.
-
-## Async/Await
-
-Where might we actually use async/await? I think more in sandcrawler than in
-the python tools or web apps. The GROBID, ingest, and ML workers in particular
-should be async over batches, as should all fetches from CDX/wayback.
-
-Some of the kafka workers *could* be aync, but i'm not sure how much speedup
-there would actually be. For example, the entity updates worker could fetch
-entities for an editgroup concurrently.
-
-Inserts (importers) should probably mostly happen serially, at least the kafka
-importers, one editgroup at a time, so progress is correctly recorded in kafka.
-Parallelization should probably happen at the partition level; would need to
-think through whether async would actually help with code simplicity vs. thread
-or process parallelization.
-
-## Type Annotations
-
-The meta-goals of (gradual) type annotations would be catching more bugs at
-development time, and having code be more self-documenting and easier to
-understand.
-
-The two big wins I see with type annotation would be having annotations
-auto-generated for the openapi classes and API calls, and to make string
-munging in importer code less buggy.
-
-## Format Strings
-
-Eg, replace code like:
-
- "There are {} out of {} objects".format(found, total)
-
-With:
-
- f"There are {found} out of {total} objects"
-
-## Walrus Operator
-
-New operator allows checking and assignment together:
-
- if (n := len(a)) > 10:
- print(f"List is too long ({n} elements, expected <= 10)")
-
-I feel like we would actually use this pattern *a ton* in importer code, where
-we do a lot of lookups or cleaning then check if we got a `None`.
-