13 April 2026

Tidbits from the World

Just quotes, etc. where I frequently want the original source… 

  1. “Legacy” is not a pejorative.

  2. “Invert, always, invert”

    1. Jacobi originally, but popularized by Munger: “...it is in the nature of things that many hard problems are best solved when they are addressed backward”

    2. Algorithm for inversion:

      1. Define the problem - what is it that you're trying to achieve?

      2. Invert it - what would guarantee the failure to achieve this outcome?

      3. Finally, consider solutions to avoid this failure.

  3. “[Not naming software tools] is actually a lesson that originates from the early days of Call of Duty at IW. From what I’m told, IW never named their engine, because any time you name something it becomes a bit more of a thing, and you start working for it. And they were a studio making games, the game is the thing. Nothing else.” [source]

  4. “You have to do basic science on your libraries to see how they work…” [Sussman as reported by Wingo]

  5. “A pointer is an integer with a shiv” [source]

  6. Many optimal stopping problems, e.g. the classical secretary problem, are solved with the odds algorithm.

  7. “The Statistics of Sharpe Ratios” by Andrew Lo [paper]

  8. “You are in a drawdown.  When should you start worrying?” [paper]

  9. NIST has several excellent references:

    1. Dictionary of Algorithms and Data Structures (DADS), a dictionary of algorithms, algorithmic techniques, data structures, archetypal problems, etc.

    2. Engineering Statistics Handbook with chapters Explore, Measure, Characterize, Model, Improve, Monitor, Compare, and Reliability

  10. “If you have a procedure with 10 parameters, you probably missed some.” [source]

  11. Driscoll-Kraay standard errors are robust to both cross-sectional and temporal dependence.

  12. The Python Graph Gallery is a visual reference of many chart types along with the source code to produce them.

  13. Total least squares accounts for observational errors in both the dependent and independent variables, unlike ordinary least squares, which assumes errors only in the dependent variable.

  14. Bottom Line Up Front (BLUF) writing [HBR, Wikipedia]

  15. Dennis Ritchie: “Our habit of trying to document bugs and limitations visibly was enormously useful to the system. As we put out each edition, the presence of these sections shamed us into fixing innumerable things rather than exhibiting them in public. I remember clearly adding or editing many of these sections, then saying to myself 'I can't write this,' and fixing the code instead.” [source]

  16. A bestiary of regression variants, including:

    1. Linear Regression

    2. Polynomial Regression

    3. Logistic Regression

    4. Quantile Regression

    5. Ridge Regression

    6. Lasso Regression

    7. Elastic Net Regression

    8. Principal Components Regression (PCR)

    9. Partial Least Squares (PLS) Regression

    10. Support Vector Regression

    11. Ordinal Regression

    12. Poisson Regression

    13. Negative Binomial Regression

    14. Quasi Poisson Regression

    15. Cox Regression

    16. Tobit Regression

  17. A Bestiary of Functions for Systems Designers is a nice visual guide.

  18. Karpathy blogged A Recipe for Training Neural Networks:

    1. Become one with the data

    2. Set up the end-to-end training/evaluation skeleton + get dumb baselines

    3. Overfit, recalling deep double descent

    4. Regularize

    5. Tune

    6. Squeeze out the juice

  19. Agustin Lebron’s The Laws of Trading:

    1. Know why you are doing a trade before you trade.

    2. You're never happy with the amount you traded.

    3. Take only the risks you're being paid to take.  Hedge the others.

    4. Put on a risk using the most liquid instrument for that risk.

    5. If you can't explain your edge in five minutes, you don't have a very good one.

    6. The long-term profitability of an edge is inversely proportional to how long it takes to explain it.

    7. The model expresses the edge.

    8. If you think your costs are negligible relative to your edge, you're wrong about at least one of them.

    9. Just because something has never happened doesn't mean it can't.

      Corollary: Enough people relying on something being true makes it false.

    10. Working to align everyone's interests is time well spent.

    11. If you don't master technology and data, you're losing to someone who does.

    12. If you're not getting better, you're getting worse.

  20. Shellhaters.org has a great POSIX Shell and Utilities Quick Reference

  21. Practical SQL for Data Analysis: What you can do without Pandas [source]

  22. Taguchi methods are statistical methods, sometimes called robust design methods, developed by Genichi Taguchi to improve the quality of manufactured goods, and more recently also applied to engineering…. See also orthogonal arrays and optimal experimental design, and notice that Sandia’s Dakota implements many of these ideas.

  23. Prefer to change the code rather than write a workaround [source]

  24. By nature, bugs are found in places that you didn’t think to look [source]

    1. A cold path is a path through the code or situation that rarely happens.

    2. By contrast, hot paths happen frequently. You don’t find bugs in hot paths.

    3. Bugs are always in cold paths — every bug is found in a path colder than all the paths you tested.

    4. Don’t have cold paths and avoid fallbacks

  25. Your Makefiles are wrong is an opinionated approach to GNU Make:

    1. Don’t use tabs; set .RECIPEPREFIX instead

    2. Always use (a recent) bash in strict mode

    3. Change some Make defaults, with a nice TLDR preamble

  26. Mike Acton’s Expectations of Professional Software Engineers [summary]

  27. Scalability! But at what COST? says “Big data systems may scale well, but this can often be just because they introduce a lot of overhead” and is often referred to as “The COST Paper”

  28. strace Wow Much Syscall by Brendan Gregg includes useful strace one-liners.

  29. Richard Sutton’s The Bitter Lesson starts “The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin….” and begins to conclude “One thing that should be learned from the bitter lesson is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are search and learning.”

  30. Syntax across languages is a gigantic survey of language design, e.g. operators/functions/etc. The survey can often inform what some collection of users will find natural or unnatural based on their existing skills.

  31. Nick Higham’s “What Is” article series gives brief descriptions of concepts in numerical analysis for upper-level undergraduates with mathematical inclinations.

  32. Long names are long observes “identifiers that are too damn long.”

    1. Omit words that are obvious given a variable’s or parameter’s type

    2. Omit words that don’t disambiguate the name

    3. Omit words that are known from the surrounding context

    4. Omit words that don’t mean much of anything

      (e.g. data, state, amount, value, manager, engine, object, entity, and instance)

    5. Ends with a humorous application to waffles.

  33. How to help someone use a computer: “Computer people are fine human beings, but they do a lot of harm in the ways they "help" other people with their computer problems….”

  34. Common statistical tests are linear models:

    1. Exposition in R

    2. Exposition in Python

  35. Hoeffding’s test for dependence was proposed by Wassily Hoeffding (1948) as a test of independence for two variables with continuous distribution functions (wikipedia, random blogpost). Hoeffding’s D is a nonparametric measure of the distance between the joint distribution F(x, y) and the product of marginal distributions F1(x)F2(y). The advantage of this statistic is that it has more power to detect non-monotonic dependency structures than other more common measures (Pearson, Kendall, Spearman).

  36. Data systems often fall into one of schema-on-write or schema-on-read:

    1. Traditional databases are schema-on-write: “we can't upload data until the table is created and we can't create tables until we understand the schema of the data that will be in this table.”

    2. Newer systems, e.g. “Big Data” systems, are often schema-on-read, requiring all code that handles the data to understand its implicit schema.  Still “some level of schema design is inevitable”.

    3. Martin Kleppmann’s book Designing Data-Intensive Applications often comes up.

  37. Simple sabotage for software, taking CIA sabotage manuals for inspiration, enumerates many horrid, insidious ways to sabotage a software organization.  To do better as an organization, work to prevent even accidental sabotage.

  38. The Internet Was Designed With a Narrow Waist:

    1. A narrow waist is a concept, interface, or protocol that solves an interoperability problem.

    2. Picture an hourglass with M things on one side, N on the other, and an important concept in the middle.

    3. It avoids O(M × N) code explosions, letting us write O(M + N) amounts of code instead.

    4. Byte streams and text are essential narrow waists in software, particularly in distributed systems like the Internet.

  39. On achieving high-level goals [source pp 138-9]:

    1. Write down your five biggest goals for the year.

    2. Pick the one that will make the biggest difference in your life.  Tackle that one first.

    3. Write your chosen goal in the past tense as though it's already happened. This will not only tell you whether you truly want it by how you feel as you look back at it from the future, it will also help you identify (in imagined hindsight) all the critical steps it will take to get there.  Outline as many critical steps as come to mind.

    4. Write out twenty different things you could do right now to make it happen. Really stretch your thinking in new directions to get all twenty.  Now pick the one thing that will have the greatest impact on the achievement of that goal.  Try to knock one off your list each day, or week (depending on the goal).  Keep going with twenty more, and twenty more, until the goal is achieved.  Focus on only one goal at a time, putting all your energy in that one direction.  (Trying to work on them all at once can be overwhelming and diffuses your effectiveness and focus.)

  40. Thompson sampling is commonly used as a multi-armed bandit algorithm. I have personally used it for an older side project.

  41. “Shipping is a social construct within a company.”

  42. There’s a CLAUDE.md containing Karpathy-inspired Claude Code guidelines.

  43. Every layer of review makes you 10x slower, by the author of apenwarr/redo.
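
As a rough illustration of item 6, the odds algorithm can be sketched for the classical secretary problem. This is a minimal sketch, not a definitive implementation: it assumes the k-th of N candidates is the best seen so far with probability 1/k, so its odds are 1/(k - 1). The function name secretary_threshold is hypothetical, and awk is used only for the floating-point sum.

```bash
#!/bin/bash
# secretary_threshold N (hypothetical helper): print the 1-based index s
# from which the first "best so far" candidate should be accepted.
secretary_threshold() {
    awk -v n="$1" 'BEGIN {
        sum = 0
        # Sum the odds r_k = 1 / (k - 1) backward from the last candidate
        # and stop as soon as the running sum reaches 1.
        for (k = n; k >= 2; k--) {
            sum += 1 / (k - 1)
            if (sum >= 1) { print k; exit }
        }
        # The odds never summed to 1: accept from the first candidate on.
        print 1
    }'
}
```

For example, `secretary_threshold 10` prints 4: observe the first three candidates, then accept the next one that beats everyone seen so far.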

 

04 February 2018

Randomized addition worksheets for kiddos using Bash Pipelines

Following up on a prior randomized worksheet for reading, here's a way to generate as many simple addition worksheets as you care to print:

#!/bin/bash
# Print a US letter page of addition problems using numbers 0 through 10.
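# 26 problems -> columns filled across rows -> blank line after each row
# -> drop the trailing blank line -> print without header or borders.
(
    for i in $(seq 1 26); do
        printf "%2d %s %2d = \n" "$((RANDOM % 11))" "+" "$((RANDOM % 11))"
    done
) \
| column -x -c 36 \
| sed G \
| head -n -1 \
| a2ps -1 --chars-per-line=32 --no-header --borders=no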
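
To preview the problems in a terminal before sending them to the printer, the generator can be pulled into a function. A minimal sketch: the function name addition_problems and its two parameters are hypothetical and not part of the original script.

```bash
#!/bin/bash
# addition_problems COUNT MAX (hypothetical helper): print COUNT addition
# problems, one per line, with both operands drawn from 0..MAX.
addition_problems() {
    local count=${1:-26} max=${2:-10}
    local i
    for ((i = 0; i < count; i++)); do
        printf "%2d + %2d = \n" "$((RANDOM % (max + 1)))" "$((RANDOM % (max + 1)))"
    done
}
```

Piping `addition_problems 26 10` through the same `column` and `sed G` stages, without `a2ps`, shows the page layout on screen.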

03 June 2017

Randomized Reading Lists for Kiddos using Bash Pipelines

Our kiddo is very good at memorizing stories and word sequences but needs to work on looking at the letters on the page. Simple bash pipelines can produce useful offline learning materials.

For example, print a randomized word list from inline story text for a book your kid loves:

#!/bin/bash
tr -d '",!.-' << EOF                          \
    | tr -s ' ' '\n'                          \
    | sed '/^$/d'                             \
    | sort -u                                 \
    | shuf                                    \
    | column -c 64                            \
    | nl                                      \
    | a2ps -1 --chars-per-line=70 --no-header

The night Max wore his wolf suit and made mischief of one kind
and another
his mother called him "WILD THING!"
...

EOF

Run the script once to produce the word list. Print the source once to have the picture-less story. Carry both in your pocket on the plane/train/etc. After some work on the word list, the kid should be encouraged by "rediscovering" the picture-less story he knows using the sight words he's just practiced. In theory. Still working on the practice.
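
The word-extraction stages can be checked on their own before anything is shuffled or printed. A minimal sketch, where the function name unique_words is hypothetical and the stages are copied from the script above:

```bash
#!/bin/bash
# unique_words (hypothetical helper): read text on stdin and print each
# distinct word on its own line, with the listed punctuation stripped.
unique_words() {
    tr -d '",!.-' \
        | tr -s ' ' '\n' \
        | sed '/^$/d' \
        | sort -u
}

# Example: six distinct words, with the quotes and "!" removed.
printf '%s\n' 'his mother called him "WILD THING!"' | unique_words
```

The match is case-sensitive, so "The" and "the" stay separate entries, and the order `sort` prints them in depends on the locale.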

Another example, print a randomized list of kindergarten sight words:

#!/bin/bash
sort -u <<EOF | shuf | column -c 64 | nl | a2ps -1 --chars-per-line=70 --no-header
a
about
after
again
all
always
am
an
and
any
are
as
ask
at
away
back
ball
be
beautiful
because
been
begin
best
big
but
by
came
can
come
could
couldn't
day
did
do
does
don't
down
each
easy
eat
either
enough
family
find
for
friend
from
fun
get
girl
go
goes
going
got
great
had
has
have
he
her
here
high
him
his
home
house
how
I
idea
if
I'm
in
into
is
it
jump
just
know
last
let
like
little
look
love
make
man
me
might
mom
more
mother
much
my
never
next
no
not
now
of
often
on
or
our
out
over
play
pretty
probably
put
ran
read
ready
run
said
same
sat
saw
say
school
see
she
should
sit
so
soon
special
such
suddenly
take
than
that
the
their
them
themselves
then
there
they
they're
things
think
this
thought
three
through
to
today
together
too
two
under
until
up
us
very
wait
walk
want
was
we
went
were
what
when
where
while
who
will
with
without
yes
you
your
you're
yourself
EOF

11 December 2016

Travis-CI config for both recent(ish) gcc and clang

On account of the age of Travis-CI's build images (I hear), getting a new-ish C++ compiler going is futzy. After much futzing, the following .travis.yml file works for an autoconfiscated project:

language: generic
script: ./bootstrap && ./configure && make all && make check && make distcheck
matrix:
  include:
    - os: linux
      env: COMPILER_NAME=gcc CXX=g++-5 CC=gcc-5
      addons:
        apt:
          sources:
            - ubuntu-toolchain-r-test
          packages:
            - autotools-dev
            - g++-5
    - os: linux
      env: COMPILER_NAME=clang CXX=clang++-3.8 CC=clang-3.8
      addons:
        apt:
          sources:
            - ubuntu-toolchain-r-test
            - llvm-toolchain-precise-3.8
          packages:
            - autotools-dev
            - clang-3.8
