Tidbits from the World
Just quotes, etc. where I frequently want the original source…
“Invert, always invert”
Originally Jacobi, popularized by Munger: “...it is in the nature of things that many hard problems are best solved when they are addressed backward”
Algorithm for inversion:
Define the problem - what is it that you're trying to achieve?
Invert it - what would guarantee the failure to achieve this outcome?
Finally, consider solutions to avoid this failure
“[Not naming software tools] is actually a lesson that originates from the early days of Call of Duty at IW. From what I’m told, IW never named their engine, because any time you name something it becomes a bit more of a thing, and you start working for it. And they were a studio making games, the game is the thing. Nothing else.” [source]
“You have to do basic science on your libraries to see how they work…” [Sussman as reported by Wingo]
“A pointer is an integer with a shiv” [source]
Many optimal stopping problems, e.g. the classical secretary problem, are solved with Bruss’s odds algorithm.
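The classical secretary problem is easy to simulate. A minimal sketch of the familiar 1/e stopping rule (the special case the odds algorithm generalizes); all names here are my own:

```python
import math
import random

def secretary_trial(n, rng):
    """One run of the classical secretary problem with the 1/e rule:
    observe the first n/e candidates without choosing, then take the
    first candidate better than all of those. Returns True if the
    overall best candidate was chosen."""
    ranks = list(range(n))            # 0 is the best candidate
    rng.shuffle(ranks)
    cutoff = int(n / math.e)
    best_seen = min(ranks[:cutoff], default=n)
    for r in ranks[cutoff:]:
        if r < best_seen:             # first record-breaker: stop here
            return r == 0
    return ranks[-1] == 0             # forced to take the last candidate

rng = random.Random(42)
n, trials = 100, 20000
wins = sum(secretary_trial(n, rng) for _ in range(trials))
print(wins / trials)                  # close to 1/e ≈ 0.368
```

The simulated success rate hovers near 1/e regardless of n, which is the surprising part of the result.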
“The Statistics of Sharpe Ratios” by Andrew Lo [paper]
“You are in a drawdown. When should you start worrying?” [paper]
NIST has several excellent references:
Dictionary of Algorithms and Data Structures (DADS), a dictionary of algorithms, algorithmic techniques, data structures, archetypal problems, etc.
Engineering Statistics Handbook with chapters Explore, Measure, Characterize, Model, Improve, Monitor, Compare, and Reliability
“If you have a procedure with 10 parameters, you probably missed some.” [source]
Driscoll-Kraay standard errors are robust to both cross-sectional and temporal dependence.
The Python Graph Gallery is a visual reference covering many graph types, along with the source code to produce them.
Total least squares minimizes errors on both dependent and independent variables.
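A minimal sketch of total least squares for a line fit, using the standard SVD construction: the fitted line is orthogonal to the direction of least variance in the centered data. Variable names are my own:

```python
import numpy as np

def tls_line(x, y):
    """Fit y ≈ m*x + b by total least squares (errors in x AND y).
    The right singular vector for the smallest singular value of the
    centered data matrix is the line's normal direction."""
    x_c, y_c = x - x.mean(), y - y.mean()
    A = np.column_stack([x_c, y_c])
    _, _, vt = np.linalg.svd(A)
    vx, vy = vt[-1]                  # normal to the best-fit line
    m = -vx / vy
    b = y.mean() - m * x.mean()
    return m, b

rng = np.random.default_rng(0)
x_true = np.linspace(0, 10, 200)
x = x_true + rng.normal(0, 0.3, x_true.size)   # noise in x too
y = 2.0 * x_true + 1.0 + rng.normal(0, 0.3, x_true.size)
m, b = tls_line(x, y)
print(round(m, 2), round(b, 2))     # near the true slope 2 and intercept 1
```

Ordinary least squares on the same data would attribute all the noise to y; TLS splits it between both axes, which matters when x is measured, not controlled.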
Dennis Ritchie: “Our habit of trying to document bugs and limitations visibly was enormously useful to the system. As we put out each edition, the presence of these sections shamed us into fixing innumerable things rather than exhibiting them in public. I remember clearly adding or editing many of these sections, then saying to myself 'I can't write this,' and fixing the code instead.” [source]
A bestiary of regression variants, including:
Linear Regression
Polynomial Regression
Logistic Regression
Quantile Regression
Ridge Regression
Lasso Regression
Elastic Net Regression
Principal Components Regression (PCR)
Partial Least Squares (PLS) Regression
Support Vector Regression
Ordinal Regression
Poisson Regression
Negative Binomial Regression
Quasi Poisson Regression
Cox Regression
Tobit Regression
A Bestiary of Functions for Systems Designers is a nice visual guide
Karpathy blogged A Recipe for Training Neural Networks:
Become one with the data
Set up the end-to-end training/evaluation skeleton + get dumb baselines
Overfit, recalling deep double descent
Regularize
Tune
Squeeze out the juice
Agustin Lebron’s The Laws of Trading:
Know why you are doing a trade before you trade.
You're never happy with the amount you traded.
Take only the risks you're being paid to take. Hedge the others.
Put on a risk using the most liquid instrument for that risk.
If you can't explain your edge in five minutes, you don't have a very good one.
The long-term profitability of an edge is inversely proportional to how long it takes to explain it.
The model expresses the edge.
If you think your costs are negligible relative to your edge, you're wrong about at least one of them.
Just because something has never happened doesn't mean it can't.
Corollary: Enough people relying on something being true makes it false.
Working to align everyone's interests is time well spent.
If you don't master technology and data, you're losing to someone who does.
If you're not getting better, you're getting worse.
Shellhaters.org has a great POSIX Shell and Utilities Quick Reference
Practical SQL for Data Analysis: What you can do without Pandas [source]
Taguchi methods are statistical methods, sometimes called robust design methods, developed by Genichi Taguchi to improve the quality of manufactured goods, and more recently also applied to engineering…. See also orthogonal arrays and optimal experimental design, and note that Sandia’s Dakota implements many of these ideas.
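The orthogonal arrays behind Taguchi designs are concrete objects. A tiny sketch of the standard L4(2^3) array, with a check of its defining balance property (every pair of columns contains each level combination equally often):

```python
import numpy as np
from itertools import combinations, product

# L4(2^3): 4 runs cover 3 two-level factors, versus 2**3 = 8 runs
# for the full factorial design.
L4 = np.array([
    [0, 0, 0],
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
])

# Verify pairwise balance: each column pair shows every (level, level)
# combination exactly once.
for i, j in combinations(range(3), 2):
    pairs = sorted(zip(L4[:, i].tolist(), L4[:, j].tolist()))
    assert pairs == sorted(product([0, 1], repeat=2))
print("L4 is orthogonal")
```

Larger arrays (L8, L9, L16, …) follow the same pattern; libraries like Dakota generate and analyze them rather than hand-coding tables like this.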
Prefer to change the code rather than write a workaround [source]
By nature, bugs are found in places that you didn’t think to look [source]
A cold path is a path through the code or situation that rarely happens.
By contrast, hot paths happen frequently. You don’t find bugs in hot paths.
Bugs are always in cold paths — every bug is found in a path colder than all the paths you tested.
Don’t have cold paths and avoid fallbacks
Your Makefiles are wrong is an opinionated approach to GNU Make:
Don’t use tabs; set .RECIPEPREFIX instead
Always use (a recent) bash in strict mode
Change some Make defaults, with a nice TLDR pre-amble
Mike Acton’s Expectations of Professional Software Engineers [summary]
Scalability! But at what COST? says “Big data systems may scale well, but this can often be just because they introduce a lot of overhead” and is often referred to as “The COST Paper”
strace Wow Much Syscall by Brendan Gregg includes useful strace one-liners
Richard Sutton’s The Bitter Lesson starts “The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin….” and begins to conclude “One thing that should be learned from the bitter lesson is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are search and learning.”
Syntax across languages is a gigantic survey of language design, e.g. operators/functions/etc. The survey can often inform what some collection of users will find natural or unnatural based on their existing skills.
Nick Higham’s “What Is” article series gives brief descriptions of concepts in numerical analysis for upper-level undergraduates with mathematical inclinations
Long names are long observes “identifiers that are too damn long.”
Omit words that are obvious given a variable’s or parameter’s type
Omit words that don’t disambiguate the name
Omit words that are known from the surrounding context
Omit words that don’t mean much of anything
(e.g. data, state, amount, value, manager, engine, object, entity, and instance)
Ends with a humorous application to waffles.
How to help someone use a computer : “Computer people are fine human beings, but they do a lot of harm in the ways they "help" other people with their computer problems….”
Common statistical tests are linear models.
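As a concrete instance of that claim, the pooled two-sample t-test is exactly the t statistic on a group dummy in OLS. A numpy-only sketch (the data and names are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, 40)
b = rng.normal(0.5, 1.0, 40)

# Classical pooled two-sample t statistic.
na, nb = a.size, b.size
sp2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
t_classic = (b.mean() - a.mean()) / np.sqrt(sp2 * (1 / na + 1 / nb))

# The same test as a linear model: y = b0 + b1*group, test b1 = 0.
y = np.concatenate([a, b])
X = np.column_stack([np.ones(y.size), np.r_[np.zeros(na), np.ones(nb)]])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
s2 = resid @ resid / (y.size - 2)              # equals the pooled variance
se_b1 = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
t_ols = beta[1] / se_b1

print(np.isclose(t_classic, t_ols))            # the two t statistics coincide
```

The same trick extends to paired t-tests, ANOVA, ANCOVA, and (with rank transforms, approximately) their nonparametric cousins, which is the article's point.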
Hoeffding’s test for dependence was proposed by Wassily Hoeffding (1948) as a test of correlation for two variables with continuous distribution functions (wikipedia, random blog, post). Hoeffding’s D is a nonparametric measure of the distance between the joint distribution F(x, y) and the product of marginal distributions F1(x)F2(y). The advantage of this statistic is that it has more power to detect non-monotonic dependency structures than more common measures (Pearson, Kendall, Spearman).
Data systems often fall into one of schema-on-write or schema-on-read:
Traditional databases are schema-on-write: “we can't upload data until the table is created and we can't create tables until we understand the schema of the data that will be in this table.”
Newer systems, e.g. “Big Data” systems, are often schema-on-read, requiring all code that handles the data to understand its implicit schema. Still, “some level of schema design is inevitable”.
Martin Kleppmann’s book Designing Data-Intensive Applications often comes up.
Simple sabotage for software , taking CIA sabotage manuals for inspiration, enumerates many horrid, insidious ways to sabotage a software organization. To do better as an organization, work to prevent even accidental sabotage.
A narrow waist is a concept, interface, or protocol that solves an interoperability problem.
Picture an hourglass with M things on one side, N on the other, and an important concept in the middle.
It avoids O(M × N) code explosions, letting us write O(M + N) amounts of code instead.
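A toy sketch of the counting argument (all names hypothetical): M parsers map into one intermediate record format, N renderers map out of it, so supporting a new format costs one function rather than M or N converters.

```python
import csv, io, json

# Sources -> waist: each parser produces a plain dict, the "narrow waist".
def parse_json(text):
    return json.loads(text)

def parse_csv(text):
    return dict(next(csv.DictReader(io.StringIO(text))))

# Waist -> sinks: each renderer consumes only the plain dict.
def render_tsv(record):
    return "\t".join(f"{k}={v}" for k, v in record.items())

def render_json(record):
    return json.dumps(record, sort_keys=True)

# 2 parsers + 2 renderers give all 4 conversions; with M sources and
# N sinks this stays M + N functions instead of M × N converters.
rec = parse_csv("name,lang\nada,python\n")
print(render_json(rec))  # {"lang": "python", "name": "ada"}
```

The dict here stands in for the waist; in real systems the waist is something like byte streams, lines of text, or IP datagrams.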
Byte streams and text are essential narrow waists in software, particularly in distributed systems like the Internet.
On achieving high-level goals [source pp 138-9]:
Write down your five biggest goals for the year.
Pick the one that will make the biggest difference in your life. Tackle that one first.
Write your chosen goal in the past tense as though it's already happened. This will not only tell you whether you truly want it by how you feel as you look back at it from the future, it will also help you identify (in imagined hindsight) all the critical steps it will take to get there. Outline as many critical steps as come to mind.
Write out twenty different things you could do right now to make it happen. Really stretch your thinking in new directions to get all twenty. Now pick the one item that will have the greatest impact on the achievement of that goal. Try to knock one off your list each day, or week (depending on the goal). Keep going with twenty more, and twenty more, until the goal is achieved. Focus on only one goal at a time, putting all your energy in that one direction. (Trying to work on them all at once can be overwhelming and diffuses your effectiveness and focus.)
Thompson sampling is commonly used as a multi-armed bandit algorithm. I have personally used it for an older side project.
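A minimal Beta-Bernoulli Thompson sampling sketch (arm rates and names are made up): keep a Beta posterior per arm, sample one draw from each, and pull the arm with the largest draw.

```python
import random

def thompson_bandit(true_rates, rounds, rng):
    """Beta-Bernoulli Thompson sampling: a Beta(wins+1, losses+1)
    posterior per arm; each round, sample from every posterior and
    pull the argmax."""
    k = len(true_rates)
    wins, losses, pulls = [0] * k, [0] * k, [0] * k
    for _ in range(rounds):
        draws = [rng.betavariate(wins[i] + 1, losses[i] + 1) for i in range(k)]
        arm = draws.index(max(draws))
        pulls[arm] += 1
        if rng.random() < true_rates[arm]:   # simulate the reward
            wins[arm] += 1
        else:
            losses[arm] += 1
    return pulls

rng = random.Random(7)
pulls = thompson_bandit([0.2, 0.5, 0.8], rounds=2000, rng=rng)
print(pulls)  # pulls concentrate on the 0.8 arm
```

The appeal over epsilon-greedy is that exploration falls away naturally as posteriors sharpen, with no schedule to tune.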
There’s a CLAUDE.md containing Karpathy-inspired Claude Code guidelines.
Every layer of review makes you 10x slower, by apenwarr (of redo fame).
