I wrote the first draft of this in roughly 2017 but left lots of details out. I happened to run across it again about 2 weeks ago, and it now seems quite apropos for the time we’re living in, so I spruced it up a bit, added some of the missing details, and I’m publishing it here. Enjoy!
I know it has to be true – The magic of human-level cognition isn’t the result of a brain in which every single piece is perfectly tuned for its very special and isolated task. Instead, there has to be some simple “principle of computation” that is repeated throughout the brain; some component that, when replicated, gives rise to this emergent property we call intelligence. Sure, we have a motor cortex, specialized to control movement. We have a visual cortex, specialized to process information from our eyes. And we have a pre-frontal cortex that assimilates all this information and plans what to do next – “That’s a snake! Don’t step on it!” But there is evidence that the base circuitry that makes up all these modules is actually quite general. At every point on the neocortex you see basically the same pattern: an interwoven computational “component” composed of 6 layers of stacked neurons, repeated with very little modification across the entire cortex. Besides the similar morphology, there is other evidence that the computational components are general. In several really weird brain rewiring studies, researchers redirected visual input to the auditory pathway and showed that animals can compensate quite well - effectively “seeing” with the part of their brain that was rightfully intended to hear stuff! (Mice (Lyckman et al., 2001), ferrets (Sur et al., 1988; Roe et al., 1990; Roe et al., 1992; Roe et al., 1993; Sharma et al., 2000), and hamsters (Schneider, 1973; Kalil & Schneider, 1975; Frost, 1982; Frost & Metin, 1985).)
Over my half-career in software development, I’ve started to collect some insights (or at least opinions) about how software can be built so that it is easy to maintain, use, and extend. Usually we hear of principles such as modularity, abstraction, loose coupling, and separation of concerns, and each of these is important to strive for. But I’ve found that behind all of these, there is a single, unifying principle – the reduction of cognitive load. In this post I talk about what I’ve come to think of as the primary objective of software design: minimizing total cognitive load of all future users and maintainers of your software.
Vector search has taken the world by storm. The idea is this - cast documents into a vector embedding space via some transformation (e.g. BERT) and then index them into a vector database that is capable of fast approximate nearest neighbors search. Then when the user makes a text query, cast their query into this same embedding space and find the nearest vectors to the query vector and return the corresponding documents.
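Just to make the idea concrete, here’s a minimal sketch of that pipeline (I’m assuming the sentence-transformers library as the embedding model and plain brute-force cosine similarity in place of a real approximate-nearest-neighbor index; the corpus is obviously made up):

```python
from sentence_transformers import SentenceTransformer
import numpy as np

# Embed the document corpus once and "index" it (here the index is just a matrix).
model = SentenceTransformer("all-MiniLM-L6-v2")
docs = [
    "How to repot a snake plant",
    "Trail running shoes for wide feet",
    "Beginner's guide to sourdough starters",
]
doc_vecs = model.encode(docs, normalize_embeddings=True)

# At query time, cast the query into the same embedding space and
# return the documents whose vectors are nearest to the query vector.
query_vec = model.encode(["running shoes"], normalize_embeddings=True)[0]
scores = doc_vecs @ query_vec              # cosine similarity (vectors are normalized)
for i in np.argsort(-scores)[:2]:
    print(docs[i], float(scores[i]))
```

A real system swaps that matrix for a vector database so the nearest-neighbor lookup stays fast when the corpus grows to millions of documents.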
I realized today that I don’t write blog posts most of the time. If you were to look at my private notes, for every blog post that I’ve published here, I have probably 10-20 half-baked ideas that would be great to write about… but I just don’t have time to get around to them. Why not? Well, because frankly, writing a blog post is a time risk. If I determine myself to write a post on some Friday afternoon, it’s very possible that I could be signing up for a whole-weekend work task, or even more if it’s really a good post. Some good posts require research and creating proofs. All good posts require lots of time for actually crafting the text and refining it. What’s more, blogging is a rather lonely task. All this preparation I do alone, and that takes me away from time I could be spending around people – which I much prefer.
(Note to reader: I think I wrote this post for myself. From an outside perspective, it’s by far the most boring one I’ve ever written. But it’s math that’s been occupying my mind for a week and from an inside perspective it’s been quite fun. Maybe you’ll find the fun in it that I did.)
I’ve been playing with my Twitter social graph recently, and it occurred to me that the people that I’m friends with form several clusters. I wanted to see if I could come up with some sort of clustering algorithm to identify these clusters. Why? Well for one, it could be of practical use; maybe I can find some good use for it. But, perhaps more than that, I was curious if I could make a clustering algorithm – I’ve kinda got a thing for reinventing wheels.
I like positing hypotheses that are completely unverified and poorly examined. Why? Because it’s easier to play with ideas when you don’t have to check your work. 🤣 Here are two somewhat related hypotheses about how evolution has made two very different jiggly things more durable and resistant to distress: your brain and trees.
You’ve got a problem: a small subset of abusive users are body-slamming your API with extremely high request rates. You’ve added windowed rate limiting, and this reduces the load on your infrastructure, but the behavior persists. These naughty users make no attempt to rate-limit their own requests. They fire off as many requests as they can, almost immediately hit HTTP 429 Too Many Requests, and even then don’t let up. As soon as a new rate limit window opens, the pattern starts all over again.
In the companion post I introduced a problem with naive, window-based rate limiters – they’re too forgiving! The user’s request count is stored in a key in Redis with a TTL of, say, 15 minutes, and once the key expires, the abusive user can come back and immediately offend again. Effectively the abusive user is using your infrastructure to rate limit their requests.
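To make that concrete, the naive limiter in question looks roughly like this (a sketch with redis-py; the key naming, the 1,000-request limit, and the helper name are just for illustration):

```python
import redis

r = redis.Redis()

WINDOW_SECONDS = 15 * 60   # the 15-minute TTL from the post
MAX_REQUESTS = 1000        # made-up limit, purely for illustration

def allow_request(user_id: str) -> bool:
    """Naive fixed-window rate limiter. Too forgiving: once the key expires,
    the abuser's history vanishes and they get a fresh window to offend in."""
    key = f"ratelimit:{user_id}"
    count = r.incr(key)                 # INCR creates the key at 1 if it's missing
    if count == 1:
        r.expire(key, WINDOW_SECONDS)   # start the window on the first request
    return count <= MAX_REQUESTS        # False means respond with HTTP 429
```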
A while back I built a PEG (Parsing Expression Grammar) parser in golang. I wasn’t blogging at the time, so the idea slipped under the radar. Here’s a link to the codebase.
About three and a half years ago I came up with a clever trick for accurately approximating and tightly bounding a cumulative distribution in a rather small data structure. It’s high time that I blogged about it! In this post I’ll talk about the problem space, my technique, the potential benefits of my approach over other approaches, and ways to improve in the future.
It has gnawed on my subconscious for the past 5 years. Even as I wrote Relevant Search it was there at the back of my mind weighing me down - the fundamental problem of search. But only now has the problem taken shape so that I can even begin to describe it. Succinctly, here it is:
On April 10th and 11th, OpenSource Connections held their first (annual, I hope) Haystack search relevance conference. It was intended to be a small, casual, 50-person conference but ended up pulling in roughly 120 people, requiring OSC to scramble to find more space. The end result was one of the best conferences I’ve ever attended. In general, conference speakers have to aim their content at the lowest common denominator so that they don’t lose their audience. At this conference, the lowest common denominator was really high! There was no need to over-explain the boring introductory topics; instead, the speakers were able to jump into interesting and deep content immediately.
Today I had a Penny Chat with Will Acuff discussing how organizations can form relationships with communities. Will should know: he and his wife Tiffany founded Corner to Corner, a group that has made huge inroads into helping underprivileged communities in Nashville. I want to learn about this because my church (New Garden Church) is making a concerted effort right now to better connect with our community. In some ways we are positioned perfectly to do this - our church services are held in Dupont Tyler Middle School. However, we have yet to build meaningful relationships with the people in our community outside of our congregation. So we’re looking for help!
Click tracking is a way of boosting documents based upon the historical clickthrough rate that they received when surfaced in search results. Here’s how it works: Let’s say that we’re building click tracking for an online store and we want to boost the documents that are getting the most attention. First you set up logging so that you can count how many times a particular item is clicked. Next you have a process that aggregates the clicks across, say, a week, and you store the value in a click_count field alongside the documents that you are serving from search. Finally, when someone performs a search, you boost the results according to the click_count so that items with high clickthrough rates start surfacing higher in search results. But if you think hard, there’s a pretty nasty problem with this approach.
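To make the aggregate-then-boost flow concrete, here’s a toy sketch (the log1p dampening is just one way to do the boost, and the function names are my own; pick whatever formula suits your data):

```python
import math
from collections import Counter

def aggregate_clicks(click_log):
    """click_log: an iterable of item ids, one per logged click this week.
    The resulting counts get written back to each document's click_count field."""
    return Counter(click_log)

def boosted_score(text_score: float, click_count: int) -> float:
    """At query time, fold the stored click_count into the ranking score.
    log1p dampens the count so popular items don't drown out text relevance."""
    return text_score * (1.0 + math.log1p(click_count))
```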
Embedding spaces are quite trendy right now in machine learning. With word2vec, for example, you can create an embedding for words that is capable of capturing analogy. Given the input “man is to king as woman is to what?”, a word2vec embedding can be used to correctly answer “queen”. (Remarkable, isn’t it?) Embeddings like this can be used in a wide variety of domains. For example, facial photos can be projected into an embedding space and used for facial recognition tasks. However, I wonder if embeddings fall short in a domain that I am very close to - search. Consider the facial recognition task: Each face photo is converted into an N-dimensional vector where N is often rather high (hundreds of values). Given a sample photograph of a face, if you want to find all of the photos of that person then you have to search for all the photo vectors near the sample photo’s vector. But, due to the curse of dimensionality, very high dimensional embedding spaces are not amenable to data structures commonly used for spatial search, such as k-d trees.
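To make the search step concrete, here’s a toy brute-force version with random vectors standing in for real face embeddings (the dimensionality and corpus size are made up; the point is that in hundreds of dimensions you effectively end up scanning everything):

```python
import numpy as np

rng = np.random.default_rng(0)
N_DIM = 512                      # face embeddings are often hundreds of dimensions
face_vecs = rng.normal(size=(10_000, N_DIM))
face_vecs /= np.linalg.norm(face_vecs, axis=1, keepdims=True)

def nearest_faces(sample_vec: np.ndarray, k: int = 5) -> np.ndarray:
    """Brute-force nearest-neighbor search by cosine similarity. In this many
    dimensions a k-d tree degrades to roughly a full scan anyway, which is why
    approximate methods (LSH, HNSW, ...) are the usual workaround."""
    sample_vec = sample_vec / np.linalg.norm(sample_vec)
    sims = face_vecs @ sample_vec        # similarity against every stored photo
    return np.argsort(-sims)[:k]         # indices of the k most similar photos
```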
As many of my friends know, I’ve picked up neuroscience as a sort of side hobby. (Some people collect stamps, I memorize anatomical structures of the brain.) Last time I blogged about this was regarding my Penny Chat with Stephen Bailey on his work with MRIs. But this week I sat down with one of Stephen’s friends David Simon to talk about his research involving Electroencephalography a.k.a. EEG.
A week ago I met with an aspiring entrepreneur who had some interesting ideas for a recruitment startup. But during the conversation I got the feeling that he was holding his cards close, and I was having a little trouble getting the whole picture. Towards the end he confided that he was really invested in his ideas for the startup and that it actually hurt to hear them criticized.
I was lucky enough last week to find myself drinking a beer with Pat Poels, Eventbrite VP of engineering and two-time World Series of Poker bracelet winner. And I was luckier still that he was in the mood to talk about his poker days. I love hearing these stories but I’m always reluctant to ask because I suspect people ask him about “the poker days” all the time.
As a developer, my understanding of and respect for software testing have been slow in coming: in my previous work as an engineer and a consultant, it wasn’t yet obvious how important testing really is. But over the past year I have finally gained an appropriate respect and appreciation for testing, and it’s even improving the way I write code. In this post I will explain where I’ve come from and how far I’ve traveled in my testing practices. I’ll then list some of the more important principles I’ve picked up along the way.
In his 1943 paper, A Theory of Human Motivation, Abraham Maslow introduced a simple principle that has had a profound influence on the fields of psychology and sociology. Namely, he introduced the concept of a hierarchy of human needs, which he termed Physiological, Safety, Belongingness and Love, Esteem, Self-Actualization, and Self-Transcendence. Maslow’s main point was that it is necessary to first satisfy the basic needs before we even have the luxury of worrying about the higher-level concerns. But for me, looking through Maslow’s hierarchy in some detail, it seems that all the cool kids hang out towards the top. I’ve been there in the past, and I’m occasionally so fortunate as to touch the top of the hierarchy again from time to time. But I think that I (we) can do better than this! So I determined myself to try and devise a way to “hack” Maslow’s Hierarchy so as to maximize the time I spend near the top.
If you play around much with graphs, one of the first things you’ll run into is the idea of network centrality. Centrality is a value associated with a node that represents how important and how central that node is to the network as a whole. There are actually quite a few ways of defining network centrality - here are just a few:
This is the first post of what I hope will be many to come. Being the first, I feel it is important to lay out the themes I intend to cover and the goals I expect to achieve. However – having not the slightest idea of what I will do with this blog, or even whether I’ll do anything with it at all, you’ll have to be satisfied with the pretentious-sounding introductory material you are currently reading.