Nat’s 2022 Technical Link Pile: Dev, Architecture, APIs

December 30, 2022 – 7:13 pm

See the Intro for context.

[20221231] Temporal.io – With Temporal, you write your various complex, potentially long-running actions as normal, synchronous code, which Temporal “magically” turns into distributed, asynchronous, automatically retried, idempotent jobs. The design is really elegant, and removes all of the hard work from writing code that must keep things in sync and handle failures through queueing and retrying. (as described in this Hacker News comment) Also: Temporal jobs/tasks are called workflows because the code is effectively translated into workflow steps – i.e. code progress is made by workers, and each step is persisted so that if the worker dies, you don’t lose anything – another worker picks up in the exact same place in the code, with all local variables and threads intact. It also provides helpful built-in features like retries and timeouts of anything that might fail, like network requests, and the ability to “sleep” for arbitrary periods without consuming threads/resources (other than a timer in a database that is automatically set up for you when you call `sleep('1 month')`).

[20221231] ULIDs and Primary Keys – omg there’s an RFC (4122) for UUIDs. There are variants. Author of this blog post likes ULIDs. They contain more randomness and a straightforward structure at the cost of not explicitly exposing the version, variant, or monotonic counter. ULIDs are also the perfect choice if you’re writing some apocalypse-scenario software (assuming you have a working computer) as they can continue to be generated until 10,889AD compared to UUIDv7’s measly 4,147AD death date.
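To make the ULID structure concrete, here is a minimal sketch based on the public ULID layout (a 48-bit millisecond timestamp followed by 80 random bits, written as 26 Crockford base32 characters). The function names are made up for illustration and this is not the blog author's code. The last line redoes the "until 10,889AD" arithmetic from the 48-bit millisecond range.

```python
import os
import time

# Crockford base32 alphabet (no I, L, O, U); it is in ASCII order, so the
# encoded strings sort the same way as the underlying numbers.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_ulid(timestamp_ms: int, randomness: bytes) -> str:
    """Pack a 48-bit timestamp and 80 random bits into 26 base32 characters."""
    if not 0 <= timestamp_ms < 2**48:
        raise ValueError("timestamp must fit in 48 bits")
    if len(randomness) != 10:
        raise ValueError("randomness must be exactly 10 bytes (80 bits)")
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")
    chars = []
    for _ in range(26):  # 26 * 5 = 130 bits, enough for the 128-bit value
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def new_ulid() -> str:
    """Hypothetical helper: a ULID for the current time."""
    return encode_ulid(time.time_ns() // 1_000_000, os.urandom(10))


# A 48-bit millisecond counter starting at 1970 runs out in this year.
last_year = 1970 + int(2**48 / (1000 * 86400 * 365.2425))
```

Because the timestamp occupies the leading characters, sorting ULIDs as plain strings also sorts them by creation time, which is the property that makes them attractive as primary keys.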

[20221223] How Pitfall Builds Its World – Crane used a single byte to represent the layout of the current room.

[20221223] Docker Without Containers — wasm managed with the same tools as Docker containers.

[20221221] Taming Names in Software Development – balancing short v long, conventional v distinctive; make your own pattern for names (eg adjective_noun_unabbreviatedUnits giving minimum_messageLength_bytes). A good read.

[20221221] You Might Not Need a CRDT – browsers are inherently not peer-to-peer. To run an application from the web, you connect to a server. Since the server is centralized anyway, we can have it enforce a global ordering over the events. That is, every replica receives events in the same order. With this, we can sidestep the need to pay the CRDT complexity tax.
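A toy sketch of that idea, assuming an in-memory server and last-write-wins key/value events. The `Server` and `Replica` classes are invented for illustration and are not from the article. The server's append-only log is the global order, and each replica replays the log from the position it last saw.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Central sequencer: an event's position in the log is its global order."""
    log: list = field(default_factory=list)

    def submit(self, event) -> int:
        self.log.append(event)
        return len(self.log) - 1  # sequence number assigned by the server

    def events_since(self, seq: int) -> list:
        return self.log[seq:]


@dataclass
class Replica:
    state: dict = field(default_factory=dict)
    applied: int = 0  # how many log entries this replica has applied

    def sync(self, server: Server) -> None:
        for key, value in server.events_since(self.applied):
            self.state[key] = value  # same order everywhere, so same result
            self.applied += 1


server = Server()
server.submit(("title", "Draft"))
server.submit(("title", "Final"))

a, b = Replica(), Replica()
a.sync(server)                    # a catches up early
server.submit(("author", "nat"))
a.sync(server)
b.sync(server)                    # b catches up late, in one go
```

Both replicas end in the same state even though they synced at different times, with no merge function needed.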

[20221221] Abstractions are Expensive – At companies with huge engineering forces, abstraction management is what a lot of them spend their time on. Often, these engineers are actually the most “productive” in terms of money saved – infrastructure projects tend to result in 8-9 figure savings or unique capabilities, and performance engineering (another form of abstraction alignment) frequently has 8 figure returns per engineer. Another large group of engineers is in charge of making sure that the old abstractions don’t break and crash the entire system.

[20221221] Webhook Architecture Design — awful verbatim transcription of someone with a lot of repetition and “mm hmm”, but the content is good.

[20221221] Event-Driven APIs with Webhook and API Gateway — part II of Building event-driven API services using CQRS, API Gateway and Serverless.

[20221211] GitPod – Spin up fresh, automated dev environments for each task, in the cloud, in seconds.

[20221210] Algorithms I Prototyped – CRDT Fractional Indexing; CRDT Tree-Based Indexing; CRDT: Mutable Tree Hierarchy: Sync a tree hierarchy between peers. Nodes in the tree retain their identity even if they are reparented somewhere else in the tree; Logarithmically-Spaced Snapshots: Keep periodic snapshots of something, but keep more snapshots of recent events and fewer snapshots of long-ago events.
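One simple way to get logarithmic spacing (an assumption on my part, not necessarily the linked author's algorithm) is to bucket snapshots by the power of two their age falls into and keep only the newest snapshot in each bucket. `prune_snapshots` is a hypothetical name, and timestamps are plain integers here.

```python
def prune_snapshots(timestamps, now):
    """Keep the newest snapshot in each power-of-two age bucket.

    Ages 0, 1, 2-3, 4-7, 8-15, ... each form one bucket, so recent history
    is kept densely and old history sparsely.
    """
    newest_per_bucket = {}
    for ts in timestamps:
        age = max(now - ts, 0)
        bucket = age.bit_length()  # 0 -> 0, 1 -> 1, 2..3 -> 2, 4..7 -> 3, ...
        if bucket not in newest_per_bucket or ts > newest_per_bucket[bucket]:
            newest_per_bucket[bucket] = ts
    return sorted(newest_per_bucket.values())


# Seventeen snapshots taken at t = 0..16, pruned at t = 16.
kept = prune_snapshots(range(17), now=16)
```

With these inputs `kept` is `[0, 8, 12, 14, 15, 16]`: six snapshots cover seventeen time steps, and the count grows only logarithmically with the age of the oldest one.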

[20221210] Server Sent Events for Shopify Dashboard – Shopify’s Black Friday dashboard built with SSE

[20221126] Domain Driven Design Intro – very gentle high-level introduction that makes sense to me. Common vocab used in requirements, implementation, tests, bug reports, etc. And an explication of entities (which have a lifetime) vs values (which just exist as a number or string), and aggregate entities (e.g., an order + orderlines, where orderlines have no meaning outside an order). DDD appears to be full of guidelines like: consistency rules apply inside an aggregate, an aggregate has a single external identifier, etc.

[20221126] StackStorm – event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions and ChatOps. If This Then That for Ops. (I’m interested in every workflow engine)

[20221121] You Will Never Fix It Later – outlines the reasons why the probability a hacky solution will be fixed decreases with its age. For example: The further removed you are from the point where the code was written, the less you understand it. You remember less about what it does, what it’s supposed to do, how it does it, where it’s used, etc. If you don’t understand all its intended use cases, then you’re not confident that you can test all of them. Which means you’re worried that any change you make might break some use case you were unaware of. (yes, good tests help, but how many of us trust their test suites even when we’re not very familiar with the code?)

[20221121] The Modern Observability Problem – pithy summary. To resolve problems in a large microservice system: We must be able to find out why they have occurred; For this, our system must be observable; To be observable, a system needs to be instrumented such that the code emits telemetry, which is typically logs, traces and metrics; This telemetry must be sent to a backend that supports joining up that telemetry data and answering questions about the system’s behaviour.

[20221119] The Perfect Commit – a single commit that contains all of the following: The implementation: a single, focused change; Tests that demonstrate the implementation works; Updated documentation reflecting the change; A link to an issue thread providing further context.

[20221119] Build a Modular Monolith First — what it says on the box. [I]t is all about breaking up the system into modules, and then combining those modules into a monolith for deployment.

[20221107] Functional Testing — exercise public endpoints without mocking.

[20221107] Structured Errors in HTTP APIs — RFC-7807 provides a standardised error structure.
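For reference, a minimal sketch of building an RFC 7807 "problem details" response. The member names (`type`, `title`, `status`, `detail`, `instance`) and the `application/problem+json` media type come from the RFC. The `problem_response` helper and the example values are hypothetical.

```python
import json


def problem_response(status, title, detail=None, type_uri="about:blank",
                     instance=None, **extensions):
    """Return (status, headers, body) for an RFC 7807 error response."""
    body = {"type": type_uri, "title": title, "status": status}
    if detail is not None:
        body["detail"] = detail
    if instance is not None:
        body["instance"] = instance
    body.update(extensions)  # the RFC allows extra, problem-specific members
    headers = {"Content-Type": "application/problem+json"}
    return status, headers, json.dumps(body)


status, headers, body = problem_response(
    403,
    "You do not have enough credit.",
    detail="Your current balance is 30, but that costs 50.",
    type_uri="https://example.com/probs/out-of-credit",
    instance="/account/12345/msgs/abc",
    balance=30,
)
```

Clients can branch on the stable `type` URI rather than parsing the human-readable `title` or `detail`.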

[20221107] Recovering From Crashes with Safe Mode — when a feature flag causes your program to crash, add a safe-mode to detect the crashes and rollback to a safe feature-flag state.
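A rough sketch of the safe-mode idea, under assumptions of my own: every launch is counted as unfinished until the app reports itself healthy, and too many unfinished launches in a row switch the flags to known-safe defaults. The file name, threshold, flag name, and function names are all hypothetical and not taken from the article.

```python
import json
import tempfile
from pathlib import Path

SAFE_FLAGS = {"new_checkout": False}  # hypothetical known-good flag state
CRASH_THRESHOLD = 3                   # hypothetical tolerance before safe mode


def load_flags(state_file: Path, remote_flags: dict) -> dict:
    """Call at startup: count this launch as unfinished, then pick the flags."""
    if state_file.exists():
        state = json.loads(state_file.read_text())
    else:
        state = {"unfinished_launches": 0}
    state["unfinished_launches"] += 1
    state_file.write_text(json.dumps(state))
    if state["unfinished_launches"] > CRASH_THRESHOLD:
        return dict(SAFE_FLAGS)       # safe mode: ignore the remote flags
    return dict(remote_flags)


def mark_healthy(state_file: Path) -> None:
    """Call once the app has run long enough to be considered stable."""
    state_file.write_text(json.dumps({"unfinished_launches": 0}))


# Simulate four launches in a row that crash before calling mark_healthy().
state_file = Path(tempfile.mkdtemp()) / "launch_state.json"
remote = {"new_checkout": True}
launches = [load_flags(state_file, remote) for _ in range(4)]
```

The first three launches still get the remote flags and the fourth falls back to `SAFE_FLAGS`. A real implementation would also need to report the rollback so someone fixes the flag.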

[20221107] Code Like an SRE — it’s like getting an SRE drunk and saying “tell me all the stuff you learned the hard way”.

[20221107] Content-Defined Chunking — very readable explanation of how to incrementally backup or synchronise files without transferring data you don’t have to.
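A small sketch of the core trick, assuming a 16-byte window and CRC32 as the hash. Real implementations use a proper rolling hash (and minimum/maximum chunk sizes) rather than rehashing the window at every position; this version trades speed for clarity. A chunk boundary is placed wherever the hash of the last `WINDOW` bytes has its low bits equal to zero, so boundaries depend on nearby content rather than on byte offsets.

```python
import random
import zlib

WINDOW = 16   # bytes of context that decide whether a boundary falls here
MASK = 0x3F   # low 6 bits must be zero: a boundary roughly every 64 bytes


def chunk(data: bytes) -> list:
    """Split data at content-defined boundaries."""
    chunks, start = [], 0
    for end in range(WINDOW, len(data) + 1):
        if zlib.crc32(data[end - WINDOW:end]) & MASK == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(4096))
edited = b"!" + original             # insert one byte at the very front

old_chunks = chunk(original)
new_chunks = chunk(edited)
unchanged = set(old_chunks) & set(new_chunks)
```

With fixed-size blocks, the one-byte insertion would shift every block and nothing would match. Here only the chunk around the edit changes, so a backup or sync tool only needs to transfer that chunk.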

[20221007] Mobile Developer Experience at Slack — putting dollar values on lousy build experiences is clever.

[20221107] Demystifying Software Architecture Patterns — commonalities across Clean, Hexagonal, and Onion architectures: centralized business rules; application specific rules (operation-specific orchestration and related logic); source code dependencies should only point inward; isolation of layers. Core business rules should be independent of how you persist it, how you expose it, and which framework you’re using.

[20221102] Git Bisect with Automated Tests — use a new test to zero in on the commit that failed.
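Git automates this with `git bisect run <command>`, which marks each commit good or bad from the command's exit code. Conceptually it is a binary search over history, sketched below with a hypothetical `first_bad_commit` function and a stand-in test predicate (not git's actual implementation, which also handles non-linear history).

```python
def first_bad_commit(commits, is_bad):
    """Binary search for the first failing commit.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. is_bad(commit) runs the new test.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]


commits = [f"c{i}" for i in range(10)]
tested = []


def is_bad(commit):
    tested.append(commit)           # record which commits the test ran on
    return int(commit[1:]) >= 6     # pretend the regression landed in c6


culprit = first_bad_commit(commits, is_bad)
```

Here the search finds `c6` after running the test on only three of the ten commits, which is why writing the test first pays off on long histories.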

[20221029] Local First — developing with CRDTs and the expectation that the local machine is always true, and the truth will be reconciled with others when connected again.

[20221008] Fundamentals of Control Theory – free ebook. See also Brian Douglas’s videos on YouTube, Brian Douglas’s MATLAB videos on Control Theory, Steve Brunton’s videos, Christopher Lum’s videos, and Kat Kim’s videos.

[20221008] GruCloud – Generate Javascript code from live infrastructures; Deploy, destroy and list resources on various clouds; Share and compose infrastructure; Automatic resource dependencies management.

[20221003] Lessons from Deploying Microservices for a Large Retailer – no containers, NoSQL, or even the cloud.

[20221003] How we Store and Process Millions of Orders a Day – Grab’s architecture.

[20220922] How to Deal with Money in Software – overview of possible representations, better recommendations, and an addendum of tests.
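As a quick illustration of why representation matters (my own sketch, not necessarily the article's recommendation): binary floats cannot represent most decimal amounts exactly, so one common approach is to store integer minor units (cents) and parse with `Decimal`. The helper names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Binary floating point cannot hold 0.1 or 0.2 exactly.
float_sum_is_exact = (0.1 + 0.2 == 0.3)


def to_cents(amount: str) -> int:
    """Parse a decimal string such as '19.99' into integer cents."""
    cents = (Decimal(amount) * 100).to_integral_value(rounding=ROUND_HALF_EVEN)
    return int(cents)


def split_evenly(total_cents: int, parts: int) -> list:
    """Split an amount so the shares always add back up to the total."""
    base, remainder = divmod(total_cents, parts)
    return [base + 1 if i < remainder else base for i in range(parts)]


price = to_cents("19.99")
shares = split_evenly(to_cents("10.00"), 3)
```

`split_evenly` hands the leftover cent to the first share instead of losing it to rounding. That sum-preserving property is a good candidate for a test.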

[20220922] Webhooks.fyi – a directory of webhook providers and a collection of best practices for providing and consuming webhooks.

[20220922] Reducing Duplicate Code in our Applications Using HATEOAS – includes a nice explanation of Hypertext As The Engine of Application State.

[20220805] Bad Microservices — modularise your monolith.

[20220804] Why You Can’t Guarantee Webhook Ordering — the discussion was interesting too. Polling, queueing, events, and call-for-response all have their advocates.

[20220803] Use One Big Server — frequent topic of discussion. One database server (or cluster that lets you separate reads from writes) can get you a long way, and the hardware cost of vertical scaling beats the complexity price of distributed data.

[20220803] The Five Ways to Deploy Microservices – the five ways of running microservices: (1) Single machine, multiple processes: buy or rent a server and run the microservices as processes. (2) Multiple machines, multiple processes: the obvious next step is adding more servers and distributing the load, offering more scalability and availability. (3) Containers: packaging the microservices inside a container makes it easier to deploy and run along with other services. It’s also the first step towards Kubernetes. (4) Orchestrator: orchestrators such as Kubernetes or Nomad are complete platforms designed to run thousands of containers simultaneously. (5) Serverless: serverless allows us to forget about processes, containers, and servers, and run code directly in the cloud.

[20220728] You Don’t Need a Monolith – lots of arguments against, mostly “you get all this power and complexity but you probably don’t need or want it.”

[20220719] Improve Your Application Testing – Use template system allocation scripts; Automate resource allocation and deallocation; Perform regular refreshes of test data.

[20220718] 14 Software Architecture Patterns – circuit breaker, client-server, command query responsibility segregation, controller-responder, event sourcing, layered, microservices, model-view-controller, pub-sub, saga, sharding, static content hosting, strangler, throttling.

[20220718] 12 Ways to Prepare Your Monolith for Microservices – Ensure you know what you’re getting into; Make a plan; Put everything in a monorepo; Use a shared CI pipeline; Ensure you have enough testing; Install an API Gateway or HTTP Reverse Proxy; Consider the monolith-in-a-box pattern; Warm up to changes; Use feature flags; Modularize the monolith; Decouple the data; Add observability.

[20220624] Serverless Reference Architectures — for AWS but will work in general.

[20220624] Serverless Microservices Patterns for AWS – The following 19 patterns represent several common microservice designs that are being used by developers on AWS.

[20220624] 7 Ways to Fail at Microservices — distributed != decoupled. (A conference talk, on YouTube)

[20220624] Infrastructure Efficiency — what to worry about at each stage of startup life.

[20220529] Learnings from 5 Years of Tech Startup Code Audits – simple outlasted smart; never deserialise untrusted data; flaws in business logic are guaranteed to affect the business; Almost no one got JWT tokens and webhooks right on the first try; Our highest impact findings would always come within the first and last few hours of the audit. (discussion)

[20220527] Refactoring and Optimizing a High-Traffic API at PayPal – identifying bottlenecks, segmenting, etc.

[20220526] Advice from 10 Years of Software Engineering – good advice. I particularly liked: Be a caretaker, rather than an owner; Write code specifically for the problem at hand, but try to spot places where you can afford to make it a little generic; Defining what is “done” is time-saving because it helps you estimate the effort required, plan for development, and avoid unnecessary revisions later; A single large release may be divided into a series of lower-risk well-understood rollouts; Coordinate reviews for the design doc and compare the design as it evolves with the original doc to verify that all the relevant constraints are being addressed.

[20220526] How We Deploy to Production Over 100 Times/Day – optimise the developer workflow for rapid delivery, and this leads to a reduction in risk too; Small revertable changes; change management process is surprisingly light on human touch points; invest heavily in out-of-the-box guardrails, monitoring, and auditing; We associate “tiers” with each service (how critical they are) and warn when PRs make changes to critical ones. This helps engineers get a sense of the relative risk of a change; We bias towards a simple and opinionated user experience, over configurability; The workflow is consistent and familiar; An unintended side effect of reducing friction from the deployment process is we see too many changes being deployed to staging for testing. This increases the number of manual steps engineers take in order to test out changes.

[20220524] Real World Event-Driven Architecture – examples of solving problems with events.

[20220524] The Balance has Shifted Away from SPAs – for simple things. Some scenarios still need SPAs.

[20220524] Application Holotypes – short on enterprise code, but still interesting.

[20220524] Quick Fixes to your Code Review Workflow – Simplify the decision of when to review; Distinguish comments from requests; Use pair reviews; Use more and more junior reviewers; Teach people how to do code review; Designated second reviewers; Decide what review is actually for; Adopt a collaborative attitude; Actually figure out what’s wrong with your code review.

[20220524] Simple Software Things That Are Actually Very Complicated – nothing is easy! If the browser – or in other environments, the operating system – can do something for you, chances are you definitely want to let it handle that for you. Otherwise you’ll learn the hard way just how much work goes in to it.

[20220524] The Tyranny of ‘What if it Changes?’ – it’s speculation, divorced from reality, and used as a conversation ender.

[20220519] Stop Aggregating Away the Signal in Your Data – discusses three alternatives to aggregation: Rearranging the data to compare “like to like”; Augmenting the data with concepts that matter, like “summer” vs. “winter” or data-defined categories like “high” or “normal” energy usage; Using the data itself as context by splitting the data into “foreground” and “background,” so the full dataset provides the context necessary to make sense of the specific subset of the data we’re interested in.

[20220504] Philosophy of Software Design – summary of Ousterhout’s book. VERY good.

[20220504] Web Release at Massive Scale – Facebook employees dogfood Facebook changes, which functions as QA.

[20220504] Spotify System Architecture – once over lightly, but it’s got all the rectangles in the architecture diagram.

[20220504] Guide to Protecting Your APIs with OAuth2 – that’s part 1, see also Part 2 The Authorization Code grant (in excruciating detail).

[20220504] Max Connections – For a server listening on a port, each incoming connection DOES NOT consume a port on the server. The server only consumes the one port that it is listening on.
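This is easy to check on loopback with the standard `socket` module. The sketch below opens two client connections to one listening socket and compares port numbers. Each connection is distinguished by its client address and port, not by a fresh server port.

```python
import socket

# Listen on an OS-assigned free port on localhost.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen()
listening_port = server.getsockname()[1]

# Two independent clients connect to the same server port.
clients = [socket.create_connection(("127.0.0.1", listening_port))
           for _ in range(2)]
accepted = [server.accept()[0] for _ in range(2)]

# Local port of each accepted (server-side) socket, and of each client.
server_side_ports = [conn.getsockname()[1] for conn in accepted]
client_side_ports = [client.getsockname()[1] for client in clients]

for sock in clients + accepted + [server]:
    sock.close()
```

Both accepted sockets report the listening port as their local port, while the two clients each got their own ephemeral port.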

[20220504] Design for Scale and High Availability – a look at Google’s infrastructure, with insights into team composition and how they view incidents.

[20220502] 12 Factor Apps in 2022 – We end up with an “app” level and a “service” level and we have to be careful when considering the original text, since it talks only about the app level, but some of its concerns now reside at the service level.

[20220502] Distributed Systems Shibboleths – nice explanation of the building blocks of distributed systems, with a smidge more meat than most overviews.

[20220421] The Big TDD Misunderstanding – Originally the unit in “unit test” did not refer to the system under test but the test itself. Meaning the test can be executed as one unit and does not depend on other tests to run upfront. Huh.

[20220421] Microservices Observability Design Patterns – Health check API; Log aggregation; Distributed tracing; Exception tracking; Application metrics; Audit logging.

[20220421] 10 REST Commandments – (1) use JSON; (2) use the right HTTP methods; (3) use nouns instead of verbs, plural instead of singular; (4) use SSL, use authorisation, don’t send data you don’t need, use an API gateway to block DDoS etc; (5) version your API; (6) be consistent; (7) use HTTP status codes, and a universal error message structure; (8) handle complex processes inside the API, not requiring the cooperation of the client; (9) optimise database, cache, send only data that’s needed, enable compression; (10) document and otherwise help new users of the API.

[20220421] When Feature Flags Do and Do Not Make Sense – Feature flags are a poor man’s alternative to binary rollbacks, and they definitely aren’t a substitute for having a great automated test suite and a robust QA process. And each feature flag immediately doubles the universe of corner cases that your programmers have to understand, and your code is required to handle.

[20220421] Narrow Waists – narrow waists are interop systems that let N systems talk to M systems without NxM translation systems being necessary. In distributed systems, types are local illusions. Big systems can’t be upgraded atomically. They’re often running a mix of inconsistent schema versions. When models and reality collide, reality wins.

[20220421] Simple Architectures – The cost of our engineering team completely dominates the cost of the systems we operate. […] By keeping our application architecture as simple as possible, we can spend our complexity (and headcount) budget in places where there’s complexity that it benefits our business to take on.

[20220406] Cohesion and Coupling with Examples – very good.

[20220404] Authorisation in a Microservices World – Fine-grained authorization in microservices is hard. Definitely not impossible, but hard. You would expect that a more standardized, all-around, full-proof solution is out there, but I am afraid there isn’t. It’s a complex matter and depending on what you are building, implementation varies.

[20220404] BBC’s Year of Serverless – no outages caused by the serverless platform. They did notice noisy neighbours (other customers on the VM) reducing BBC’s performance on the hour and quarter hour.

[20220404] Networking at Slack Retrospective – again I say: networking is HARD. To make matters worse, applications interpret these variables differently too. For example, Ruby and Golang support CIDR blocks in the `no_proxy` variable while Python does not.

[20220404] On Scalable Software – a mini-bootcamp on scaling with distributed systems. Also: If we know that the workload is going to increase, we should go for a scalable design up-front. But if we are not sure of what the future looks like, a performant but non-scalable solution is a good enough starting point.

[20220404] Distributed Ingestion at Sentry – they build a CDN but to accept uploads, not cache downloads. This caught my eye: Have you ever seen unexpected 50X responses from your reverse proxy servers with no obvious explanation? You might be a victim of inconsistent idle timeouts between your upstream and downstream applications.

[20220330] Enterprise Platform API Governance – common sense behind the big words:

  • Info: Ensure that there are title, description, and other essential information properties.
  • Versioning: Require a standard semantic or date-based versioning applied to each API.
  • Operations: Make sure each individual operation has a summary, description, and id.
  • Parameters: Standardize the format of parameter names, and all have descriptions.
  • Responses: Push for a common set of status codes, media types, and schema responses.
  • Schema: Standardize all request and response schema using JSON Schema components.

[20220330] Using HTTPS in your Development Environment – what it says on the box. More nodey than applies to us.

[20220329] Penny Wise and Cloud Foolish — deep dive into Google Cloud pricing.

[20220324] How To for the Microsoft Authentication Platform – including delegated and app permissions.

[20220323] Guidance for Architecting Mission Critical Apps on Azure – on the one hand this is good architecture, on the other hand it’s All The Expensive Toys.

[20220323] Bash Best Practices – quoting, readability, tests, don’t do this, how to debug.

[20220323] Bash Pitfalls – common mistakes from the real world.

[20220323] Defaults Matter – “Even the most well-intentioned people have a finite amount of energy to fight against a system that pushes toward bad outcomes.”

[20220322] Continuous Architecture – (1) Architect products; evolve from projects to products; (2) Focus on quality attributes, not on functional requirements; (3) Delay design decisions until they are absolutely necessary; (4) Architect for change—leverage the “power of small.” Big, monolithic, tightly coupled components are hard to change; (5) Architect for build, test, deploy, and operate; (6) Model the organization of your teams after the design of the system you are working on.

[20220321] Detecting Silent Errors – a mix of opportunistic and ripple testing. Opportunistic = take a long time while the server is out of operation for other reasons. Ripple = inject smaller known workload with known outputs regularly, rotating around the fleet.

[20220315] API Debugging Tips – isolate the API issue, examine the HTTP response code, then dig into the data causing it.

[20220311] Google’s API Design Guide — a general design guide for networked APIs. It has been used inside Google since 2014 and is the guide that Google follows when designing Cloud APIs and other Google APIs. This design guide is shared here to inform outside developers and to make it easier for us all to work together.

[20220311] API Improvement Proposals — Focused design documents for flexible API development.

[20220310] Eventual Consistency – how it happens with clusters, event sourcing, and async processing. And how to handle it: server wait, client polling, push to client, read from primary after you write.

[20220309] Reproducible Builds – Repeatable Builds + Immutable Environments + Source Availability

[20220309] One Way Smart Developers Make Bad Strategic Decisions

[20220308] Evolve APIs — use versioned endpoints, and something like APISIX as a router to make old endpoints work.

[20220308] The 10 REST Commandments — some good advice.

[20220308] Payment Orchestration at AirBnB — how they rearchitected payments as microservices. Every major workflow is divided into a directed acyclic graph (DAG) of retryable idempotent steps, each with well-defined behavior. This allows the payment orchestration layer to maintain eventual consistency with other key services (such as the payment gateway layer and product fulfillment services). This approach has led to five 9s (99.999%) of consistency for payments.
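To illustrate the shape of "a DAG of retryable idempotent steps" (a toy sketch, not Airbnb's implementation), the code below runs steps in dependency order with the standard-library `graphlib.TopologicalSorter`, retries failures, and skips steps already recorded as completed. The step names, the in-memory `completed` set standing in for durable storage, and `run_workflow` itself are all hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def run_workflow(steps, dependencies, completed, max_attempts=3):
    """Run each step after its prerequisites, retrying transient failures.

    steps: name -> callable; dependencies: name -> set of prerequisite names;
    completed: set of finished step names (would be persisted in real life).
    """
    for name in TopologicalSorter(dependencies).static_order():
        if name in completed:
            continue  # already done on an earlier run, so do not repeat it
        for attempt in range(1, max_attempts + 1):
            try:
                steps[name]()
                completed.add(name)
                break
            except Exception:
                if attempt == max_attempts:
                    raise  # give up; a later rerun resumes from this step


calls = []
charge_attempts = {"count": 0}


def charge():
    charge_attempts["count"] += 1
    if charge_attempts["count"] == 1:
        raise ConnectionError("gateway timeout")  # simulated transient error
    calls.append("charge")


steps = {
    "authorize": lambda: calls.append("authorize"),
    "charge": charge,
    "fulfill": lambda: calls.append("fulfill"),
}
dependencies = {"authorize": set(), "charge": {"authorize"}, "fulfill": {"charge"}}

completed = set()
run_workflow(steps, dependencies, completed)
run_workflow(steps, dependencies, completed)  # rerun: every step is skipped
```

The second call does nothing because every step is already recorded, which is what makes it safe to blindly re-drive a workflow after a crash.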

[20220308] Pair Programming Antipatterns – not “you understand this, right?” but “which part of this is hardest to follow?”.

[20220306] Making CRDTs Byzantine Fault Tolerant – CRDTs let many nodes get the same data in distributed systems. But what if a number of adversarial nodes join the network and flood it with bad data? This paper offers a solution, making them Byzantine-fault-tolerant.

[20220306] Resumption Strategies After Interrupted Programming Tasks — return to the last method edited; navigate to remember; task tracking; review source code history. Some facts about interruptions: 57% of tasks are interrupted, 40% of interrupted tasks are not resumed; developers typically require about 15m to recover from an interruption.

[20220303] WASI – WASM at the edges (of the middle), from Fastly. 

[20220224] Server Sent Events, WebSockets, and HTTP – a look at all three ways of doing pub/sub over the web.

[20220220] Open Telemetry — use it, it’s an upfront cost in complexity that pays off by giving you lots of tools to choose from.

[20220219] Pushing the 100k Button – AWS S3 storage balloons, and using policy to expire content isn’t easy. HN comments were interesting: GDPR gives a hammer against PMs who want to indefinitely retain data.

[20220219] Comparing Git Workflows – centralized workflow, feature branch, gitflow, forking.

[20220219] Open Policy Agent – OPA is an open-source project that implements a single policy language and policy engine that can be applied to solve policy and authorization problems at every layer of the cloud-native stack. It uses a purpose-built declarative language, Rego, to express sophisticated logic over complex hierarchical data structures.

[20220219] Deterministic Lockstep and Secret State – game networking might have something to teach us about our distributed app state problem.

[20220218] My Favourite Debugging Hack – printing a variable with line number and filename.
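In Python the same hack can be sketched with `inspect`, reading the caller's file name and line number from its stack frame. The `dbg` helper below is a hypothetical name, and the linked post may well use a different language.

```python
import inspect
import os


def dbg(label, value):
    """Print 'file:line label = value' for the line that called dbg()."""
    caller = inspect.currentframe().f_back
    filename = os.path.basename(caller.f_code.co_filename)
    line = f"{filename}:{caller.f_lineno} {label} = {value!r}"
    print(line)
    return line


total = 40 + 2
message = dbg("total", total)
```

Returning the formatted line as well as printing it makes the helper easy to test and to redirect into a logger later.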

[20220217] Building for 99% Developers — accept slow migrations; accept legacy subsystems. Most companies don’t have a FAANG’s technical needs nor number of developers. And most places that talk up their shiny new shit have big mounds of rusty old shit too.

[20220216] The Absolute Minimum That Every Developer Should Know About Unicode and Character Sets – what it says on the tin.

[20220216] Test Data Management – interesting description of the problem of managing data in end-to-end tests, and the open source system they built to manage it.

[20220213] Server-Sent Events instead of Websockets — and from a discussion: it’s a beautifully simple & elegant lightweight push events option that works over standard HTTP, the main gotcha for maintaining long-lived connections is that server/clients should implement their own heartbeat to be able to detect & auto reconnect failed connections which was the only reliable way we’ve found to detect & resolve broken connections.
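For a feel of how little there is to the wire format, here is a sketch of formatting SSE messages and a heartbeat. The `id:`, `event:`, and `data:` fields and the blank-line terminator are from the SSE specification, as are comment lines starting with `:`. The helper names and the "None means an idle interval" convention are my own assumptions.

```python
def sse_event(data, event=None, event_id=None):
    """Format one Server-Sent Event; multi-line data becomes several data: lines."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    lines.extend(f"data: {part}" for part in data.split("\n"))
    return "\n".join(lines) + "\n\n"  # a blank line ends the event


# A comment line: clients ignore it, but it keeps the connection active.
HEARTBEAT = ": keep-alive\n\n"


def sse_stream(ticks):
    """Yield wire text; a None tick means an idle interval, so send a heartbeat."""
    for tick in ticks:
        yield HEARTBEAT if tick is None else sse_event(tick)


wire = list(sse_stream(["hello", None, "line one\nline two"]))
```

Browsers do not expose comment lines to page code, so if the client itself must notice a dead connection, the heartbeat would need to be a real named event with a client-side timeout instead.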

[20220213] The Kobayashi Maru of Comparing Dates with Times – if you’re ever saying “is 2022-02-12 < 2022-02-12 03:00” then you are doing it wrong. But you will probably have to at some point. And there’s no right answer.
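Python refuses the mixed comparison outright, which forces you to say what you mean by a bare date. The sketch below shows two defensible readings, the date as the instant the day starts or as the instant it ends, and they give opposite answers. The helper names are hypothetical.

```python
from datetime import date, datetime, time, timedelta

day = date(2022, 2, 12)
moment = datetime(2022, 2, 12, 3, 0)

# Ordering a date against a datetime is a TypeError in Python.
try:
    day < moment
    mixed_comparison = "allowed"
except TypeError:
    mixed_comparison = "TypeError"


def day_start(d):
    """Reading 1: a date means midnight at the start of that day."""
    return datetime.combine(d, time.min)


def day_end(d):
    """Reading 2: a date means the whole day, which ends at the next midnight."""
    return datetime.combine(d + timedelta(days=1), time.min)


before_if_start = day_start(day) < moment   # the day began before 03:00
before_if_whole = day_end(day) <= moment    # the day had not finished by 03:00
```

`before_if_start` is `True` and `before_if_whole` is `False`, which is the "no right answer" problem in two lines.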

[20220213] A Decade of Major Cache Incidents at Twitter – amazingly transparent incident information. Painful how many of the problems happen below the application level.

[20220209] Simplifying Your Codebase When Using Feature Flags — Remove old feature flags as soon as possible; Your flag’s “off” position should be stable; Feature flags that aren’t temporary aren’t feature flags; Test one thing at a time; Create duplicates of components for easier removal.

[20220209] Tough Questions to Ask When Making Decisions — is it reversible, does it paint us into a corner?

[20220206] Five Design Patterns for Building Observable Services — Outside-In Health Check, Inside-Out Health Check, Real time Alerting, Troubleshooting.

[20220205] Don’t replace X with Y unless you expect 10x improvement — “so cvs to git, but not MySQL to PostgreSQL” (Charity Majors rule)

[20220205] Applying Event-Driven Architecture in Digital Transformation Projects – more detail than is usually shown.

[20220203] Testing Distributed Systems – papers, books, apps, resources. (tl;dr – is hard)

[20220203] Creating the Conditions for Developer Happiness – avoid handoffs of work between teams, minimise “time to login screen” ie startup time for new developer. Most development tasks are small enough that developers can quickly and continuously flow from small unit tests to completed code and on to the next task. Feedback cycles on the code are quick. Unit tests can cover small areas of the code while still providing value such that it’s rare that a developer needs to use a debugger to understand and solve problems.

[20220203] Event Sourcing, and Event Driven Architecture – two different things. Event Sourcing = saving your data as events not current state; Event Driven Architecture = building system around messages.
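A tiny sketch of the event-sourcing half: the stored data is the list of events, and current state is derived by folding over them. The account example and event names are invented for illustration.

```python
from functools import reduce


def apply_event(balance, event):
    """Pure function: previous state plus one event gives the next state."""
    kind, amount = event
    if kind == "deposited":
        return balance + amount
    if kind == "withdrawn":
        return balance - amount
    raise ValueError(f"unknown event type: {kind}")


# The event log is the source of truth; no 'current balance' column is stored.
events = [("deposited", 100), ("withdrawn", 30), ("deposited", 5)]

balance_now = reduce(apply_event, events, 0)
balance_after_two_events = reduce(apply_event, events[:2], 0)
```

Replaying a prefix of the log gives the state at any earlier point, something a store that only keeps current state cannot reconstruct.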

[20220203] Modern Identity Onboarding Essentials – tradeoffs of where personal info is kept, account linking, etc.

[20220203] Domain Driven Refactoring – walks us through writing simple procedural code and then pulling out services, collections, etc.

[20220202] Guide to Building Context Menus – interesting look at UI design.

[20220201] Boring Technology Checklist – has more detailed checkboxes under familiarity, stability, reliability, well-understood limits & trade-offs.

[20220201] Tips on Prioritising Technical Debt – “boy scout rule” = leave codebases and systems in better shape than you found them, so refactor when you touch a part of the codebase.

[20220131] Technical Debt Gets Worse Before It Gets Better — The siren song of the ground-up rewrite calls to us, trying to make us forget the fact that it’s much harder to replace an existing system than it is to create one from scratch. You have to keep supporting old APIs until every client has moved to the new APIs. Existing data has to be migrated to new models. You have to research (and often replicate) the quirks and undocumented features of the existing system. 

[20220131] What’s in a Good Error Message? — context, the error itself, and mitigation.

[20220131] Feature Flags — develop in main, deployment and release are different.

[20220131] The Legacy Mimic pattern — assumes you know what your interface IS, of course …

[20220128] The baseline for web development in 2022 – low-spec Android devices in terms of performance, Safari from two years before in terms of Web Standards, and 4G in terms of networks. The web in general is not answering those needs properly, especially in terms of performance where factors such as an over-dependence on JavaScript are hindering our sites’ performance.

[20220128] Automerge – conflict-free replicated data types in Javascript for building distributed apps, which lets you have local-first software. Not suitable for banks.

[20220128] Foundations of Computer Science – book has been taken out of print. Nothing much has changed since it was published in 1992.

[20220128] My Goal is to Ship – 25 years of software engineering. “Even with years of newfound tact, what frustrates me profoundly are projects that sit on shelves. I’ve learned to push them to a conclusion, negotiating their way out the door. If I sensed a project was permanently stuck, I’d leave it behind, despite others who chip away at it.”

[20220128] Boki – new serverless runtime that exports a shared log API to serverless functions. “The key enabler is the metalog, a novel mechanism that allows Boki to address ordering, consistency, and fault-tolerance independently.”

[20220201] Tech debt at the scale of a super app

[20220201] Next-auth.js – built-in support for Google etc., BYO backend, built-in email/magic link etc., works with cookies

[20220201] Extend-Only Design

  • breaking changes have one thing in common: something that existed in the past was irrecoverably altered in a future release, and thus broke the downstream applications and libraries that depended upon that prior functionality.
  • The pillars of extend-only design apply to any and all software systems, and they are thus:
    • Previous functionality, schema, or behavior is immutable and not open for modification. Anything you made available as a public release lives on with its current behavior, API, and definitions and isn’t able to be changed.
    • New functionality, schema, or behavior can be introduced through new constructs only and ideally those should be opt-in.
    • Old functionality can only be removed after a long period of time and that’s measured in years.
  • Extend-only design requires some more planning and perhaps some more flexibility on the application-programming side of things
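
A minimal sketch of the pillars above, assuming a hypothetical price-parsing library (both function names are invented for illustration): the released function is left exactly as it was, and the new behaviour arrives as a separate, opt-in construct.

```python
from decimal import Decimal


def parse_price(text: str) -> float:
    """Original public API. Once released, its behaviour is frozen."""
    return float(text.strip().lstrip("$"))


def parse_price_decimal(text: str) -> Decimal:
    """New, opt-in construct added alongside the old one.

    Callers who want exact decimal arithmetic switch to this function;
    callers of parse_price() see no change at all.
    """
    return Decimal(text.strip().lstrip("$"))
```

The cost is a larger API surface, which is the extra planning the last bullet refers to.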

[20220201] Why our team cancelled our move to microservices

  • Were heavily reliant on a third party, which limited their ability to divide into microservices.
  • Couldn’t sufficiently isolate the microservices. “We couldn’t identify any obvious candidates in our monolith to be broken out into a microservice. So instead, we started drawing arbitrary lines between our domain models, and from this, we had the list of microservices we were to create. However, once we started investigating, we found a lot of shared business logic and implicit coupling between the soon to be separate microservice domains. Some further attempts were made to subdivide these microservices into smaller and smaller pieces, but that left us with even more coupling, messages buses everywhere, and a potential big bang of immediately going from one service to ten or more microservices.”
  • Couldn’t assign ownership of any potential microservice to any team, because all teams touched all microservices for their features.
  • Platform wasn’t ready.
  • It wasn’t uncommon for requirements to change mid feature. This uncertainty made creating microservices more fraught, as we couldn’t predict what new links would pop up, even in the short term.
  • We had a tiny window, just large enough to split our monolith into the list of microservices we had been given. What we didn’t have was any extra time to allow us to reflect on what we had created or alter course if required.
  • None of the people responsible for architecting or implementing the microservices architecture had any specific prior experience.
  • We didn’t have a list of our pain points, and we had no clear understanding of how this would help solve any pain points we do have.
  • Adopting microservices isn’t free. There is a vast list of additional concerns that you need to address. We would need to revisit many concerns that we had previously addressed in our monolith. For example, we would need to address or revisit: logging, monitoring, exception handling, fault tolerance, fallbacks, microservice to microservice communication, message formats, containerization, service discovery, backups, telemetry, alerts, tracing, build pipelines, release pipelines, tooling, sharing infrastructure code, documentation, scaling, timezone support, staged rollouts, API versioning, network latency, health checks, load balancing, CDC testing, fault tolerance, debugging and developing multiple microservices in our local development environment.
  • We were using monolith like a loaded term. As if saying “monolith” implies something terrible, and “microservices” implies something good. Once we looked past the stereotypes and branding, the development team had very few issues with our “monolith.”
  • The more we looked into microservices, the more it seemed that it was less about technology and more about structuring teams and the work that came into them. Had we made a mistake approaching microservices as purely a technology problem?

[20220201] The Art of Immutable Architecture: Theory and Practice of Data Management in Distributed Systems – “Most software components focus on the state of objects. They store the current state of a row in a relational database. They track changes to state over time, making several basic assumptions: there is a single latest version of each object, the state of an object changes sequentially, and a system of record exists. This is a challenge when it comes to building distributed systems. Whether dealing with autonomous microservices or disconnected mobile apps, many of the problems we try to solve come down to synchronizing an ever-changing state between isolated components. Distributed systems would be a lot easier to build if objects could not change.” … $108 to buy!!!

[20220201] Store SQLite in Cloudflare Durable Objects – and HN discussion

[20220201] Textbook for system design (WIP)

[20220201] System Design Primer (for web-scale systems)

[20220201] Self-hosted architecture

[20220201] Give me events not webhooks

[20220201] Server Sent Events

[20220201] Learn how to document APIs

[20220201] UUIDs bad for MySQL performance

[20220201] API Design Patterns book

[20220201] Hoppscotch API development

[20220201] The API Book (github)

[20220201] Flutter’s shortcoming as a cross-platform framework

[20220201] Common Infrastructure Errors

[20220201] Best Practices can slow your application down

[20220201] Things your CS degree didn’t prepare you for but which you must do as a developer (Twitter)

[20220201] Data Grid Performance Comparison

[20220201] Model View Update

[20220201] single-tenancy vs multi-tenancy discussion on HN

[20220201] Snowflakes Servers and Snowflakes as Code

[20220201] Microservices lessons

[20220201] Shopify’s Monolith

[20220201] Interface Segregation Principle

[20220201] 5 Tips for API Design Reviews

[20220201] API Lifecycle Blueprint

[20220201] Service Locator is not an Antipattern

[20220201] REST API Design

[20220201] 8 Simple Tips for using an API in a mobile application – from SyncFusion

  1. Check internet connectivity before an API call
  2. API versioning
  3. Show loading/progress until API responds
  4. Use the status code to display a meaningful error message on failures.
  5. Paginate and lazy load
  6. Code for the case where a logged-in user’s session expires.
  7. Plan for forced updates.
  8. Send network type in API request.
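
Tips 4, 6 and 7 can be sketched as a single lookup from status code to user-facing message. This is an illustration, not SyncFusion's code: the function name and the message wording are made up, and using 426 (Upgrade Required) to signal a forced app update is an assumption about how a backend might do it.

```python
# Hypothetical mapping; a real app would localise these strings.
FRIENDLY_MESSAGES = {
    401: "Your session has expired. Please sign in again.",  # tip 6
    426: "Please update the app to continue.",               # tip 7 (assumed convention)
    429: "Too many requests. Try again in a moment.",
}


def message_for_status(status: int) -> str:
    """Return text to show the user, or "" when the call succeeded."""
    if 200 <= status < 300:
        return ""
    if status in FRIENDLY_MESSAGES:
        return FRIENDLY_MESSAGES[status]
    if 500 <= status < 600:
        return "Something went wrong on our side. Please try again later."
    return "The request could not be completed."
```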

[20220201] Consider SQLite – why and how to use SQLite in real situations.

  • It’s only when you start needing significantly more than 99.999% uptime that you start to need truly distributed systems
  • Litestream and Verneuil replicate your SQLite database to other machines.
  • There’s no per-request client/server roundtrip latency
  • But …
    • The type system is bad
    • Support for migrations is worse
    • Significantly harder to geoshard your app
    • Not well supported by some web frameworks
    • Heavy computation or a slow language can make it hard to stay on a single server
    • A single thread fails differently from a connection pool, so architect for it.
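
Two of the points above are easy to check with Python's built-in `sqlite3` module. The sketch below (helper names are mine, not from the article) works out how much downtime per year "five nines" actually allows, and shows the flexible typing behind the type-system complaint: in an ordinary, non-STRICT table, an INTEGER column will store a string that doesn't look like a number.

```python
import sqlite3


def allowed_downtime_minutes_per_year(uptime: float) -> float:
    """Minutes of downtime per 365-day year permitted by an uptime fraction."""
    return (1 - uptime) * 365 * 24 * 60


def demo_flexible_typing():
    """Insert an int and a string into an INTEGER column; return what was stored."""
    conn = sqlite3.connect(":memory:")  # in-process: no client/server roundtrip
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, qty INTEGER)")
    conn.execute("INSERT INTO items (qty) VALUES (?)", (3,))
    conn.execute("INSERT INTO items (qty) VALUES (?)", ("three",))
    rows = conn.execute(
        "SELECT qty, typeof(qty) FROM items ORDER BY id"
    ).fetchall()
    conn.close()
    return rows
```

`allowed_downtime_minutes_per_year(0.99999)` comes to roughly 5.3 minutes, and the second row from `demo_flexible_typing()` is stored as text rather than rejected.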

[20220201] Abstractions in Software Development

  • Good explanation of the SOLID principles
  • Stable Dependencies Principle = a package should only depend on packages that are more stable than it is.
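
One common way to make "more stable" measurable is Robert Martin's instability metric, I = Ce / (Ca + Ce), where Ce counts outgoing dependencies and Ca counts incoming ones; a lower I means more stable. The sketch below assumes a dependency graph given as a dict of sets, and the function names are invented for illustration.

```python
def instability(deps):
    """Return {package: I} with I = Ce / (Ca + Ce); 0.0 for isolated packages."""
    packages = set(deps) | {d for targets in deps.values() for d in targets}
    scores = {}
    for pkg in packages:
        ce = len(deps.get(pkg, ()))                                 # outgoing
        ca = sum(1 for targets in deps.values() if pkg in targets)  # incoming
        scores[pkg] = ce / (ca + ce) if (ca + ce) else 0.0
    return scores


def sdp_violations(deps):
    """List (dependent, dependency) edges that point at a less stable package."""
    scores = instability(deps)
    return sorted(
        (src, dst)
        for src, targets in deps.items()
        for dst in targets
        if scores[dst] > scores[src]
    )
```

For example, with `web` depending on `billing` and `core`, and `billing` depending on `core`, there are no violations; making `core` depend on `web` introduces one.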

[20220201] Liskov Substitution Principle – subtypes must be substitutable for their base types.
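
The textbook illustration is the rectangle/square pair. This is a sketch of that well-known example rather than anything taken from the linked article: `Square` is a subtype that looks reasonable but quietly breaks an expectation that callers of `Rectangle` rely on.

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def set_width(self, width):
        self.width = width

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)

    def set_width(self, width):
        # Keeps the square square, but changes height as a side effect.
        self.width = width
        self.height = width


def height_survives_widening(rect, new_width):
    """Code written against Rectangle: expects set_width to leave height alone."""
    old_height = rect.height
    rect.set_width(new_width)
    return rect.height == old_height
```

`height_survives_widening` holds for a `Rectangle` but not for a `Square`, so `Square` is not substitutable for its base type.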

[20220201] Challenging the Pull Request orthodoxy – Ship vs Show vs Ask

[20220201] Microservice Strangler Pattern

[20220201] Batching and Caching with Dataloader – GraphQL way to avoid the N+1 problem (you make one query to determine which things to show and then, for each result in that query, make another query to fetch the information about that thing)
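
The difference is easy to see with a fake database that counts queries. This is a Python sketch of the batching idea only, not the DataLoader API (which also caches per request and collects keys automatically); `FakeDB` and both helper functions are made up.

```python
class FakeDB:
    """Stand-in for a database that counts how many queries it receives."""

    def __init__(self, authors):
        self.authors = authors  # author id -> name
        self.queries = 0

    def get_author(self, author_id):
        self.queries += 1
        return self.authors[author_id]

    def get_authors(self, author_ids):
        self.queries += 1  # one query, e.g. WHERE id IN (...)
        return {i: self.authors[i] for i in author_ids}


def author_names_naive(db, posts):
    # One lookup per post: the "N" in N+1 (the "+1" fetched the posts).
    return [db.get_author(p["author_id"]) for p in posts]


def author_names_batched(db, posts):
    # Collect the distinct keys first, then fetch them all at once.
    ids = {p["author_id"] for p in posts}
    by_id = db.get_authors(ids)
    return [by_id[p["author_id"]] for p in posts]
```

With three posts by two authors, the naive version issues three author queries and the batched version issues one.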

[20220201] Why do you need to move from CRUD to Event Sourcing? – ES seems optimised for “write a lot, read rarely” because you need to replay events to get to current state. Is there a hybrid? Ah yes, there is. It’s the CQRS pattern – maintain current state in a reporting database and preserve events in the writing database. I wonder how multistep transactions work – are they events handled atomically by the processing code?
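
A toy version of the hybrid, as I understand it: an append-only event log on the write side and a dict of current balances on the read side. All names here are invented, and updating both in one call is a simplification; real systems often update the read model asynchronously.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    account: str
    kind: str    # "deposited" or "withdrew"
    amount: int


def apply_event(balance, event):
    """Fold a single event into a running balance."""
    if event.kind == "deposited":
        return balance + event.amount
    return balance - event.amount


@dataclass
class Ledger:
    events: list = field(default_factory=list)    # write side: append-only log
    balances: dict = field(default_factory=dict)  # read side: current state

    def append(self, event):
        self.events.append(event)
        current = self.balances.get(event.account, 0)
        self.balances[event.account] = apply_event(current, event)

    def replay(self, account):
        """Rebuild current state from scratch: the slow path that CQRS avoids."""
        balance = 0
        for event in self.events:
            if event.account == account:
                balance = apply_event(balance, event)
        return balance
```

Reads hit `balances` directly, and `replay` should always agree with it, which makes a handy consistency check.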
