This is my reading list from the past few days. I decided to put it here as it might be helpful to someone else. It was deeply inspired by the HighScalability blog, a source I’ve been consuming for years.
Microsoft all over the place
Microsoft keeps pushing to become a major player in the Open Source community. Let's take a look at the majestic presence they have had in the media recently.
At Microsoft, 47,000 developers generate nearly 30 thousand bugs a month. These items get stored across over 100 AzureDevOps and GitHub repositories. To better label and prioritize bugs at that scale, we couldn’t just apply more people to the problem.
Microsoft was on the wrong side of history when open source exploded at the beginning of the century, and I can say that about me personally.
A case study will be written on how Microsoft allowed Zoom to eat their lunch. They spent millions on subterfuge trying to paint Slack as an inferior enemy when MSFT Teams actually can't do what Slack does and Teams' real competitor was Zoom. Now Zoom has 300M Daily Users. Lol.
Rust/WinRT lets you call any WinRT API past, present, and future using code generated on the fly directly from the metadata describing the API and right into your Rust package where you can call them as if they were just another Rust module.
Rust on the Radar
While we're on the subject of Rust, it seems it's not only Microsoft investing time and effort in it.
There are many benefits a standardized ABI would bring to Rust. A stable ABI enables dynamic linking between Rust crates, which would allow for Rust programs to support dynamically loaded plugins (a feature common in C/C++). Dynamic linking would result in shorter compile-times and lower disk-space use for projects, as multiple projects could link to the same dylib. For example, imagine having multiple CLIs all link to the same core library crate.
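To make the plugin point concrete: today this kind of dynamic loading typically goes through the C ABI, precisely because Rust-to-Rust ABI is unstable — which is exactly the gap a standardized ABI would close. A minimal sketch using the `libloading` crate (the crate name, symbol, and library path are invented for the example):

```rust
// --- plugin crate, built separately with crate-type = ["cdylib"] ---
// We have to drop down to `extern "C"`: without a stable Rust ABI,
// the C ABI is the only reliable boundary between separately built crates.
#[no_mangle]
pub extern "C" fn plugin_add_one(x: i32) -> i32 {
    x + 1
}

// --- host crate ---
use libloading::{Library, Symbol}; // libloading = "0.7"

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // The path is hypothetical; .so on Linux, .dylib on macOS, .dll on Windows.
    let lib = unsafe { Library::new("./libmy_plugin.so")? };
    let add_one: Symbol<unsafe extern "C" fn(i32) -> i32> =
        unsafe { lib.get(b"plugin_add_one")? };
    println!("{}", unsafe { add_one(41) }); // 42
    Ok(())
}
```

With a stable ABI, the same thing could be plain Rust types and plain Rust function signatures across the boundary, no `unsafe` C shim required.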
Programming is hard.
Not because our hardware is complex, but simply because we’re all humans. Our attention span is limited, our memory is volatile — in other words, we tend to make mistakes.
The deno_core crate is a very bare bones version of Deno. It does not have dependencies on TypeScript nor on Tokio. It simply provides our Op and Resource infrastructure. That is, it provides an organized way of binding Rust futures to JavaScript promises. The CLI is of course built entirely on top of deno_core.
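To get a feel for just how bare-bones that is, here is a minimal sketch. Note that deno_core's API has churned a lot between releases, so the exact names (`JsRuntime`, `execute_script`, `Deno.core.print`) are assumptions tied to a particular version range rather than a stable contract:

```rust
// Cargo.toml: deno_core = "..." (version elided; the API surface moves quickly)
use deno_core::{JsRuntime, RuntimeOptions};

fn main() {
    // A plain V8 isolate plus Deno's op/resource plumbing:
    // no TypeScript compiler, no Tokio event loop.
    let mut runtime = JsRuntime::new(RuntimeOptions::default());
    runtime
        .execute_script("<usage>", "Deno.core.print('hello from deno_core\\n')")
        .expect("script failed");
}
```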
Fowler and Friends
It looks like a busy week for Martin Fowler and his friends. A new ThoughtWorks Radar was released, a few blog entries have been updated, and the man himself has carved out another set of terms to add to his legacy in software engineering.
This division of development into lines of work that split and merge is central to the workflow of software development teams, and several patterns have evolved to help us keep a handle on all this activity. Like most software patterns, few of them are gold standards that all teams should follow.
For this Radar, we decided to call out again infrastructure as code as well as pipelines as code, and we also had a number of conversations about infrastructure configurations, ML pipelines and other related areas. We find that the teams who commonly own these areas do not embrace enduring engineering practices such as applying software design principles, automation, continuous integration, testing, and so on.
Coming to understand the threat model for your system is not simple. There are an unlimited number of threats you can imagine to any system, and many of them could be likely. [...] Cyber threats chain in unexpected, unpredictable and even chaotic ways. Factors to do with culture, process and technology all contribute. This complexity and uncertainty is at the root of the cyber security problem. This is why security requirements are so hard for software development teams to agree upon.
Other relevant quotes
Hmm, let me see... What else should be mentioned?
Zoom scaled from 20 million to 300 million users virtually overnight. What's incredible is from the outside they've shown little in the way of apparent growing pains, though on the inside it's a good bet a lot of craziness is going on.
Besides being an interesting approach to a very common problem, their discussion of Piranha also provides some very interesting insights into an organization that's *heavily* invested in feature flagging....
Deferring integration can increase the risk of merge conflicts, which causes you to move more slowly as you spend more energy addressing those conflicts. Slow change can sometimes be more risky than you expect because of the costs of extra work needed to reconcile conflicts, as well as the technical debt that results from bypassing the normal process to fix critical errors.
Simply put, testing in production means testing your features in the environment where your features will live. So what if a feature works in staging? That's great, but what you should care about is whether the feature works in production; that's what matters.
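A common way to make that palatable is a gradual rollout: deterministically bucket users so only a small, stable slice of production traffic hits the new path at first. A minimal sketch (the feature name and bucketing scheme are illustrative, not any particular vendor's flag API):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Deterministically bucket a user into 0..100 so the same user
/// always sees the same variant across requests.
fn rollout_bucket(user_id: &str, feature: &str) -> u64 {
    let mut h = DefaultHasher::new();
    (user_id, feature).hash(&mut h);
    h.finish() % 100
}

fn use_new_checkout(user_id: &str, percent_enabled: u64) -> bool {
    rollout_bucket(user_id, "new-checkout") < percent_enabled
}

fn main() {
    // Start at 5% of production traffic, watch the metrics, then widen.
    for user in ["alice", "bob", "carol"] {
        println!("{user}: {}", use_new_checkout(user, 5));
    }
}
```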
[...] when I was asked to reduce the resource requirements of a large MongoDB cluster, I reached the conclusion that the most obvious target - attribute names - wouldn’t lead to the kind of impact I wanted.
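The intuition behind that conclusion is easy to check: field names are stored inline in every BSON document, so shortening them saves a fixed few bytes per document — often dwarfed by the values themselves. A quick sketch with the Rust `bson` crate (the document shape is invented for the example):

```rust
use bson::doc; // the bson crate's document macro

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let verbose = doc! { "customerFirstName": "Ada", "customerLastName": "Lovelace" };
    let terse = doc! { "cfn": "Ada", "cln": "Lovelace" };

    // Serialize each document and compare the raw BSON sizes.
    let (mut a, mut b) = (Vec::new(), Vec::new());
    verbose.to_writer(&mut a)?;
    terse.to_writer(&mut b)?;
    println!("verbose: {} bytes, terse: {} bytes", a.len(), b.len());
    Ok(())
}
```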
The most considerable impact I see is in regards to velocity. The team can focus on other business-impactful projects, rather than EKS and Kubernetes maintenance -- the undifferentiated heavy lifting is eliminated. The same reason people move from physical data centers to the cloud, or from EC2 to Serverless: offloading that effort to AWS is a very good proposition.
Did you know that http://pypi.org serves 800 million requests and delivers 200 million packages totalling 400 terabytes ... a day? No. Exactly. You want it to just work. Every day, rain or shine. To keep it that way: sponsor them
We recently migrated a few small systems to CockroachDB (as a stepping stone). Overall, the experience was positive. The hassle-free HA is a huge peace of mind. I know people say this is easy to do in PG. I have recently set up 2ndQuadrant's pglogical for another system. That was also easy (though the documentation was pretty bad). The end result is quite different though and CockroachDB is just simpler to reason about and manage and, I think, more generally applicable.
Our actual use-case is a little complex to go into in tweets. But suffice to say, the PUT costs alone to S3 if we did 1-to-1 would end up being just under half our total running costs when factoring in DDB, Lambda, SQS, APIG, etc.
Need operational analytics in #NoSQL? Maintain time bound rollups in @DynamoDB with Streams/Lambda then query relevant items by date range and aggregate client side for fast reporting on scaled out data. Turn complex ad hoc queries into simple select statements and save $$$
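In other words: a Streams-triggered Lambda keeps per-period counter items up to date as writes happen, and readers query only those rollup items by date range and sum them client-side. A sketch of the client-side half, with the DynamoDB fetch elided and the item shape invented for the example:

```rust
/// One pre-aggregated counter item, e.g. partition key "order-count",
/// sort key an ISO date, maintained by a Streams-triggered Lambda.
struct Rollup {
    date: String,
    count: u64,
}

/// Aggregate client-side over the (small) set of rollup items returned
/// by a date-range query, instead of scanning the raw scaled-out data.
fn total_in_range(rollups: &[Rollup], from: &str, to: &str) -> u64 {
    rollups
        .iter()
        .filter(|r| r.date.as_str() >= from && r.date.as_str() <= to)
        .map(|r| r.count)
        .sum()
}

fn main() {
    let rollups = vec![
        Rollup { date: "2020-05-01".into(), count: 120 },
        Rollup { date: "2020-05-02".into(), count: 98 },
        Rollup { date: "2020-05-03".into(), count: 143 },
    ];
    // ISO dates compare correctly as plain strings, so the
    // date-range filter is just `>=` / `<=` on the sort key.
    println!("{}", total_in_range(&rollups, "2020-05-01", "2020-05-02")); // 218
}
```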
Another part of the solution is GPU acceleration using grCUDA — an open-source language binding that allows developers to share data between NVIDIA GPUs and GraalVM languages (R, Python, JavaScript), and also launch GPU kernels. The team implemented the performance critical components in CUDA for the GPU, and used grCUDA from Python to exchange data with the GPU and to invoke the GPU kernels.
Although event-driven architecture has existed for more than 15 years, only recently has it gained massive popularity, and there is a reason for that. Most companies are going through a “digital transformation” phase, and with that, crazy requirements occur. The complexity of these requirements forces engineers to adopt new ways of designing software, ones that incur less coupling between services and lower maintenance overhead. EDA is one solution to these problems but it is not the only one.
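The decoupling at the heart of EDA fits in a toy example: producers emit events without knowing who consumes them. Here a plain channel stands in for the broker (which would be Kafka, SNS/SQS, and the like in a real system):

```rust
use std::sync::mpsc;
use std::thread;

// A domain event. Producers only know this type, never the consumers.
#[derive(Debug)]
enum Event {
    OrderPlaced { order_id: u32 },
}

fn main() {
    let (tx, rx) = mpsc::channel::<Event>();

    // Consumer: reacts to events; can be added or changed
    // without touching producer code.
    let consumer = thread::spawn(move || {
        for event in rx {
            println!("handling {:?}", event);
        }
    });

    // Producer: emits the event and moves on; no direct call into a consumer.
    tx.send(Event::OrderPlaced { order_id: 42 }).unwrap();
    drop(tx); // close the channel so the consumer loop ends

    consumer.join().unwrap();
}
```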
So, let's look at the resulting context of moving to microservices with entity services:
- Performance analysis and debugging is more difficult. Tracing tools such as Zipkin are necessary.
- Additional overhead of marshalling and parsing requests and replies consumes some of our precious latency budget.
- Individual units of code are smaller.
- Each team can deploy on its own cadence.
- Semantic coupling requires cross-team negotiation.
- Features mainly accrue in "nexuses" such as API, aggregator, or UI servers.
- Entity services are invoked on nearly every request, so they will become heavily loaded.
- Overall availability is coupled to many different services, even though we expect individual services to be deployed frequently. (A deployment looks exactly like an outage to callers!)
By moving all the “what does the world around me look like?” side effects to the beginning of the program, and all the “change the world around me!” side effects to the end of the program, we achieve maximum testability of program logic. And minimum convolution. And separation of concerns: one module makes the decisions, another one carries them out. Consider this possibility the next time you find yourself in testing pain.
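In code, that shape is: gather inputs first, run a pure decision function you can unit-test without mocks, then perform effects last. A minimal sketch with an invented domain:

```rust
use std::env;

/// Pure core: all decisions, no side effects. Trivially unit-testable.
fn plan_greeting(name: &str, shouting: bool) -> String {
    let greeting = format!("hello, {name}");
    if shouting { greeting.to_uppercase() } else { greeting }
}

fn main() {
    // 1. "What does the world around me look like?" — read inputs up front.
    let name = env::args().nth(1).unwrap_or_else(|| "world".to_string());
    let shouting = env::var("SHOUT").is_ok();

    // 2. Decide, purely.
    let output = plan_greeting(&name, shouting);

    // 3. "Change the world around me!" — side effects at the very end.
    println!("{output}");
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn shouts_when_asked() {
        assert_eq!(plan_greeting("ada", true), "HELLO, ADA");
    }
}
```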