Build tooling: More tools and more problems

TLDR

This chapter covers the complex ecosystem of JavaScript build tools and infrastructure, explaining how modern web development relies on various compilers, bundlers, and package managers to transform source code into production-ready applications. It traces the evolution of JavaScript tooling from simple script tags to today's sophisticated build pipelines, covering topics like TypeScript and Babel for compilation, Webpack and Vite for bundling, and npm and Yarn for package management.

The JavaScript ecosystem is a reminder that the web is held up by shoestrings and gum; it is this immense mess of technologies, all haphazardly thrown together, that somehow happens to work just enough to get your website online.

I have no doubt that this heavily-customized, and poorly abstracted, tooling is one of the many things that scares away the casual developer from exploring frontend development. I hope to provide you with just enough context to see the forest among the trees, and let you resume your journey in frontend development.

Writing and running code

As the web has gotten more complex, JavaScript has turned from a simple scripting language—primarily embedded in HTML—to a full blown language. Its simple origins necessitated extensive tooling to keep up with the explosive adoption and ever-more-complex use-cases.

Languages

While JavaScript, the runtime target, has experienced incredible growth, people have done a lot to avoid writing JavaScript, the language. For example, CoffeeScript was pure syntactic sugar over JavaScript to emphasize the functional programming features of the language.

Beyond syntax, as JavaScript codebases grew, they needed static typing:

Flow : built by Facebook, one of the earliest attempts to add type annotations to JavaScript.
TypeScript : the most popular way to add types to JavaScript. Built by Microsoft, it has become the defacto way to add types to JavaScript with support across the ecosystem.

A second transformation of the language happened with React. As mentioned last chapter, a core innovation of React was .jsx files: allowing developers to write HTML in JavaScript. JSX used a transpiler to transform JSX files into JavaScript files by transforming the HTML-like syntax into JavaScript function calls. Specifically, the transpiler can take

example.jsx


function Component() {
  return <h1 className='title'>Hello, world!</h1>;
}

… and transform it into something like this:

example.js


function Component() {
  return JSX.createElement(
    'h1',
    {className: 'title'},
    'Hello, world!'
  );
}

There is also a tsx file type: TypeScript with HTML-like syntax.

Compilers and transpilers

Supporting those different languages are compilers and transpilers. In the world of frontend web development, the transpiler serves 4 key purposes:

Types: Supporting static typing, probably through TypeScript
JSX: Supporting JSX / TSX
Supporting different runtimes: these days, JavaScript runs everywhere—browsers (including different versions), the server, in desktop apps (via Electron), in CLI tools, etc. Those different runtimes require slightly different formats (more on that in the next section), and the compiler can abstract this away from the developer.
Polyfills: Because different runtimes and different versions support different language features, the compiler can detect which features you need and inject any necessary polyfills.

Common compilers are:

TSC : Ships with TypeScript. Compiles TypeScript to JavaScript.
Babel : A pluggable compiler. Currently the most popular.
SWC : A replacement for Babel written by Vercel intended to be faster.

Use of these compilers and transpilers are not mutually exclusive. It’s not uncommon to see projects use both TSC and babel in the same build step.

Linters

Lastly, sizable codebases want to ensure that developers are following common formatting. ESLint is the most popular linter, with many community-built plugins .

While linters can catch many code-formatting bugs, Prettier is a very popular code formatter. It is typically used along-side Eslint.

Build targets

Because JavaScript runs everywhere, new platforms continue to create support for JavaScript. It is a self fufilling prophecy.

Runtimes

While JavaScript does have a language spec: there are different runtime engines (with partial or extended implementations of the language), different versions, and different global variables available at runtime; as a result, you need to know what platform you’re building your JavaScript program for.

The most common place to run JavaScript is the browser. Browsers consist of two parts: the UI (how you navigate to web pages) and the browser engine, the part that takes web pages and runs / renders them; while there are many browsers, there are only a few browser engines . Three key JS engines to be aware of are:

Chrome: supported by the V8 JavaScript engine , maintained by Google
Firefox: supported by the SpiderMonkey JavaScript engine , maintained by Mozilla
Safari: supported by the JavascriptCore engine (part of WebKit ), maintained by Apple

As code running in the browser, any scripts have access to the document and window global variables.

For a long time, those were the only platforms most developers JavaScript authored JavaScript for. Then Node came around and changed everything. Node took Chrome’s JavaScript engine (V8) and created a set of APIs intended to run JavaScript outside the browser’s sandbox. To this day, Node is the most popular way to author a JavaScript web server or run JavaScript scripts locally. It introduces a few global variables (including global, process, etc.) because the window and document APIs don’t make sense when running outside of a web browser.

Because users sometimes need to test or run code against different versions of node, nvm is a tool that lets users manage multiple node installs.

While node is definitely still the most popular runtime for non-browser-based code, more recently we’ve seen the creation of new runtimes, including Deno and Bun , that try to improve upon the speed, size, and developer experience of node.

For developers writing “native” (not browser based) applications, Electron is a platform built on top of Node and Chromium (the open source foundation of Chrome) that allows developers to write desktop apps in JavaScript. To enable that, it introduces a suite of APIs to access the native platform.

And for developers writing mobile apps, React Native is the latest and most popular runtime allowing developers to write their mobile apps with web technologies. React Native used to use the platform’s native JavaScript engine (V8 on Android and JavascriptCore on iOS), but more recently they’ve developed their own runtime, Hermes . Much like Electron, it introduces a handful of APIs for communicating with the host device. Older technologies that tried to make web technologies available to mobile developers include NativeScript and Apache Cordova .

Module systems

Due to its simple origins, JavaScript never had a good way to split out code across files. It wasn’t intended for applications of that scale; as JavaScript broke out of the browser, this had to change, and many people rushed to create a solution. This led to many, competing solutions to the problem of “how do I import one JavaScript file from another”.

These module systems have mostly been pared down to 2:

CommonJS (CJS): The most popular standard to date.
ECMAScript modules (ESM): The newer standard introduced as part of the JavaScript spec. This will eventually become the standard; it just takes time.

Older technologies that are no longer common are:

Asynchronous module definition (AMD): Created for a polyfill that is no longer necessary.
Universal module definition (UMD): Created to unify the other module definition syntaxes. Relevant XKCD .

Introduced by Node, CommonJS allows developers to assign variables and functions to an automatically created module.exports object (e.g. module.exports.title = "Hello, world!"), and then import files via require("path/to/file.js"). The contents of that file is the wrapped in a function and executed, returning the resultant value of module.exports. This object is a singleton (so requiring the object multiple times will return the same object).

ECMAScript modules improve on this standard in a few key ways:

Exports are now done via an exports keyword (export const title = "Hello, world";)
Imports are done via an import keyword (import { title } from "./path/to/file.js")
Promises can be awaited at the top level

Because ECMAScript modules are a superset of CJS modules, ESM files can import CJS, but not the other way around. Sometimes authors will denote this by naming ESM files as *.mjs, although I do not recommend this because ESM will become the standard after a few years.

Bundlers

While modules, predictably, became a very popular way to organize code, web browsers never added first party support for them. The result was a need for packagers: a way to take all of the modularized code and concatenate it into one file.

The most popular packager is Webpack , which not only bundles, but has an extensive plugin ecosystem .

That said, like everything else in this ecosystem, there are many alternatives. Some older ones include:

Browserify : early to the game, was very popular.
Metro : used by React Native.

More recently, there have been a class of new entrants, all targeting the same problem: improving upon Webpack’s (notably bad) performance:

Ultimately, the bundler is a glorified version of cat, with a few extra features to ensure things like:

Dependencies are correctly ordered
Any unreferenced code is removed (this is called “tree shaking”)
Files are split up to optimize for browser caching (e.g. if you depend on a library that is unlikely to change, output a separate file so the browser can cache it across version bumps of your application)
Node globals are shimmed to work in the browser (a common one is to replace process.env.FOO, the way you access the env variable FOO with whatever the value of FOO is at build time)
Minifying code to save network bandwidth (for example, comments don’t need to be sent to the browser and whitespace can be removed)

When code is minified, most tools will generate source maps to help you symbolicate stack traces. Make sure not to ship your source maps to production or users will be able to decompile your code.

Orchestrators

That’s a lot of tools. Running them all in concert is difficult. The result is a set of tooling intended to help you coordinate all the other tools. Other languages and ecosystems use Make , some JavaScript projects do to, but as is customary here, we also have our own, bespoke, solutions.

Grunt and, later, Gulp were introduced to help transform files. These days they’re less frequent (as the packagers have subsumed much of this functionality), but still exist.

The advent of modules and packages (more on that in a second) lowered the barrier to share and reuse others code, and led to an explosion of JavaScript libraries.

Libraries and frameworks

While modules certainly lowered the barrier to share, people have been sharing JavaScript forever. These days, most libraries can be found in the NPM package registry . And range from full fledged frameworks (you can install Electron that way) to libraries that add whitespace to strings .

There are too many libraries to go through my favorites, so instead I’ll try to reference them as they’re relevant in later chapters. Three general pieces of wisdom about them are:

While libraries are really easy to add and appear really friendly, make no mistake, you are introducing someone else’s code to your codebase. That code could be buggy, built with a different set of constraints, or plainly insecure. Because JS developers are so quick to introduce packages, software supply chain security is especially difficult in JavaScript.
Speaking of different constraints, don’t be afraid to peak into how a package is implemented. Far too often I hear developers say “well I can’t do that because the package does something close to what we need, but not exactly what we need”. Many packages are less than 100 lines, it’s frequently easier and more maintainble to maintain that yourself than to hack around someone else’s constraints.
While many packages are written, far fewer are maintained. Part of the reason you’ll hear me reference who authors which library is because I have far more faith that a tool built and used by Facebook will be maintained than one built by a no-name developer with no incentive to do so.

Shims and polyfills

One important class of libraries are shims and polyfills. But before we talk about them, we need to talk about the governance of the web.

The web standard is officially maintained by the W3C , a standalone group created to steward the protocols of the web; however, because browsers are maintained by various organizations, which APIs you can use depends on which parts of the spec the browser vendor has implemented. In addition,browser vendors (specifically Chrome and Safari) will implement APIs before they’re standards as a way to serve the needs of their parent organizations (Google and Apple respectively) and as a way to push the web forward.

CanIUse maintains a set of compatibility tables for various features across browser vendors and versions.

To deal with the inconsistency of runtimes, websites use one of three strategies:

Browser detection: If you’ve ever seen the message “Please open this site in Chrome or Firefox for the best experience”, you’ve experienced browser detection. The website is detecting which browser you’re using, and gating the experience based on it, probably because the website requires some not-yet-widely-supported API. While common, this is broadly considered bad practice because it excludes niche browsers, excludes future browsers, and is unlikely to be updated once the new version of a currently-incompatible browser adds all the necessary APIs to support the site.
Feature detection: The better version of browser detection is feature detection, which simply asks “does the current browser support the specific feature I need” (for example, if you’re trying to use the window.querySelectorAll() function, you can test if window.querySelectorAll is a function; because in incompatible browsers it’s not even defined). A popular library for this is Modernizr .
Polyfills (aka Shims): The even better version of feature detection is to not only detect if the browser supports a feature, but to retrofit it when it doesn’t. As I said, JavaScript is an incredibly flexible language and even lets you add new functions to base types (for example, you can add a method to the base array type). Polyfills will detect if a browser has an API, and if not, define it for you; for example, the .forEach() method was introduced on arrays, but before every browser had it, websites might include a piece of code that checked if forEach existed on the base array type, and if not, define it .

Evolving the langauge

In addition to frameworks and polyfills, there are a handful of libraries dedicated to adding common utility functions to JavaScript. This is common in various languages; a popular example here is underscore.js .

While the bareness of JavaScript can be disheartening to developers coming from more full-featured languages, I’ll leave you on a positive note:

As one who’s been watching the language and browser runtime evolve, we are making strides. For example, when jQuery was popular, selecting elements in the DOM was difficult. The author of jQuery built Sizzle , a library that abstracted away selecting elements and let developers use CSS selectors instead. Since then, browsers have added native support for this pattern with document.querySelectorAll .

And it doesn’t stop there, we’re watching the same thing play out with web components and shadow DOM : browsers are adding native support for the component architecture that frontend libraries popularized.

Package managers

Packaged with node came a tool called npm. npm created a standard way to share code. It specified a format of a zip archive with a package.json file at the root that specified any dependencies and the entrypoint to the code. It also provided an online regsitry to share these packages, that could easily be published to (using npm publish) or pulled from (using npm install PACKAGE_NAME).

NPM is still the largest, and most common, registry; although some companies maintain private registries to share internal packages.

That said, NPM is no longer the only package manager, and there are now other tools that help manage packages locally while still publish to and pulling from npm’s online registry. PNPM is a tool developed as drop-in replacement for NPM, with one core advantage: it saves disk space by deduplicating shared packages across your machine.

Yarn was developed by FB with a few key benefits over NPM:

Zero install: yarn stores a copy of dependencies as a zip file that you can commit. While you might wonder if committing dependencies is advisable, it speeds up CI and protects you from remote changes in the registry (in the past, people have removed versions of packages from NPM, breaking builds, and uploaded malicious versions that get updated with little auditability).
Plug-n-play: Yarn introduced PNP as an opt-in way to save disk space, because zero install means that all packages are in zip files, why not override require() to simply reach into those zip files (which are unzipped in memory); this also saves the OS from having to page 100s-100,000s of tiny files into memory. While interesting in theory, and faster in practice, the ecosystem has not adopted it and getting tools like React Native, Webpack, or Electron to work with PNP is difficult. For this reason, I typically disable this.
Workspaces: discussed below.

Because of Facebook’s support, and the corresponding features intended for company-wide monorepos, yarn is my preferred package manager.

Workspaces (monorepos)

Packages were great, and soon people started putting many packages in a single repo. If memory serves, I think React was one of the earliest people to do this.

NPM did not have great support for importing local packages (and certainly not for developing multiple local packages simultaneously). Yarn solved this through workspaces , and soon this became a common feature across various package managers.

Along with supporting multiple local packages, developers needed a way to coordinate tasks across those packages (the most simple example being you should first build any dependencies of a package before you build the package itself). Two tools that came into existence were TurboRepo and Lerna . Because TurboRepo is supported by Vercel, it is my preferred tool.

One edge case to call out with multiple local packages is Hoisting:

For background, when you import a package in node, node climbs the directory structure checking in node_modules. For example, if I require package foo from /some/path/file.js, node will check the following locations, stopping when it finds a file.

/some/path/node_modules/foo, and then
/some/node_modules/foo, and then
/node_modules/foo

Now, imagine you have a repository with two workspaces, each which depend on react:


repository/
  workspace-a/
    node_modules/
      react/
  workspace-b/
    node_modules/
      react/

If A depends on B, compiling A will result in an artifact that contains two copies of react (as any code in workspace-a that references react will resolve to one react, and any code in workspace-b that references react will resolve to another react).

One way to solve this in a workspace is to “hoist” react:


repository/
  workspace-a/
  workspace-b/
  node_modules/
    react/

Now both workspaces will resolve to the same react, and reference the same copy of react at runtime.

If you find this confusing, it is, but it’s something to be mindful of when working with workspaces.

Scaffolders

If you had to set it all this tooling yourself, it could easily be a week of work. As a result, there are a handful of tools that help you scaffold new projects.

The most common general tool here is Yeoman , which lets developers author custom project generators . There are also application-specific scaffolders, such as create-next-app , create-react-native-app , or HTML5 boilerplate .