Build tooling: More tools and more problems
The JavaScript ecosystem is a reminder that the web is held up by shoestrings and gum; it is this immense mess of technologies, all haphazardly thrown together, that somehow happens to work just enough to get your website online.
I have no doubt that this heavily-customized, and poorly abstracted, tooling is one of the many things that scares away the casual developer from exploring frontend development. I hope to provide you with just enough context to see the forest among the trees, and let you resume your journey in frontend development.
Writing and running code
As the web has gotten more complex, JavaScript has turned from a simple scripting language--primarily embedded in HTML--to a full blown language. Its simple origins necessitated extensive tooling to keep up with the explosive adoption and ever-more-complex use-cases.
Languages
While JavaScript, the runtime target, has experienced incredible growth, people have done a lot to avoid writing JavaScript, the language. For example, CoffeeScript (opens in a new tab) was pure syntactic sugar over JavaScript to emphasize the functional programming features of the language.
Beyond syntax, as JavaScript codebases grew, they needed static typing:
- Flow (opens in a new tab): built by Facebook, one of the earliest attempts to add type annotations to JavaScript.
- TypeScript (opens in a new tab): the most popular way to add types to JavaScript. Built by Microsoft, it has become the defacto way to add types to JavaScript with support across the ecosystem.
A second transformation of the language happened with React. As mentioned last chapter, a core innovation of React was .jsx
files: allowing developers to write HTML in JavaScript. JSX used a transpiler to transform JSX files into JavaScript files by transforming the HTML-like syntax into JavaScript function calls. Specifically, the transpiler can take
function Component() {
return <h1 className='title'>Hello, world!</h1>;
}
... and transform it into this:
function Component() {
return JSX.createElement(
'h1',
{className: 'title'},
'Hello, world!'
);
}
Because of the popularity of TypeScript, there is now a tsx
file type: TypeScript with HTML-like syntax.
Compilers and transpilers
Supporting those different languages are compilers and transpilers. In the world of frontend web development, the transpiler serves 4 key purposes:
- Types: Supporting static typing, probably through TypeScript
- JSX: Supporting JSX / TSX
- Supporting different runtimes: these days, JavaScript runs everywhere--browsers (including different versions), the server, in desktop apps (via Electron), in CLI tools, etc. Those different runtimes require slightly different formats (more on that in the next section), and the compiler can abstract this away from the developer.
- Polyfills: Because different runtimes and different versions support different language features, the compiler can detect which features you need and inject any necessary polyfills.
Common compilers are:
- TSC (opens in a new tab): Ships with TypeScript.
- Babel (opens in a new tab): A pluggable compiler. Currently the most popular.
- SWC (opens in a new tab): A replacement for Babel written by Vercel intended to be faster.
Use of these compilers and transpilers are not mutually exclusive. It's not uncommon to see projects use both TSC and babel in the same build step.
Linters
Lastly, sizable codebases want to ensure that developers are following common formatting. ESLint (opens in a new tab) is the most popular linter, with many community-built plugins (opens in a new tab).
While linters can catch many code-formatting bugs, Prettier (opens in a new tab) is a very popular code formatter. It is typically used along-side Eslint.
Build targets
Because JavaScript runs everywhere, new platforms continue to create support for JavaScript. It is a self fufilling prophecy.
Runtimes
While JavaScript does have a language spec: there are different runtime engines (with partial or extended implementations of the language), different versions, and different global variables available at runtime; as a result, you need to know what platform you're building your JavaScript program for.
The most common place to run JavaScript is the browser. Browsers consist of two parts: the UI (how you navigate to web pages) and the browser engine, the part that takes web pages and runs / renders them; while there are many browsers, there are only a few browser engines (opens in a new tab). Three key JS engines to be aware of are:
- Chrome: supported by the V8 JavaScript engine (opens in a new tab), maintained by Google
- Firefox: supported by the SpiderMonkey JavaScript engine (opens in a new tab), maintained by Mozilla
- Safari: supported by the JavascriptCore engine (opens in a new tab) (part of WebKit (opens in a new tab)), maintained by Apple
As code running in the browser, any scripts have access to the document (opens in a new tab) and window (opens in a new tab) global variables.
For a long time, those were the only platforms most developers JavaScript authored JavaScript for. Then Node (opens in a new tab) came around and changed everything. Node took Chrome's JavaScript engine (V8) and created a set of APIs intended to run JavaScript outside the browser's sandbox. To this day, Node is the most popular way to author a JavaScript web server or run JavaScript scripts locally. It introduces a few global variables (opens in a new tab) (including global
, process
, etc.) because the window
and document
APIs don't make sense when running outside of a web browser.
Because users sometimes need to test or run code against different versions of node, nvm (opens in a new tab) is a tool that lets users manage multiple node installs.
While node is definitely still the most popular runtime for non-browser-based code, more recently we've seen the creation of new runtimes, including Deno (opens in a new tab) and Bun (opens in a new tab), that try to improve upon the speed, size, and developer experience of node.
For developers writing "native" (not browser based) applications, Electron (opens in a new tab) is a platform built on top of Node and Chromium (the open source foundation of Chrome) that allows developers to write desktop apps in JavaScript. To enable that, it introduces a suite of APIs (opens in a new tab) to access the native platform.
And for developers writing mobile apps, React Native (opens in a new tab) is the latest and most popular runtime allowing developers to write their mobile apps with web technologies. React Native used to use the platform's native JavaScript engine (V8 on Android and JavascriptCore on iOS), but more recently they've developed their own runtime, Hermes (opens in a new tab). Much like Electron, it introduces a handful of APIs (opens in a new tab) for communicating with the host device. Older technologies that tried to make web technologies available to mobile developers include NativeScript (opens in a new tab) and Apache Cordova (opens in a new tab).
Module systems
Due to its simple origins, JavaScript never had a good way to split out code across files. It wasn't intended for applications of that scale; as JavaScript broke out of the browser, this had to change, and many people rushed to create a solution. This led to many, competing solutions to the problem of "how do I import one JavaScript file from another".
These module systems have mostly been paired down to 2:
- CommonJS (CJS): The most popular standard to date.
- ECMAScript modules (ESM): The newer standard introduced as part of the JavaScript spec. This will eventually become the standard; it just takes time.
Older technologies that are no longer common are:
- Asynchronous module definition (AMD): Created for a polyfill that is no longer necessary.
- Universal module definition (UMD): Created to unify the other module definition syntaxes. Relevant XKCD (opens in a new tab).
Introduced by Node, CommonJS (opens in a new tab) allows developers to assign variables and functions to an automatically created module.exports
object (e.g. module.exports.title = "Hello, world!"
), and then import files via require("path/to/file.js")
. The contents of that file is the wrapped in a function (opens in a new tab) and executed, returning the resultant value of module.exports. This object is a singleton (opens in a new tab) (so requiring the object multiple times will return the same object).
ECMAScript modules (opens in a new tab) improve on this standard in a few key ways:
- Exports are now done via an
exports
keyword (export const title = "Hello, world";
) - Imports are done via an import keyword (
import { title } from "./path/to/file.js"
) - Promises (opens in a new tab) can be awaited at the top level
Because ECMAScript modules are a superset of CJS modules, ESM files can import CJS, but not the other way around. Sometimes authors will denote this by naming ESM files as *.mjs
, although I do not recommend this because ESM will become the standard after a few years.
Bundlers
While modules, predicatably, became a very popular way to organize code, web browsers never added first party support for them. The result was a need for packagers: a way to take all of the modularized code and concatenate it into one file.
The most popular packager is Webpack (opens in a new tab), which not only bundles, but has an extensive plugin ecosystem (opens in a new tab).
That said, like everything else in this ecosystem, there are many alternatives. Some older ones include:
- Browserify (opens in a new tab): early to the game, was very popular.
- Metro (opens in a new tab): used by React Native.
More recently, there have been a class of new entrants, all targeting the same problem: improving upon Webpack's (notably bad) performance:
Ultimately, the bundler is a glorified version of cat
, with a few extra features to ensure things like:
- Dependencies are correctly ordered
- Any unreferenced code is removed (tree shaking)
- Files are split up to optimize for browser caching (e.g. if you depend on a library that is unlikely to change, output a separate file so the browser can cache it across version bumps of your application)
- Node globals are shimmed to work in the browser (a common one is to replace
process.env.FOO
, the way you access the env variableFOO
with whatever the value ofFOO
is at build time) - Minifying code to save network bandwidth (for example, comments don't need to be sent to the browser and whitespace can be removed)
When code is minified, most tools will generate source maps (opens in a new tab) to help you symbolicate stack traces. Make sure not to ship your source maps to production or users will be able to decompile your code.
Orchestrators
That's a lot of tools. Running them all in concert is difficult. The result is a set of tooling intended to help you coordinate all the other tools. Other languages and ecosystems use MAKE (opens in a new tab), some JavaScript projects do to, but as is customary here, we also have our own, bespoke, solutions.
Grunt (opens in a new tab) and, later, Gulp (opens in a new tab) were introduced to help transform files. These days they're less frequent (as the packagers have subsumed much of this functionality), but still exist.
Sharing code and scripts
The advent of modules and packages (more on that in a second) lowered the barrier to share and reuse others code, and led to an explosion of JavaScript libraries.
Libraries and frameworks
While modules certainly lowered the barrier to share, people have been sharing JavaScript forever. These days, most libraries can be found in the NPM package registry (opens in a new tab). And range from full fledged frameworks (you can install Electron that way) to libraries that add whitespace to strings (opens in a new tab).
There are too many libraries to go through my favorites, so instead I'll try to reference them as they're relevant in later chapters. Three general pieces of wisdom about them are:
- While libraries are really easy to add and appear really friendly, make no mistake, you are introducing someone else's code to your codebase. That code could be buggy, built with a different set of constraints, or plainly insecure. Because JS developers are so quick to introduce packages, software supply chain security is especially difficult in JavaScript.
- Speaking of different constraints, don't be afraid to peak into how a package is implemented. Far too often I hear developers say "well I can't do that because the package does something close to what we need, but not exactly what we need". Many packages are less than 100 lines, it's frequently easier and more maintainble to maintain that yourself than to hack around someone else's constraints.
- While many packages are written, far fewer are maintained. Part of the reason you'll hear me reference who authors which library is because I have far more faith that a tool built and used by Facebook will be maintained than one built by a no-name developer with no incentive to do so.
Shims and polyfills
One important class of libraries are shims and polyfills. But before we talk about them, we need to talk about the governance of the web.
The web standard is officially maintained by the W3C (opens in a new tab), a standalone group created to steward the protocols of the web; however, because browsers are maintained by various organizations, which APIs you can use depends on which parts of the spec the browser vendor has implemented. In addition,browser vendors (specifically Chrome and Safari) will implement APIs before they're standards as a way to serve the needs of their parent organizations (Google and Apple respectively) and as a way to push the web forward.
CanIUse (opens in a new tab) maintains a set of compatibility tables for various features across browser vendors and versions.
To deal with the inconsistency of runtimes, websites use one of three strategies:
- Browser detection: If you've ever seen the message "Please open this site in Chrome or Firefox for the best experience", you've experienced browser detection. The website is detecting which browser you're using, and gating the experience based on it, probably because the website requires some not-yet-widely-supported API. While common, this is broadly considered bad practice because it excludes niche browsers, excludes future browsers, and is unlikely to be updated once the new version of a currently-incompatible browser adds all the necessary APIs to support the site.
- Feature detection: The better version of browser detection is feature detection, which simply asks "does the current browser support the specific feature I need" (for example, if you're trying to use the
window.querySelectorAll()
function, you can test ifwindow.querySelectorAll
is a function; because in incompatible browsers it's not even defined). A popular library for this is Modernizr (opens in a new tab). - Polyfills (aka Shims): The even better version of feature detection is to not only detect if the browser supports a feature, but to retrofit it when it doesn't. As I said, JavaScript is an incredibly flexible langugae and even lets you add new functions to base types (for example, you can add a method to the base array type). Polyfills will detect if a browser has an API, and if not, define it for you; for example, the
.forEach()
method was introduced on arrays, but before every browser had it, websites might include a piece of code that checked ifforEach
existed on the base array type, and if not, define it (opens in a new tab).
Evolving the langauge
In addition to frameworks and polyfills, there are a handful of libraries dedicated to adding common utility functions to JavaScript. This is common in various languages; a popular example here is underscore.js (opens in a new tab).
While the bareness of JavaScript can be disheartening to developers coming from more full-featured languages, I'll leave you on a positive note:
As one who's been watching the language and browser runtime evolve, we are making strides. For example, when jQuery was popular, selecting elements in the DOM was difficult. The author of jQuery built Sizzle (opens in a new tab), a library that abstracted away selecting elements and let developers use CSS selectors instead. Since then, browsers have added native support for this pattern with document.querySelectorAll (opens in a new tab).
And it doesn't stop there, we're watching the same thing play out with web components (opens in a new tab) and shadow DOM (opens in a new tab): browsers are adding native support for the component architecture that frontend libraries popularized.
Package managers
Packaged with node came a tool called npm
. npm
created a standard way to share code. It specified a format of a zip archive with a package.json
file at the root that specified any dependencies and the entrypoint to the code. It also provided an online regsitry (opens in a new tab) to share these packages, that could easily be published to (using npm publish
) or pulled from (using npm install PACKAGE_NAME
).
NPM is still the largest, and most common, registry; although some companies maintain private registries to share internal packages.
That said, NPM is no longer the only package manager, and there are now other tools that help manage packages locally while still publish to and pulling from npm's online registry. PNPM (opens in a new tab) is a tool developed as drop-in replacement for NPM, with one core advantage: it saves disk space by deduplicating shared packages across your machine.
Yarn (opens in a new tab) was developed by FB with a few key benefits over NPM:
- Zero install: yarn stores a copy of dependencies as a zip file that you can commit. While you might wonder if committing dependencies is advisable, it speeds up CI and protects you from remote changes in the registry (in the past, people have removed versions of packages from NPM, breaking builds, and uploaded malicious versions that get updated with little auditability).
- Plug-n-play: Yarn introduced PNP (opens in a new tab) as an opt-in way to save disk space, because zero install means that all packages are in zip files, why not override
require()
to simply reach into those zip files (which are unzipped in memory); this also saves the OS from having to page 100s-100,000s of tiny files into memory. While interesting in theory, and faster in practice, the ecosystem has not adopted it and getting tools like React Native, Webpack, or Electron to work with PNP is difficult. For this reason, I typically disable this. - Workspaces: discussed below.
Because of Facebook's support, and the corresponding features intended for company-wide monorepos, yarn is my prefered package manager.
Workspaces (monorepos)
Packages were great, and soon people started putting many packages in a single repo. If memory serves, I think React was one of the earliest people to do this.
NPM did not have great support for importing local packages (and certainly not for developing multiple local packages simultaneously). Yarn solved this through workspaces (opens in a new tab), and soon this became a common feature across various package managers.
Along with supporting multiple local packages, developers needed a way to coordinate tasks across those packages (the most simple example being you should first build any dependencies of a package before you build the package itself). Two tools that came into existence were TurboRepo (opens in a new tab) and Lerna (opens in a new tab). Because TurboRepo is supported by Vercel, it is my preferred tool.
One edgecase to call out with multiple local packages is Hoisting:
For background, when you import a package in node, node climbs the directory structure checking in node_modules
. For example, if I require package foo
from /some/path/file.js
, node will check the following locations, stopping when it finds a file.
/some/path/node_modules/foo
, and then/some/node_modules/foo
, and then/node_modules/foo
Now, imagine you have a repository with two workspaces, each which depend on react:
repository/
workspace-a/
node_modules/
react/
workspace-b/
node_modules/
react/
If A depends on B, compiling A will result in an artifact that contains two copies of react
(as any code in workspace-a that references react will resolve to one react, and any code in workspace-b that references react will resolve to another react).
One way to solve this in a workspace is to "hoist" react:
repository/
workspace-a/
workspace-b/
node_modules/
react/
Now both workspaces will resolve to the same react.
If you find this confusing, it is, but it's something to be mindful of when working with workspaces.
Scaffolders
If you had to set it all this tooling yourself, it could easily be a week of work. As a result, there are a handful of tools that help you scaffold new projects.
The most common general tool here is Yeoman (opens in a new tab), which lets developers author customer project generators (opens in a new tab). There are also application-specific scaffolders, such as create-next-app (opens in a new tab), create-react-native-app (opens in a new tab), or HTML5 boilerplate (opens in a new tab).