Check links in your Markdown documents

How to never have a wrong link in your README and other Markdown files.

I want to sleep better at night. I want to know that everyone reading one of the many README files in my GitHub repositories gets correct URLs. I love adding links to examples, blog posts, and other repos, and I hate it when a link is incorrect. In this blog post I will show how to check URLs in Markdown files.

  1. Install markdown-link-check with
npm i -D markdown-link-check
  2. Add an NPM script that checks all top-level Markdown files (you can modify the find command [1](https://www.computerhope.com/unix/ufind.htm), 2 to find more files, if needed). Don't forget to escape the \ character: inside JSON it has to be written as \\.
{
  "scripts": {
    "check:markdown": "find *.md -exec npx markdown-link-check {} \\;"
  }
}

Notice both external and local links are checked, but not the anchor tags ⚠️.

You can check URLs on Mac and Linux with the npm run check:markdown command:

checking the links locally

  3. Call the same script on CI to avoid breaking the links in the future
- run: npm run check:markdown

checking the links on CI

We can even find all Markdown files in the repository, while excluding the node_modules and examples folders, with this command

find . -type f -name '*.md' ! -path './node_modules/*' ! -path './examples/*' -exec npx markdown-link-check --quiet {} \;

Exit code

We have a problem: find -exec feeds each file to markdown-link-check, and if one file has a dead link, find just continues on to the next Markdown file and still exits with code 0, swallowing the error.
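To see the swallowed error without touching the network, here is a minimal sketch. It assumes the built-in `false` command as a stand-in for a failing markdown-link-check run, and the temporary folder and file names are made up for the demo.

```shell
#!/bin/sh
# Network-free sketch: `false` stands in for a failing markdown-link-check run.
demo_dir=$(mktemp -d)
touch "$demo_dir/good.md" "$demo_dir/bad.md"

# find runs `false` once per file, and every one of those runs exits with 1 ...
find "$demo_dir" -type f -name '*.md' -exec false {} \;
find_status=$?

# ... yet find itself still reports success, so a CI step would stay green.
echo "find exit status: $find_status"   # prints "find exit status: 0"

rm -rf "$demo_dir"
```

With `-exec ... \;` the command's exit code only decides whether that test counts as true for the file; it does not become the exit code of find.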

Instead, let's find all files like this

find . -type f -name '*.md' ! -path './node_modules/*'

This prints one filename per line, like this

$ find . -type f -name '*.md' ! -path './node_modules/*'
./CODE_OF_CONDUCT.md
./README.md
./browsers/node12.0.0-chrome73-ff68/README.md
./browsers/node8.15.1-chrome73/README.md
./browsers/node10.2.1-chrome74/README.md
./browsers/node13.3.0-chrome79-ff70/README.md
./browsers/node12.13.0-chrome78-ff70-brave78/README.md
...

We can feed these lines into the xargs program using the -L1 argument (run the command once per input line).

find . -type f -name '*.md' ! -path './node_modules/*' | xargs -L1 npx markdown-link-check --quiet
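One caveat worth knowing, shown as a hedged sketch rather than a required change: xargs -L1 still splits each line on whitespace, so a filename containing a space would reach the command as two arguments. If your repo has such files, the -print0 and -0 options are one way around it. The file name below is made up, and `sh -c 'echo $#'` is only a stand-in that prints how many arguments it received.

```shell
#!/bin/sh
# Sketch with a hypothetical file name that contains a space.
demo_dir=$(mktemp -d)
touch "$demo_dir/release notes.md"

# Newline-delimited input: xargs splits the path at the space into 2 arguments.
split_count=$(find "$demo_dir" -type f -name '*.md' |
  xargs -L1 sh -c 'echo $#' count-args)

# NUL-delimited input: the whole path arrives as a single argument.
safe_count=$(find "$demo_dir" -type f -name '*.md' -print0 |
  xargs -0 -n1 sh -c 'echo $#' count-args)

echo "arguments with -L1: $split_count"
echo "arguments with -print0 and -0 -n1: $safe_count"

rm -rf "$demo_dir"
```

Under that assumption the link check itself would become `find . -type f -name '*.md' ! -path './node_modules/*' -print0 | xargs -0 -n1 npx markdown-link-check --quiet`.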

Now we are talking

$ find . -type f -name '*.md' ! -path './node_modules/*' | xargs -L1 npx markdown-link-check --quiet

FILE: ./CODE_OF_CONDUCT.md

3 links checked.

FILE: ./README.md

23 links checked.

FILE: ./browsers/node12.0.0-chrome73-ff68/README.md

1 links checked.
...

FILE: ./base/README.md
[✖] ubuntu16-8.16.2
[✖] /ubuntu18-node12.14.1
[✖] /ubuntu19-node12.14.1

38 links checked.

ERROR: 3 dead links found!
[✖] ubuntu16-8.16.2 → Status: 400
[✖] /ubuntu18-node12.14.1 → Status: 400
[✖] /ubuntu19-node12.14.1 → Status: 400

...

FILE: ./included/3.6.1/README.md

1 links checked.

$ echo $?
1

The search fails if any of the links are bad. Now I can sleep tight.
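The same behavior can be reproduced offline with a small sketch. Here `fake-check` is a made-up stand-in for markdown-link-check that fails only for a path ending in bad.md, and the three file names are hypothetical.

```shell
#!/bin/sh
# Network-free sketch: the inline script fails only for the path ending in bad.md.
printf '%s\n' ./good.md ./bad.md ./also-good.md |
  xargs -L1 sh -c 'echo "checking $1"; case "$1" in *bad.md) exit 1 ;; esac' fake-check
xargs_status=$?

# All three files are still visited, but the failure is not lost:
# xargs exits with a non-zero code if any invocation failed.
echo "xargs exit status: $xargs_status"
```

The exact non-zero value depends on the xargs implementation, so a CI script should only test for "not zero" rather than for a specific number.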

See example in cypress-docker-images