This is a quick blog post to remind me how to do something in future.
I’ve written before about how to download magazines from the Internet Archive and make gifs from their covers. In that post I used the Internet Archive command-line tool to download items from a Collection.
I recently discovered that you can now create your own custom lists of items. Only Internet Archive staff can create Collections (although I believe you can ask for one), but anyone can make a public or private list of items.
I’ve used this to create a list of Dungeon Magazine issues, as there’s no existing collection.
However, I couldn’t find an easy way to download the contents of a list. So here’s a quick recipe for doing that.
Firstly, there’s a JSON endpoint for each public list. Here’s the one for my list of Dungeon magazines:
https://archive.org/services/users/@ldodds/lists/1
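To get the endpoint for your own list, change the base URL from https://archive.org/details to https://archive.org/services. As the example shows, the services URL also has a users/ segment before the account name, so check that yours matches that shape.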
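If you keep several lists, the URL pattern can be wrapped in a small shell function. This is only a sketch based on the shape of the example URL above: list_json_url is a made-up helper name, and I'm assuming other public lists follow the same /services/users/@name/lists/N pattern.

```shell
# Hypothetical helper: build the JSON endpoint URL for a public list.
# Assumes the pattern seen in the example URL holds for other lists.
list_json_url() {
    user="$1"     # account name without the leading @
    list_id="$2"  # numeric id of the list
    printf 'https://archive.org/services/users/@%s/lists/%s\n' "$user" "$list_id"
}

# Prints the endpoint for the Dungeon magazine list used in this post.
list_json_url ldodds 1
```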
From the command line, you can then use curl and jq to extract the identifiers of the items in your list:
curl "https://archive.org/services/users/@ldodds/lists/1" | jq -r '.value.members[].identifier' > itemlist
That command pipes the output of curl into jq, then asks jq to extract the identifier for every member of the list and output the raw values. This produces a list of identifiers without quotes.
The list is then saved as a file called itemlist.
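Before handing the file to the download tool, it can be worth a quick sanity check. The sketch below is my own addition, not part of the ia tool: check_itemlist is a hypothetical helper that rejects an empty file, blank lines, quoted entries, and the literal string null, which jq -r prints when a member has no identifier field.

```shell
# Hypothetical helper: sanity-check an item list before downloading.
# Returns non-zero if the file is missing or empty, or if any line is
# blank, contains a double quote, or is the literal string "null".
check_itemlist() {
    file="$1"
    [ -s "$file" ] || { echo "$file is missing or empty" >&2; return 1; }
    if grep -q -e '^$' -e '"' -e '^null$' "$file"; then
        echo "$file contains blank, quoted or null entries" >&2
        return 1
    fi
    # wc -l pads its output on some systems, so strip the spaces.
    echo "$(wc -l < "$file" | tr -d ' ') identifiers look OK"
}
```

Running check_itemlist itemlist should then report how many identifiers were found, or complain if something looks off.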
The ia command-line tool can take a file containing a list of item identifiers, one per line, and download them all. The tool requires the identifiers to be unquoted, which is why we used the raw output from jq in the previous step.
I used this command to grab the PDF files associated with each magazine:
ia download --itemlist=itemlist --glob="*.pdf"
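Before committing to a long download, it can be worth previewing what will be fetched. The sketch below relies on the --dry-run option of ia download, which prints the file URLs instead of downloading them. The function name preview_pdfs and the IA_CMD variable are my own additions; IA_CMD only exists so the command can be swapped for a stub when testing.

```shell
# Hypothetical helper: list the PDF URLs that would be downloaded.
# IA_CMD is an assumed override hook; by default the real ia tool is used.
preview_pdfs() {
    "${IA_CMD:-ia}" download --itemlist="$1" --glob="*.pdf" --dry-run
}

# Usage: preview_pdfs itemlist | wc -l    # count the matching files
```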
You’ll then need to wait patiently for your download to complete. The IA doesn’t have endless bandwidth. You can donate to them at https://archive.org/donate.
The files for each item will be stored in their own sub-directory, so you can then do whatever you want with them. Like make gifs.
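To check what actually arrived, something like the following can summarise the PDFs per item. It's a rough sketch that assumes the one-sub-directory-per-item layout described above; count_pdfs is a hypothetical helper name, and it only looks for lowercase .pdf files at the top level of each item directory.

```shell
# Hypothetical helper: print each item directory with its PDF count.
# Assumes one sub-directory per item under the given directory
# (defaults to the current directory).
count_pdfs() {
    for dir in "${1:-.}"/*/; do
        [ -d "$dir" ] || continue   # skip if the glob matched nothing
        n=$(find "$dir" -maxdepth 1 -type f -name '*.pdf' | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$(basename "$dir")" "$n"
    done
}
```

An item that shows a count of 0 is a hint that its PDF either failed to download or isn't published under that format.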