notes by alifeee profile picture tagged jq (4) rss

return to notes / blog / website / weeknotes / linktree

here I may post some short, text-only notes, mostly about programming. source code.

tags: all (46) scripting (15) linux (5) bash (4) geojson (4) jq (4) obsidian (4) android (3) github (3) html (3) ............ see all (+67)

taking a small bunch of census data from FindMyPast # prev single next top

tags: jq, scripting, web-scraping, census, data • 507 'words', 152 secs @ 200wpm

The 1939 Register was an basically-census taken in 1939. On the National Archives Page, it says that it is entirely available online.

However, further down, it lists how to access it, which says:

You can search for and view open records on our partner site Findmypast.co.uk (charges apply). A version of the 1939 Register is also available at Ancestry.co.uk (charges apply), and transcriptions without images are on MyHeritage.com (charges apply). It is free to search for these records, but there is a charge to view full transcriptions and download images of documents. Please note that you can view these records online free of charge in the reading rooms at The National Archives in Kew.

So… charges apply.

Anyway, for a while in April 2025 (until May 8th), FindMyPast is giving free access to the 1939 data.

Of course, family history is hard, and what's much easier is "who lived in my house in 1939". For that you can use:

I created an account with a bogus email address (you're not collecting THIS guy's data) and took a look around at some houses.

Then, I figured I could export my entire street, so I did.

The code and more context is in a GitHub Repository, but in brief, I:

Now it looks like:

AddressStreet,Address,Inhabited,LatLon,FirstName,LastName,BirthDate,ApproxAge,OccupationText,Gender,MaritalStatus,Relationship,Schedule,ScheduleSubNumber,Id
Khartoum Road,"1 Khartoum Road, Sheffield",Y,"53.3701,-1.4943",Constance A,Latch,31 Aug 1904,35,Manageress Restaurant & Canteen,Female,Married,Unknown,172,3,TNA/R39/3506/3506E/003/17
Khartoum Road,"4 Khartoum Road, Sheffield",Y,"53.3701,-1.4943",Catherine,Power,? Feb 1897,42,Music Hall Artists,Female,Married,Unknown,171,8,TNA/R39/3506/3506D/015/37
Khartoum Road,"4 Khartoum Road, Sheffield",Y,"53.3701,-1.4943",Charles F R,Kirby,? Nov 1886,53,Newsagent Canvasser,Male,Married,Head,172,1,TNA/R39/3506/3506D/015/39
Khartoum Road,"4 Khartoum Road, Sheffield",Y,"53.3701,-1.4943",Constance A,Latch,31 Aug 1912,27,Manageress Restairant & Cante,Female,Married,Unknown,172,3,TNA/R39/3506/3506D/015/41

Neat!

Some of my favourite jobs in the sheet of streets I collected are:

back to top

combining geojson files with jq # prev single next top

tags: geojson, jq, scripting • 520 'words', 156 secs @ 200wpm

I'm writing a blog about hitchhiking, which involves a load os .geojson files, which look a bit like this:

The .geojson files are generated from .gpx traces that I exported from OSRM's (Open Source Routing Machine) demo (which, at time of writing, seems to be offline, but I believe it's on https://map.project-osrm.org/), one of the routing engines on OpenStreetMap.

I put in a start and end point, exported the .gpx trace, and then converted it to .geojson with, e.g., ogr2ogr "2.1 Tamworth -> Tibshelf Northbound.geojson" "2.1 Tamworth -> Tibshelf Northbound.gpx" tracks, where ogr2ogr is a command-line tool from sudo apt install gdal-bin which converts geographic data between many formats (I like it a lot, it feels nicer than searching the web for "errr, kml to gpx converter?"). I also then semi-manually added some properties (see how).

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {
        "label": "2.1 Tamworth -> Tibshelf Northbound",
        "trip": "2"
      },
      "geometry": {
        "type": "MultiLineString",
        "coordinates": [
          [
            [-1.64045, 52.60606]
            [-1.64067, 52.6058],
            [-1.64069, 52.60579],
            ...
          ]
        ]
      }
    }
  ]
}

I then had a load of files that looked a bit like

$ tree -f geojson/
geojson
├── geojson/1.1 Tamworth -> Woodall Northbound.geojson
├── geojson/1.2 Woodall Northbound -> Hull.geojson
├── geojson/2.1 Tamworth -> Tibshelf Northbound.geojson
├── geojson/2.2 Tibshelf Northbound -> Leeds.geojson
├── geojson/3.1 Frankley Northbound -> Hilton Northbound.geojson
├── geojson/3.2 Hilton Northbound -> Keele Northbound.geojson
└── geojson/3.3 Keele Northbound -> Liverpool.geojson

Originally, I was combining them into one .geojson file using https://github.com/mapbox/geojson-merge, which as a binary to merge .geojson files, but I decided to use jq because I wanted to do something a bit more complex, which was to create a structure like

FeatureCollection
  Features:
    FeatureCollection
      Features (1.1 Tamworth -> Woodall Northbound, 1.2 Woodall Northbound -> Hull)
    FeatureCollection
      Features (2.1 Tamworth -> Tibshelf Northbound, 2.2 Tibshelf Northbound -> Leeds)
    FeatureCollection
      Features (3.1 Frankley Northbound -> Hilton Northbound, 3.2 Hilton Northbound -> Keele Northbound, 3.3 Keele Northbound -> Liverpool)

I spent a while making a quite-complicated jq query, using variables (an "advanced feature"!) and a reduce statement, but when I completed it, I found out that the above structure is not valid .geojson, so I went back to just having:

FeatureCollection
  Features (1.1 Tamworth -> Woodall Northbound, 1.2 Woodall Northbound -> Hull, 2.1 Tamworth -> Tibshelf Northbound, 2.2 Tibshelf Northbound -> Leeds, 3.1 Frankley Northbound -> Hilton Northbound, 3.2 Hilton Northbound -> Keele Northbound, 3.3 Keele Northbound -> Liverpool)

...which is... a lot simpler to make.

A query which combines the files above is (the sort exists to sort the files so they are in numerical order downwards in the resulting .geojson):

while read file; do cat "${file}"; done <<< $(find geojson/ -type f | sort -t / -k 2 -n) | jq --slurp '{
    "type": "FeatureCollection",
    "name": "hitchhikes",
    "features": ([.[] | .features[0]])
}' > hitching.geojson

While geojson-merge was cool, it feels nice to have a more "raw" command to do what I want.

back to top

turning a list of geojson points into a list of lines between the points # prev single next top

tags: geojson, scripting, jq • 454 'words', 136 secs @ 200wpm

I often turn lists of coordinates into a geojson file, so they can be easily shared and viewed on a map. See several examples on https://alifeee.co.uk/maps/.

One thing I wanted to do recently was turn a list of points ("places I've been") into a list of straight lines connecting them, to show routes on a map. I made a script using jq to do this, using the same data from my note about making a geojson file from a CSV.

Effectively, I want to turn these coordinates...

latitude,longitude,description,best part
53.74402,-0.34753,Hull,smallest window in the UK
54.779764,-1.581559,Durham,great cathedral
52.47771,-1.89930,Birmingham,best board game café
53.37827,-1.46230,Sheffield,5 rivers!!!

...into...

from latitude,from longitude,to latitude,to longitude,route
53.74402,-0.34753,54.779764,-1.581559,Hull --> Durham
54.779764,-1.581559,52.47771,-1.89930,Durham --> Birmingham
52.47771,-1.89930,53.37827,-1.46230,Birmingham --> Sheffield

...but in a .geojson format, so I can view them on a map. Since this turns N items into N - 1 items, it sounds like it's time for a reduce (I like using map, filter, and reduce a lot. They're very satisfying. Some would say I should get [more] into Functional Programming).

So, the jq script to "combine" coordinates is: (hopefully you can vaguely see which bits of it do what)

# one-liner
cat places.geojson | jq '.features |= ( reduce .[] as $item ([]; .[-1] as $last | ( if $last == null then [$item | .geometry.coordinates |= [[], .]] else [.[], ($item | (.geometry.coordinates |= [($last | .geometry.coordinates[1]), .]) | (.properties.route=(($last | .properties.description) + " --> " + .properties.description)) | .geometry.type="LineString")] end)) | .[1:])'

# expanded
cat places.geojson | jq '
.features |= (
  reduce .[] as $item (
    [];
    .[-1] as $last
    | (
      if $last == null then
        [$item | .geometry.coordinates |= [[], .]]
      else
        [
          .[],
          (
            $item
            | (.geometry.coordinates |=[
                  ($last | .geometry.coordinates[1]),
                  .
                ]
              )
            | (.properties.route=(
                  ($last | .properties.description)
                  + " --> "
                  + .properties.description
                )
              )
            | .geometry.type="LineString"
          )
        ]
      end
      )
    )
  | .[1:]
)
'

This turns a .geojson file like...

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {
        "description": "Hull",
        "best part": "smallest window in the UK"
      },
      "geometry": {
        "type": "Point",
        "coordinates": [
          -0.34753,
          53.74402
        ]
      }
    },
    {
      "type": "Feature",
      "properties": {
        "description": "Durham",
        "best part": "great cathedral"
      },
      "geometry": {
        "type": "Point",
        "coordinates": [
          -1.581559,
          54.779764
        ]
      }
    },
    ...
  ]
}

...into a file like...

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {
        "description": "Durham",
        "best part": "great cathedral",
        "route": "Hull --> Durham"
      },
      "geometry": {
        "type": "LineString",
        "coordinates": [
          [
            -0.34753,
            53.74402
          ],
          [
            -1.581559,
            54.779764
          ]
        ]
      }
    },
    ...
  ]
}

As with the previous post, making this script took a lot of reading man jq (very well-written) in my terminal, and a lot of searching "how to do X in jq".

back to top

making a geojson file from a csv # prev single next top

tags: geojson, scripting, jq • 508 'words', 152 secs @ 200wpm

I like maps. I make a lot of maps. See some on https://alifeee.co.uk/maps/.

I've gotten into a habit with map-making: my favourite format is geojson, and I've found some tools to help me screw around with it, namely https://github.com/pvernier/csv2geojson to create a .geojson file from a .csv, and https://geojson.io/ to quickly and nicely view the geojson. geojson.io can also export as KML (used to import into Google Maps).

In attempting to turn a .geojson file from a list of "Point"s to a list of "LineString"s using jq, I figured I could also generate the .geojson file myself using jq, instead of using the csv2geojson Go program above. This is my (successful) attempt:

First, create a CSV file places.csv with coordinates (latitude and longitude columns) and other information. There are many ways to find coordinates; one is to use https://www.openstreetmap.org/, zoom into somewhere, and copy them from the URL. For example, some places I have lived:

latitude,longitude,description,best part
53.74402,-0.34753,Hull,smallest window in the UK
54.779764,-1.581559,Durham,great cathedral
52.47771,-1.89930,Birmingham,best board game café
53.37827,-1.46230,Sheffield,5 rivers!!!

Then, I spent a while (maybe an hour) crafting this jq script to turn that (or a similar CSV) into a geojson file. Perhaps you can vaguely see which parts of it do what.

# one line
cat places.csv | jq -Rn '(input|split(",")) as $headers | ($headers|[index("longitude"), index("latitude")]) as $indices | {"type": "FeatureCollection", "features": [inputs|split(",") | {"type": "Feature", "properties": ([($headers), .] | transpose | map( { "key": .[0], "value": .[1] } ) | from_entries | del(.latitude, .longitude)), "geometry": {"type": "Point", "coordinates": [(.[$indices[0]]|tonumber), (.[$indices[1]]|tonumber)]}}]}'

# over multiple lines
cat places.csv | jq -Rn '
(input|split(",")) as $headers
| ($headers|[index("longitude"), index("latitude")]) as $indices
| {
    "type": "FeatureCollection",
    "features": [
        inputs|split(",")
        | {
            "type": "Feature",
            "properties": ([($headers), .] | transpose | map( { "key": .[0], "value": .[1] } ) | from_entries | del(.latitude, .longitude)),
            "geometry": {"type": "Point", "coordinates": [(.[$indices[0]]|tonumber), (.[$indices[1]]|tonumber)]}
          }
    ]
  }
'

which makes a file like:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {
        "description": "Hull",
        "best part": "smallest window in the UK"
      },
      "geometry": {
        "type": "Point",
        "coordinates": [
          -0.34753,
          53.74402
        ]
      }
    },
    {
      "type": "Feature",
      "properties": {
        "description": "Durham",
        "best part": "great cathedral"
      },
      "geometry": {
        "type": "Point",
        "coordinates": [
          -1.581559,
          54.779764
        ]
      }
    },
    ...
  ]
}

...which I can then export into https://geojson.io/, or turn into another format with gdal (e.g., with ogr2ogr places.gpx places.geojson).

It's very satisfying for me to use jq. I will definitely be re-using this script in the future to make .geojson files, but as well re-using some of the jq techniques I learnt while making it.

Mostly for help I used man jq in my terminal, the .geojson proposal for the .geojson structure, and a lot of searching the web for "how to do X using jq".

back to top