Converting Addresses to Coordinates (01_09)

Your Turn

Open the 01_09 project. In the your-turn.R file:

Convert the addresses in the “International Addresses” worksheet into coordinates and visualise them.

  • Import the “International Addresses” worksheet

  • Re-use the mutate(across(...)) code to remove NA values

  • Combine address_line_1, address_line_2 and address_line_3 into full_address

  • Use geocode() to convert the addresses into coordinates

  • Visualise the locations with mapview
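
The steps above could be sketched roughly as follows. This is not the course solution: the workbook path is hypothetical, the mutate(across(...)) step is an assumed form that swaps NA for an empty string, and combine_address() is a helper invented here for illustration. It also assumes a LocationIQ key is stored in the LOCATIONIQ_API_KEY environment variable, which is where tidygeocoder looks for it.

```r
# Hypothetical helper: paste address parts together row by row,
# skipping NA and empty parts so no stray separators are left behind.
combine_address <- function(...) {
  parts <- cbind(...)
  apply(parts, 1, function(row) {
    paste(row[!is.na(row) & row != ""], collapse = ", ")
  })
}

# Sketch of the whole exercise. Nothing runs until the function is called,
# so readxl, dplyr, tidyr, tidygeocoder, sf and mapview are only needed then.
geocode_worksheet <- function(path = "data/addresses.xlsx") { # hypothetical path
  addresses <- readxl::read_excel(path, sheet = "International Addresses")

  addresses <- dplyr::mutate(
    addresses,
    # Assumed form of the mutate(across(...)) step: NA becomes ""
    dplyr::across(
      dplyr::starts_with("address_line"),
      ~ tidyr::replace_na(as.character(.x), "")
    ),
    full_address = combine_address(address_line_1, address_line_2, address_line_3)
  )

  # geocode() adds lat and long columns; method = "iq" needs an API key
  located <- tidygeocoder::geocode(addresses, address = full_address, method = "iq")

  # Drop rows the service could not locate, then convert to points for mapview
  located <- dplyr::filter(located, !is.na(lat), !is.na(long))
  points <- sf::st_as_sf(located, coords = c("long", "lat"), crs = 4326)
  mapview::mapview(points)
}
```

Calling geocode_worksheet() with the real path to the project's workbook should open an interactive map of whichever addresses were found.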

Learn More

This video introduces two packages for forward geocoding, which I’d like to summarise again here:

ggmap uses the Google Maps Geocoding API, which is extremely well regarded and considered an industry standard. However, to register for the API you must provide your credit card details to Google. As of late 2020, users are provided with $200 of free credit per month, but this could change in the future. Instructions for setting up your Google Cloud Platform account can be found on the ggmap package website.
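
Once a key has been created, using it from R might look something like the sketch below. The GOOGLE_MAPS_API_KEY environment variable name and the geocode_with_google() wrapper are assumptions made for this example, as is the full_address column; register_google() and mutate_geocode() are the ggmap functions doing the work.

```r
# Sketch only: wraps ggmap so the key check can run without calling Google.
geocode_with_google <- function(addresses,
                                key = Sys.getenv("GOOGLE_MAPS_API_KEY")) {
  # Fail early with a clear message if no key is available
  if (!nzchar(key)) {
    stop("No Google API key found; set GOOGLE_MAPS_API_KEY first.")
  }

  # Tell ggmap which key to use for this session
  ggmap::register_google(key = key)

  # mutate_geocode() appends lon and lat columns for the address column
  ggmap::mutate_geocode(addresses, location = full_address)
}
```

Keeping the key in an environment variable rather than in the script means it does not end up in shared code.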

tidygeocoder provides access to several geocoding APIs via the method argument, including:

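  • “census” connects to the free-to-use US Census Geocoder, but is only useful for addresses in the US. It was the default in older versions of tidygeocoder; newer versions default to “osm”, which uses the free Nominatim service.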

  • “geocodio” uses the commercial geocoding service Geocodio, which covers only the US and Canada. A free tier is available and you do not need to provide your billing details for this type of account.

  • “iq” uses the international commercial geocoding service LocationIQ. A free tier is available and you do not need to provide your billing details for this type of account. 
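
To make the coverage differences concrete, here is a small sketch that picks a method from the countries in a dataset. choose_method() and geocode_addresses() are hypothetical helpers, and the two-letter country codes and full_address column are assumptions; only the method names and geocode() itself come from tidygeocoder.

```r
# Hypothetical helper: choose a tidygeocoder method based on the
# coverage described in the list above.
choose_method <- function(country) {
  country <- country[!is.na(country)]
  if (length(country) == 0) {
    "iq"        # nothing known about the countries: use the international service
  } else if (all(country == "US")) {
    "census"    # free, but US addresses only
  } else if (all(country %in% c("US", "CA"))) {
    "geocodio"  # US and Canada; reads GEOCODIO_API_KEY
  } else {
    "iq"        # international; reads LOCATIONIQ_API_KEY
  }
}

# Sketch: geocode a full_address column with the chosen method.
# Not called here, so tidygeocoder is only needed when you run it.
geocode_addresses <- function(addresses, country_codes) {
  tidygeocoder::geocode(
    addresses,
    address = full_address,
    method = choose_method(country_codes)
  )
}
```

For the international addresses in this lesson, choose_method() would fall through to "iq".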

It’s important to remember that there is (unfortunately) no international standard for formatting street addresses. In general, the more specific a query, the more likely an API is to return an accurate set of coordinates.

Have any questions? Put them below and we will help you out!

Kristina Bratkova

March 2, 2022

Hello, I was wondering whether the order of arguments in the geocode() function matters (i.e. whether we specify street = street_address, city = city, postalcode = postal_code in this order or a different one). I was also wondering whether adding columns with many NAs to geocode(), such as state = region, may prevent it from finding the addresses. For example, the iq method found all 11 addresses each time. Without the method specified, geocode() found only 4 addresses with the state = region argument (plus all the other columns as in the example), but 9 addresses without it. Perhaps this question is related to the first one?

David Keyes

March 2, 2022

Good questions! The order of arguments should not matter. On the second question, I'm not sure, but Charlie will chime in here soon with more info.

Charlie Hadley

March 3, 2022

Hi Kristina! I can definitely understand asking these questions! In R, functions don't care about the order in which named arguments are given, which means that you can supply the street etc. arguments to geocode() in any order. When you don't supply the method argument it defaults to "osm", which uses the free Nominatim service. I've found it to be very sensitive to NA values, and in general very fussy. I'd recommend using method = "iq" by default as it's much more reliable. If I understand you correctly, when you did specify this method the addresses were geolocated with or without the NA values?
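
As a small illustration of the point about named arguments, the sketch below first shows a toy function giving the same result whichever order its named arguments arrive in, then a geocode() call with the arguments deliberately shuffled. describe_place() and geocode_components() are made-up helpers; the column names are the ones from the question above.

```r
# Toy function: named arguments are matched by name, not by position
describe_place <- function(street, city, postalcode) {
  paste(street, city, postalcode, sep = ", ")
}

in_order <- describe_place(
  street = "1 High St", city = "Oxford", postalcode = "OX1 4AA"
)
shuffled <- describe_place(
  postalcode = "OX1 4AA", city = "Oxford", street = "1 High St"
)
identical(in_order, shuffled)

# The same holds for geocode(): this ordering behaves like any other.
# Not called here; method = "iq" needs LOCATIONIQ_API_KEY to be set.
geocode_components <- function(addresses, method = "iq") {
  tidygeocoder::geocode(
    addresses,
    postalcode = postal_code,
    city = city,
    street = street_address,
    method = method
  )
}
```

Passing method explicitly also avoids depending on whatever the default happens to be in the installed version of tidygeocoder.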

Kristina Bratkova

March 4, 2022

Hi David and Charlie,

Many thanks for your replies. Charlie, yes, correct: the 'osm' (default) method found 9 addresses out of 11, but only 4 with the state = region argument (as it is fussy about NAs, as you explain). The 'iq' method found all 11 in both instances.

Great course so far!

Are there regional differences or any other settings that would influence which values are returned for the fragmented address information we're starting with? Following this code exactly never found coordinates for the American University in Cairo, for example, and the map I've produced seems to think the Harbin University of Science and Technology is in Italy!

In the instructions for the "Your Turn" section, it mentions combining three lines of the address into full_address, and using mutate(across(...)) to remove NA values, but these steps don't seem to be mentioned in the solutions video or the solutions code chunk. Did I miss something there?

Thanks!! The course is awesome, I'm having a lot of fun :)

Charlie Hadley

February 13, 2023

Ah - that's a bit of a confusing description! It is possible to combine columns into one and give that all to the address argument of geocode() but it's more reliable to use the named arguments as shown in the solutions video. Thanks for bringing that to my attention!

There are regional differences in how the geocoding services will work! They try and be clever based off your own GeoIP, though I've no idea why they'd place Harbin Uni in Italy. This is a bit like Google Maps leading people down roads that go to cliff edges.

Glad you're enjoying the course!

ok, thanks Charlie! good to know