09-Projecting-Data
import os
%load_ext sql
Connecting to database¶
host = "localhost"
database = "nyc"
user = os.getenv("SQL_USER")
password = os.getenv("SQL_PASSWORD")
connection_string = f"postgresql://{user}:{password}@{host}/{database}"
%sql $connection_string
'Connected: postgres@nyc'
%%sql
SELECT * from nyc_subway_stations LIMIT 5
* postgresql://postgres:***@localhost/nyc 5 rows affected.
| id | geom | objectid | name | alt_name | cross_st | long_name | label | borough | nghbhd | routes | transfers | color | express | closed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 376 | 010100002026690000371775B5C3CE2141CBD2347771315141 | 1 | Cortlandt St | None | Church St | Cortlandt St (R,W) Manhattan | Cortlandt St (R,W) | Manhattan | None | R,W | R,W | YELLOW | None | None |
| 2 | 010100002026690000CBE327F938CD21415EDBE1572D315141 | 2 | Rector St | None | None | Rector St (1) Manhattan | Rector St (1) | Manhattan | None | 1 | 1 | RED | None | None |
| 1 | 010100002026690000C676635D10CD2141A0ECDB6975305141 | 3 | South Ferry | None | None | South Ferry (1) Manhattan | South Ferry (1) | Manhattan | None | 1 | 1 | RED | None | None |
| 125 | 010100002026690000F4CF3E3654032241B5704681A73C5141 | 4 | 138th St | Grand Concourse | Grand Concourse | 138th St / Grand Concourse (4,5) Bronx | 138th St / Grand Concourse (4,5) | Bronx | None | 4,5 | 4,5 | GREEN | None | None |
| 126 | 01010000202669000084DADF7AED0422410C380E6E3A3D5141 | 5 | 149th St | Grand Concourse | Grand Concourse | 149th St / Grand Concourse (4) Bronx | 149th St / Grand Concourse (4) | Bronx | None | 4 | 2,4,5 | GREEN | express | None |
Check SRID¶
- We’ve already encountered projections when we loaded our
nycdata. (Recall thatpesky SRID 26918).- Sometimes, however, you need to transform and re-project between spatial reference systems.
- PostGIS includes built-in support for changing the projection of data, using the
ST_Transform(geometry, srid)function. - For managing the spatial reference identifiers on geometries, PostGIS provides the
ST_SRID(geometry)andST_SetSRID(geometry, srid)functions.
We can confirm the SRID of our data with the ST_SRID function:
%%sql
SELECT ST_SRID(geom) FROM nyc_streets LIMIT 1;
* postgresql://postgres:***@localhost/nyc 1 rows affected.
| st_srid |
|---|
| 26918 |
And what is definition of “26918”?
- As we saw in loading data section, the definition is contained in the
spatial_ref_systable. In fact, two definitions are there.- The
“well-known text” (WKT)definition is in thesrtextcolumn, - and there is a second definition in “
proj.4” format in theproj4textcolumn.
- The
%%sql
SELECT * FROM spatial_ref_sys
WHERE srid = 26918
* postgresql://postgres:***@localhost/nyc 1 rows affected.
| srid | auth_name | auth_srid | srtext | proj4text |
|---|---|---|---|---|
| 26918 | EPSG | 26918 | PROJCS["NAD83 / UTM zone 18N",GEOGCS["NAD83",DATUM["North_American_Datum_1983",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],TOWGS84[0,0,0,0,0,0,0],AUTHORITY["EPSG","6269"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4269"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-75],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","26918"]] | +proj=utm +zone=18 +datum=NAD83 +units=m +no_defs |
In fact, for the internal PostGIS re-projection calculations, it is the contents of the proj4text column that are used. For our 26918 projection, here is the proj.4 text:
%%sql
SELECT proj4text FROM spatial_ref_sys WHERE srid = 26918
* postgresql://postgres:***@localhost/nyc 1 rows affected.
| proj4text |
|---|
| +proj=utm +zone=18 +datum=NAD83 +units=m +no_defs |
In practice, both the srtext and the proj4text columns are important:
- the
srtextcolumn is used by external programs likeGeoServer,QGIS, andFMEand others; - the
proj4textcolumn is used internally.
Comparing Data¶
Taken together, a coordinate and an SRID define a location on the globe. Without an SRID, a coordinate is just an abstract notion.
- A “Cartesian” coordinate plane is defined as a “flat” coordinate system placed on the surface of Earth.
- Because PostGIS functions work on such a plane, comparison operations require that both geometries be represented in the same
SRID.
If you feed in geometries with differing SRIDs you will just get an error:
%%sql
SELECT ST_Equals(
ST_GeomFromText('POINT(0 0)', 4326),
ST_GeomFromText('POINT(0 0)', 26918)
)
* postgresql://postgres:***@localhost/nyc
--------------------------------------------------------------------------- InternalError_ Traceback (most recent call last) File d:\Anaconda3\envs\mybase\Lib\site-packages\sqlalchemy\engine\base.py:1705, in Connection._execute_context(self, dialect, constructor, statement, parameters, execution_options, *args, **kw) 1704 if not evt_handled: -> 1705 self.dialect.do_execute( 1706 cursor, statement, parameters, context 1707 ) 1709 if self._has_events or self.engine._has_events: File d:\Anaconda3\envs\mybase\Lib\site-packages\sqlalchemy\engine\default.py:681, in DefaultDialect.do_execute(self, cursor, statement, parameters, context) 680 def do_execute(self, cursor, statement, parameters, context=None): --> 681 cursor.execute(statement, parameters) InternalError_: ST_Equals: Operation on mixed SRID geometries (Point, 4326) != (Point, 26918) The above exception was the direct cause of the following exception: InternalError Traceback (most recent call last) Cell In[8], line 1 ----> 1 get_ipython().run_cell_magic('sql', '', "\nSELECT ST_Equals(\n ST_GeomFromText('POINT(0 0)', 4326),\n ST_GeomFromText('POINT(0 0)', 26918)\n )\n") File d:\Anaconda3\envs\mybase\Lib\site-packages\IPython\core\interactiveshell.py:2478, in InteractiveShell.run_cell_magic(self, magic_name, line, cell) 2476 with self.builtin_trap: 2477 args = (magic_arg_s, cell) -> 2478 result = fn(*args, **kwargs) 2480 # The code below prevents the output from being displayed 2481 # when using magics with decodator @output_can_be_silenced 2482 # when the last Python token in the expression is a ';'. 2483 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False): File d:\Anaconda3\envs\mybase\Lib\site-packages\sql\magic.py:95, in SqlMagic.execute(self, line, cell, local_ns) 92 return self._persist_dataframe(parsed['sql'], conn, user_ns) 94 try: ---> 95 result = sql.run.run(conn, parsed['sql'], self, user_ns) 97 if result is not None and not isinstance(result, str) and self.column_local_vars: 98 #Instead of returning values, set variables directly in the 99 #users namespace. Variable names given by column names 101 if self.autopandas: File d:\Anaconda3\envs\mybase\Lib\site-packages\sql\run.py:340, in run(conn, sql, config, user_namespace) 338 else: 339 txt = sqlalchemy.sql.text(statement) --> 340 result = conn.session.execute(txt, user_namespace) 341 _commit(conn=conn, config=config) 342 if result and config.feedback: File d:\Anaconda3\envs\mybase\Lib\site-packages\sqlalchemy\engine\base.py:1200, in Connection.execute(self, statement, *multiparams, **params) 1196 util.raise_( 1197 exc.ObjectNotExecutableError(statement), replace_context=err 1198 ) 1199 else: -> 1200 return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS) File d:\Anaconda3\envs\mybase\Lib\site-packages\sqlalchemy\sql\elements.py:313, in ClauseElement._execute_on_connection(self, connection, multiparams, params, execution_options) 309 def _execute_on_connection( 310 self, connection, multiparams, params, execution_options 311 ): 312 if self.supports_execution: --> 313 return connection._execute_clauseelement( 314 self, multiparams, params, execution_options 315 ) 316 else: 317 raise exc.ObjectNotExecutableError(self) File d:\Anaconda3\envs\mybase\Lib\site-packages\sqlalchemy\engine\base.py:1389, in Connection._execute_clauseelement(self, elem, multiparams, params, execution_options) 1377 compiled_cache = execution_options.get( 1378 "compiled_cache", self.engine._compiled_cache 1379 ) 1381 compiled_sql, extracted_params, cache_hit = elem._compile_w_cache( 1382 dialect=dialect, 1383 compiled_cache=compiled_cache, (...) 1387 linting=self.dialect.compiler_linting | compiler.WARN_LINTING, 1388 ) -> 1389 ret = self._execute_context( 1390 dialect, 1391 dialect.execution_ctx_cls._init_compiled, 1392 compiled_sql, 1393 distilled_params, 1394 execution_options, 1395 compiled_sql, 1396 distilled_params, 1397 elem, 1398 extracted_params, 1399 cache_hit=cache_hit, 1400 ) 1401 if has_events: 1402 self.dispatch.after_execute( 1403 self, 1404 elem, (...) 1408 ret, 1409 ) File d:\Anaconda3\envs\mybase\Lib\site-packages\sqlalchemy\engine\base.py:1748, in Connection._execute_context(self, dialect, constructor, statement, parameters, execution_options, *args, **kw) 1745 branched.close() 1747 except BaseException as e: -> 1748 self._handle_dbapi_exception( 1749 e, statement, parameters, cursor, context 1750 ) 1752 return result File d:\Anaconda3\envs\mybase\Lib\site-packages\sqlalchemy\engine\base.py:1929, in Connection._handle_dbapi_exception(self, e, statement, parameters, cursor, context) 1927 util.raise_(newraise, with_traceback=exc_info[2], from_=e) 1928 elif should_wrap: -> 1929 util.raise_( 1930 sqlalchemy_exception, with_traceback=exc_info[2], from_=e 1931 ) 1932 else: 1933 util.raise_(exc_info[1], with_traceback=exc_info[2]) File d:\Anaconda3\envs\mybase\Lib\site-packages\sqlalchemy\util\compat.py:211, in raise_(***failed resolving arguments***) 208 exception.__cause__ = replace_context 210 try: --> 211 raise exception 212 finally: 213 # credit to 214 # https://cosmicpercolator.com/2016/01/13/exception-leaks-in-python-2-and-3/ 215 # as the __traceback__ object creates a cycle 216 del exception, replace_context, from_, with_traceback File d:\Anaconda3\envs\mybase\Lib\site-packages\sqlalchemy\engine\base.py:1705, in Connection._execute_context(self, dialect, constructor, statement, parameters, execution_options, *args, **kw) 1703 break 1704 if not evt_handled: -> 1705 self.dialect.do_execute( 1706 cursor, statement, parameters, context 1707 ) 1709 if self._has_events or self.engine._has_events: 1710 self.dispatch.after_cursor_execute( 1711 self, 1712 cursor, (...) 1716 context.executemany, 1717 ) File d:\Anaconda3\envs\mybase\Lib\site-packages\sqlalchemy\engine\default.py:681, in DefaultDialect.do_execute(self, cursor, statement, parameters, context) 680 def do_execute(self, cursor, statement, parameters, context=None): --> 681 cursor.execute(statement, parameters) InternalError: (psycopg2.errors.InternalError_) ST_Equals: Operation on mixed SRID geometries (Point, 4326) != (Point, 26918) [SQL: SELECT ST_Equals( ST_GeomFromText('POINT(0 0)', 4326), ST_GeomFromText('POINT(0 0)', 26918) )] (Background on this error at: http://sqlalche.me/e/14/2j85)
Be careful of getting too happy with using ST_Transform for on-the-fly conversion.
- Spatial indexes are built using
SRIDof the stored geometries.- If comparison are done in a different
SRID, spatial indexes are (often) not used. It is best practice to choose oneSRIDfor all the tables in your database. - Only use the transformation function when you are reading or writing data to external applications.
- If comparison are done in a different
Transforming Data¶
If we return to our proj4 definition for SRID 26918, we can see that our working projection is UTM (Universal Transverse Mercator) of zone 18, with meters as the unit of measurement.
+proj=utm +zone=18 +ellps=GRS80 +datum=NAD83 +units=m +no_defs
Let’s convert some data from our working projection to geographic coordinates – also known as “longitude/latitude”.
To convert data from one SRID to another
- you must first verify that your geometry has a valid
SRID. - Since we have already confirmed a valid
SRID, we next need theSRIDof the projection to transform into. In other words, what is theSRIDof geographic coordinates?
The most common SRID for geographic coordinates is 4326, which corresponds to “longitude/latitude on the WGS84 spheroid”. You can see the definition at the spatialreference.org site:
You can also pull the definitions from the spatial_ref_sys table:
%%sql
SELECT srtext FROM spatial_ref_sys
WHERE srid = 4326;
* postgresql://postgres:***@localhost/nyc 1 rows affected.
| srtext |
|---|
| GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]] |
Let’s convert the coordinates of the ‘Broad St’ subway station into geographics:
%%sql
SELECT ST_AsText(geom)
FROM nyc_subway_stations
WHERE name = 'Broad St';
* postgresql://postgres:***@localhost/nyc 1 rows affected.
| st_astext |
|---|
| POINT(583571.9059213118 4506714.341192182) |
%%sql
SELECT ST_AsText(ST_Transform(geom,4326))
FROM nyc_subway_stations
WHERE name = 'Broad St';
* postgresql://postgres:***@localhost/nyc 1 rows affected.
| st_astext |
|---|
| POINT(-74.01067146887341 40.70710481558761) |
If you load data or create a new geometry without specifying an SRID, the SRID value will be 0.
- Recall in geometries, that when we created our geometries table we didn’t specify an
SRID. - If we query our database, we should expect all the
nyc_tables to have anSRIDof26918 - while the geometries table defaulted to an
SRIDof0.
To view a table’s SRID assignment, query the database’s geometry_columns table.
%%sql
SELECT f_table_name AS name, srid
FROM geometry_columns;
* postgresql://postgres:***@localhost/nyc 6 rows affected.
| name | srid |
|---|---|
| nyc_subway_stations | 26918 |
| nyc_neighborhoods | 26918 |
| nyc_census_blocks | 26918 |
| nyc_homicides | 26918 |
| nyc_streets | 26918 |
| geometries | 0 |
However, if you know what the SRID of the coordinates is supposed to be, you can set it post-facto, using ST_SetSRID on the geometry. Then you will be able to transform the geometry into other systems.
%%sql
SELECT ST_AsText(
ST_Transform(
ST_SetSRID(geom,26918),
4326)
)
FROM geometries;
* postgresql://postgres:***@localhost/nyc 5 rows affected.
| st_astext |
|---|
| POINT(-79.48874388438705 0) |
| LINESTRING(-79.48874388438705 0,-79.48873492539037 0.000009019375921,-79.48872596639353 0.000009019376033,-79.48872596639369 0.000018038752065) |
| POLYGON((-79.48874388438705 0,-79.4887349253903 0,-79.48873492539037 0.000009019375921,-79.4887438843871 0.00000901937581,-79.48874388438705 0)) |
| POLYGON((-79.48874388438705 0,-79.48865429441472 0,-79.48865429442024 0.000090193769243,-79.48874388439259 0.000090193758097,-79.48874388438705 0),(-79.48873492539037 0.000009019375921,-79.48873492539053 0.000018038751842,-79.48872596639369 0.000018038752065,-79.48872596639353 0.000009019376033,-79.48873492539037 0.000009019375921)) |
| GEOMETRYCOLLECTION(POINT(-79.48872596639346 0),POLYGON((-79.48874388438705 0,-79.4887349253903 0,-79.48873492539037 0.000009019375921,-79.4887438843871 0.00000901937581,-79.48874388438705 0))) |
Function List¶
ST_AsText: Returns the Well-Known Text (WKT) representation of the geometry/geography without SRID metadata.ST_SetSRID(geometry, srid): Sets the SRID on a geometry to a particular integer value.ST_SRID(geometry): Returns the spatial reference identifier for the ST_Geometry as defined in spatial_ref_sys table.ST_Transform(geometry, srid): Returns a new geometry with its coordinates transformed to the SRID referenced by the integer parameter.