I need a definition of Swain's Lane that I can compare to. I will start by drawing a route with my favourite GPS software. A route is just a sequence of route points. I can then export that as a GPX file.
<?xml version="1.0" encoding="UTF-8"?>
<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="ViewRanger - //www.viewranger.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd">
<rte>
<name><![CDATA[Swain's World]]></name>
<rtept lat="51.569613039632" lon="-0.14770468632509"></rtept>
<rtept lat="51.569407978151" lon="-0.14832964102552"></rtept>
<rtept lat="51.567090552402" lon="-0.14674177328872"></rtept>
<rtept lat="51.567080548869" lon="-0.14592101733016"></rtept>
<rtept lat="51.569618041121" lon="-0.14773419062425"></rtept>
</rte>
</gpx>
Geometries Table
I will load the GPX route into a table much as I did with the track files.
drop table my_geometries purge;
createg table my_geometries
(geom_id NUMBER NOT NULL
,descr VARCHAR2(64)
,gpx XMLTYPE
,geom mdsys.sdo_geometry
,geom_27700 mdsys.sdo_geometry
,mbr mdsys.sdo_geometry
,constraint my_geometries_pk PRIMARY KEY (geom_id)
)
XMLTYPE COLUMN gpx STORE AS SECUREFILE BINARY XML (CACHE DISABLE STORAGE IN ROW)
/
delete from my_geometries where geom_id = 2; INSERT INTO my_geometries (geom_id, descr, gpx) VALUES (2,'Swains World Route', XMLTYPE(strava_pkg.getClobDocument('STRAVA','swainsworldroute.gpx'))); UPDATE my_geometries SET geom = mdsys.sdo_geometry(2002,4326,null,mdsys.sdo_elem_info_array(1,2,1), cast(multiset( select CASE n.rn WHEN 1 THEN pt.lng WHEN 2 THEN pt.lat END ord from ( SELECT /*+MATERIALIZE*/ rownum rn , TO_NUMBER(EXTRACTVALUE(VALUE(t), 'rtept/@lon')) as lng , TO_NUMBER(EXTRACTVALUE(VALUE(t), 'rtept/@lat')) as lat FROM my_geometries g, TABLE(XMLSEQUENCE(extract(g.gpx,'/gpx/rte/rtept','xmlns="http://www.topografix.com/GPX/1/1"'))) t where g.geom_id = 2 ) pt, (select 1 rn from dual union all select 2 from dual) n order by pt.rn, n.rn ) AS mdsys.sdo_ordinate_array)) WHERE gpx IS NOT NULL AND geom IS NULL / UPDATE my_geometries SET mbr = sdo_geom.sdo_mbr(geom) , geom_27700 = sdo_cs.transform(geom,27700) / Commit; Set pages 99 lines 180 Select geom_id, descr, gpx, geom from my_geometries where geom_id = 2; GEOM_ID DESCR ---------- ---------------------------------------------------------------- GPX ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ GEOM(SDO_GTYPE, SDO_SRID, SDO_POINT(X, Y, Z), SDO_ELEM_INFO, SDO_ORDINATES) ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 2 Swains World Route <?xml version="1.0" encoding="US-ASCII"?> <gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="ViewRanger - //www.viewranger.com" xml SDO_GEOMETRY(2002, 4326, NULL, SDO_ELEM_INFO_ARRAY(1, 2, 1), SDO_ORDINATE_ARRAY(-.14651114, 51.5670769, -.14649237, 51.567298, -.1465782, 51.567563, -.14680618, 51.5680165, -.14697 516, 51.5682533, -.14754379, 51.5688701, -.14807219, 51.5694887))
I am going to build spatial indexes on the geometry columns, so I need to define the upper and lower bound values on the coordinates.
delete from user_sdo_geom_metadata where table_name = 'ACTIVITIES'; insert into user_sdo_geom_metadata (table_name,column_name,diminfo,srid) values ( 'ACTIVITIES' , 'GEOM_27700', sdo_dim_array( sdo_dim_element('Easting',-1000000,1500000,0.05), sdo_dim_element('Northing', -500000,2000000,0.05)), 27700); insert into user_sdo_geom_metadata (table_name,column_name,diminfo,srid) values ( 'ACTIVITIES' , 'GEOM', sdo_dim_array( sdo_dim_element('Longitude',-180,180,0.05), sdo_dim_element('Latgitude',-90,90,0.05)), 4326); commit; CREATE INDEX activities_geom ON ACTIVITIES (geom) INDEXTYPE IS MDSYS.SPATIAL_INDEX_v2; CREATE INDEX activities_geom_27700 ON ACTIVITIES (geom_27700) INDEXTYPE IS MDSYS.SPATIAL_INDEX_v2;
Compare Geometries
Now I can compare my Swain's Lane geometry to my activity geometries. Let's start by looking for rides in December 2020 that went up Swain's Lane
Column activity_id heading 'Activity|ID' Column activity_name format a30 Column geom_relate heading 'geom|relate' format a6 With a as ( SELECT a.activity_id, a.activity_date, a.activity_name , SDO_GEOM.RELATE(a.geom,'anyinteract',g.geom,25) geom_relate FROM activities a , my_geometries g WHERE a.activity_type = 'Ride' --And a.activity_id IN(4468006769) And a.activity_date >= TO_DATE('01122020','DDMMYYYY') and g.geom_id = 2 /*Swains World Route*/ ) Select * From a Where geom_relate = 'TRUE' Order by activity_date /
Where there is a relation between the two geometries then I have a hit.
Activity geom ID ACTIVITY_DATE ACTIVITY_NAME relate ---------- ------------------- ------------------------------ ------ 4419821750 08:44:45 02.12.2020 Loop TRUE 4428307816 10:49:25 04.12.2020 Loop TRUE 4431920358 09:41:13 05.12.2020 Loop TRUE … 4528825613 09:39:38 28.12.2020 Loop TRUE 4534027888 11:29:45 29.12.2020 Loop TRUE 4538488655 09:57:55 30.12.2020 Loop TRUE 25 rows selected.
Updated 24.11.2021: But I also made a fundamental error here. SDO_GEOM.RELATE is a spatial function and does not use the spatial index that I created.
---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 862 |00:00:29.88 | 118K| 20 |
| 1 | SORT ORDER BY | | 1 | 29 | 862 |00:00:29.88 | 118K| 20 |
| 2 | NESTED LOOPS | | 1 | 29 | 862 |00:00:23.49 | 93227 | 20 |
| 3 | TABLE ACCESS BY INDEX ROWID| MY_GEOMETRIES | 1 | 1 | 1 |00:00:00.01 | 2 | 0 |
|* 4 | INDEX UNIQUE SCAN | MY_GEOMETRIES_PK | 1 | 1 | 1 |00:00:00.01 | 1 | 0 |
|* 5 | TABLE ACCESS FULL | ACTIVITIES | 1 | 29 | 862 |00:00:23.49 | 93225 | 20 |
---------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("G"."GEOM_ID"=2)
5 - filter(("A"."ACTIVITY_TYPE"='Ride' AND "SDO_GEOM"."RELATE"("A"."GEOM",'anyinteract',"G"."GEOM",25)='TRUE'))
Instead, I should use the SDO_ANYINTERACT() spatial operator.
SELECT a.activity_id, a.activity_date, a.activity_name
, a.distance_km
FROM activities a
, my_geometries g
WHERE a.activity_type = 'Ride'
and g.geom_id = 2 /*Swains World Route*/
AND SDO_ANYINTERACT(a.geom, g.geom) = 'TRUE'
Order by activity_date
And now it uses the spatial index on ACTIVITIES.
-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 824 |00:00:07.04 | 34602 | 14 |
| 1 | SORT ORDER BY | | 1 | 1 | 824 |00:00:07.04 | 34602 | 14 |
| 2 | NESTED LOOPS | | 1 | 1 | 824 |00:00:07.04 | 34602 | 14 |
| 3 | TABLE ACCESS BY INDEX ROWID | MY_GEOMETRIES | 1 | 1 | 1 |00:00:00.01 | 2 | 0 |
|* 4 | INDEX UNIQUE SCAN | MY_GEOMETRIES_PK | 1 | 1 | 1 |00:00:00.01 | 1 | 0 |
|* 5 | TABLE ACCESS BY INDEX ROWID | ACTIVITIES | 1 | 1 | 824 |00:00:07.04 | 34600 | 14 |
|* 6 | DOMAIN INDEX (SEL: 0.010000 %)| ACTIVITIES_GEOM | 1 | | 824 |00:00:07.03 | 33224 | 14 |
-------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("G"."GEOM_ID"=2)
5 - filter("A"."ACTIVITY_TYPE"='Ride')
6 - access("MDSYS"."SDO_ANYINTERACT"("A"."GEOM","G"."GEOM")='TRUE')
Analyse Individual Efforts
Now I want to analyse each of my trips up Swain's Lane on a particular day. I am going to work with the GPX rather than the spatial geometry because I am interested also in time, elevation and heart rate data that is not stored in the spatial geometry.
Also, you can't use analytic functions on spatial geometries.
with x as ( SELECT activity_id , TO_DATE(EXTRACTVALUE(VALUE(t), 'trkpt/time'),'YYYY-MM-DD"T"HH24:MI:SS"Z"') time , TO_NUMBER(EXTRACTVALUE(VALUE(t), 'trkpt/@lat')) lat , TO_NUMBER(EXTRACTVALUE(VALUE(t), 'trkpt/@lon')) lng FROM activities a, TABLE(XMLSEQUENCE(extract(a.gpx,'/gpx/trk/trkseg/trkpt','xmlns="http://www.topografix.com/GPX/1/1"'))) t WHERE a.activity_id IN(4468006769) ), y as ( select x.*, strava_pkg.make_point(lng,lat) loc from x ) select lag(loc,1) over (partition by activity_id order by time) last_loc from y / select lag(loc,1) over (partition by activity_id order by time) last_loc * ERROR at line 13: ORA-22901: cannot compare VARRAY or LOB attributes of an object type
Instead, I will have to apply analytic functions to the values extracted from the GPX and then create a spatial point. Thus I will be able to calculate the length of each individual trip by aggregating the distance between each pair of points.
The following query splits out each trip up Swain's Lane in a particular activity and shows the distance, duration, and metrics about elevation, gradient, and heart rate.
alter session set statistics_level=ALL;
alter session set nls_date_Format = 'hh24:mi:ss dd.mm.yyyy';
break on activity_id skip 1
compute sum of sum_dist on activity_id
compute sum of num_pt on activity_id
compute sum of sum_secs on activity_id
Set lines 180 pages 50 timi on
Column activity_id heading 'Activity|ID'
Column activity_name format a15
column time format a20
column lat format 999.99999999
column lng format 999.99999999
column ele format 9999.9
column hr format 999
column sdo_relate format a10
column num_pts heading 'Num|Pts' format 99999
column sum_dist heading 'Dist.|(km)' format 999.999
column sum_secs heading 'Secs' format 9999
column avg_speed heading 'Avg|Speed|(kmph)' format 99.9
column ele_gain heading 'Ele|Gain|(m)' format 9999.9
column ele_loss heading 'Ele|Loss|(m)' format 9999.9
column avg_grade heading 'Avg|Grade|%' format 99.9
column min_ele heading 'Min|Ele|(m)' format 999.9
column max_ele heading 'Max|Ele|(m)' format 999.9
column avg_hr heading 'Avg|HR' format 999
column max_hr heading 'Max|HR' format 999
WITH geo as ( /*route geometry to compare to*/
select /*MATERIALIZE*/ g.*, 25 tol
, sdo_geom.sdo_length(geom, unit=>'unit=m') geom_length
from my_geometries g
where geom_id = 2 /*Swains World Route*/
), a as ( /*extract all points in activity*/
SELECT a.activity_id, g.geom g_geom, g.tol, g.geom_length
, TO_DATE(EXTRACTVALUE(VALUE(t), 'trkpt/time'),'YYYY-MM-DD"T"HH24:MI:SS"Z"') time
, TO_NUMBER(EXTRACTVALUE(VALUE(t), 'trkpt/@lat')) lat
, TO_NUMBER(EXTRACTVALUE(VALUE(t), 'trkpt/@lon')) lng
, TO_NUMBER(EXTRACTVALUE(VALUE(t), 'trkpt/ele')) ele
, TO_NUMBER(EXTRACTVALUE(VALUE(t), 'trkpt/extensions/gpxtpx:TrackPointExtension/gpxtpx:hr'
,'xmlns="http://www.topografix.com/GPX/1/1" xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"')) hr
FROM activities a,
geo g,
TABLE(XMLSEQUENCE(extract(a.gpx,'/gpx/trk/trkseg/trkpt'
,'xmlns="http://www.topografix.com/GPX/1/1" xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"'))) t
Where a.activity_id IN(4468006769)
AND SDO_ANYINTERACT(a.geom, g.geom) = 'TRUE' /*activity has relation to reference geometry*/
), b as ( /*smooth elevation*/
Select a.*
, avg(ele) over (partition by activity_id order by time rows between 2 preceding and 2 following) avg_ele
From a
), c as ( /*last point*/
Select b.*
, row_number() over (partition by activity_id order by time) seq
, lag(time,1) over (partition by activity_id order by time) last_time
, lag(lat,1) over (partition by activity_id order by time) last_lat
, lag(lng,1) over (partition by activity_id order by time) last_lng
--, lag(ele,1) over (partition by activity_id order by time) last_ele
, lag(avg_ele,1) over (partition by activity_id order by time) last_avg_ele
From b
), d as ( /*make points*/
SELECT c.*
, strava_pkg.make_point(lng,lat) loc
, strava_pkg.make_point(last_lng,last_lat) last_loc
FROM c
), e as ( /*determine whether point is inside the polygon*/
select d.*
, 86400*(time-last_time) secs
, avg_ele-last_avg_ele ele_diff
, sdo_geom.sdo_distance(loc,last_loc,0.05,'unit=m') dist
, SDO_GEOM.RELATE(loc,'anyinteract', g_geom, tol) sdo_relate
FROM d
), f as (
select e.*
, CASE WHEN sdo_relate != lag(sdo_relate,1) over (partition by activity_id order by time) THEN 1 END sdo_diff
from e
), g as (
select f.*
, SUM(sdo_diff) over (partition by activity_id order by time range between unbounded preceding and current row) sdo_seq
from f
where sdo_relate = 'TRUE'
)
select activity_id, min(time), max(time)
, sum(dist)/1000 sum_dist
, sum(secs) sum_secs
, 3.6*sum(dist)/sum(secs) avg_speed
, sum(greatest(0,ele_diff)) ele_gain
, sum(least(0,ele_diff)) ele_loss
, 100*sum(ele_diff*dist)/sum(dist*dist) avg_grade
, min(ele) min_ele
, max(ele) max_ele
, sum(hr*secs)/sum(secs) avg_Hr
, max(hr) max_hr
, count(*) num_pts
from g
group by activity_id, sdo_seq, g.geom_length
having sum(dist)>= g.geom_length/2 /*make sure line we find is longer than half route to prevent fragmentation*/
order by 2
/
select * from table(dbms_xplan.display_cursor(null,null,'ADVANCED +IOSTATS -PROJECTION +ADAPTIVE'))
/
- In subquery a, I compare the geometry of the activity with the geometry of Swain's Lane using sdo_geom.relate() to confirm that the activity includes Swain's Lane, but then I extract all the points in the activity GPX.
- GPS is optimised for horizontal accuracy. Even so, the tolerance for determining whether the track is close to the route has to be set to 25m to allow for noise in the data (Swain's Lane is tree-lined, and has walls on both sides, that both attenuate the GPS signal). GPS elevation data is notorious for being noisy even under good conditions; you can see this in the variation of height gained on each ascent. Sub-query b calculates an average elevation across 5 track points (up to +/-2 points).
- I need to compare each point in the track to each previous point so I can do some calculations and determine when the track comes into proximity with the Swain's Lane route, subquery c uses analytic functions to determine the previous point. It is not possible to apply the analytic function to a geometry.
- Subquery e determines whether a track point is in proximity to the route. The tolerance, 25m, is set in subquery geo. Then subquery f flags where the track point is in proximity to the route and the previous one was not. Finally, subquery g maintains a running total of the number of times the track has gone close enough to the route. That becomes a sequence number for each ascent of Swain's Lane by which I can group the subsequent analytics.
Avg Ele Ele Avg Min Max
Activity Dist. Speed Gain Loss Grade Ele Ele Avg Max Num
ID MIN(TIME) MAX(TIME) (km) Secs (kmph) (m) (m) % (m) (m) HR HR Pts
---------- ------------------- ------------------- -------- ----- ------ ------- ------- ----- ------ ------ ---- ---- -----
4468006769 14:55:51 13.12.2020 14:58:17 13.12.2020 .372 147 9.1 36.1 .0 8.6 86.8 122.7 141 153 147
15:08:13 13.12.2020 15:10:28 13.12.2020 .374 136 9.9 36.2 .0 8.4 86.8 122.8 147 155 136
15:22:49 13.12.2020 15:25:18 13.12.2020 .369 150 8.9 36.2 .0 8.2 86.8 122.7 147 155 150
********** -------- -----
sum 1.116 433
On my laptop, this query takes about 10s, of which about 8s is spent on the window sort for the analytic functions, and 2s is spent working out whether the track points are in proximity to the route.
Plan hash value: 3042349692 --------------------------------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2 | | | | 5147 (100)| | 6 |00:00:20.94 | 392 | | 1 | SORT ORDER BY | | 2 | 1 | 104 | | 5147 (1)| 00:00:01 | 6 |00:00:20.94 | 392 | |* 2 | FILTER | | 2 | | | | | | 6 |00:00:20.94 | 392 | | 3 | HASH GROUP BY | | 2 | 1 | 104 | | 5147 (1)| 00:00:01 | 6 |00:00:20.94 | 392 | | 4 | VIEW | | 2 | 8168 | 829K| | 5144 (1)| 00:00:01 | 866 |00:00:20.93 | 392 | | 5 | WINDOW SORT | | 2 | 8168 | 16M| 21M| 5144 (1)| 00:00:01 | 866 |00:00:20.93 | 392 | |* 6 | VIEW | | 2 | 8168 | 16M| | 1569 (1)| 00:00:01 | 866 |00:00:20.93 | 392 | | 7 | WINDOW SORT | | 2 | 8168 | 1403K| 1688K| 1569 (1)| 00:00:01 | 10208 |00:00:09.63 | 392 | | 8 | VIEW | | 2 | 8168 | 1403K| | 1252 (1)| 00:00:01 | 10208 |00:00:06.22 | 392 | | 9 | WINDOW SORT | | 2 | 8168 | 1021K| 1248K| 1252 (1)| 00:00:01 | 10208 |00:00:06.20 | 392 | | 10 | VIEW | | 2 | 8168 | 1021K| | 1016 (1)| 00:00:01 | 10208 |00:00:06.05 | 392 | | 11 | WINDOW SORT | | 2 | 8168 | 4546K| 5040K| 1016 (1)| 00:00:01 | 10208 |00:00:00.76 | 392 | | 12 | NESTED LOOPS | | 2 | 8168 | 4546K| | 31 (0)| 00:00:01 | 10208 |00:00:00.41 | 392 | | 13 | NESTED LOOPS | | 2 | 1 | 560 | | 2 (0)| 00:00:01 | 2 |00:00:00.03 | 104 | | 14 | TABLE ACCESS BY INDEX ROWID| MY_GEOMETRIES | 2 | 1 | 112 | | 1 (0)| 00:00:01 | 2 |00:00:00.01 | 4 | |* 15 | INDEX UNIQUE SCAN | MY_GEOMETRIES_PK | 2 | 1 | | | 0 (0)| | 2 |00:00:00.01 | 2 | |* 16 | TABLE ACCESS BY INDEX ROWID| ACTIVITIES | 2 | 1 | 448 | | 1 (0)| 00:00:01 | 2 |00:00:00.03 | 100 | |* 17 | INDEX UNIQUE SCAN | ACTIVITIES_PK | 2 | 1 | | | 0 (0)| | 2 |00:00:00.01 | 4 | | 18 | XPATH EVALUATION | | 2 | | | | | | 10208 |00:00:00.37 | 288 | --------------------------------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(SUM("DIST")>="G"."GEOM_LENGTH"/2) 6 - filter("SDO_RELATE"='TRUE') 15 - access("GEOM_ID"=2) 16 - filter("MDSYS"."SDO_ANYINTERACT"("A"."GEOM","G"."GEOM")='TRUE') 17 - access("A"."ACTIVITY_ID"=4468006769)
I can apply this approach to all my trips up Swain's Lane. However, I have logged 1115 ascents, and if I attempt to process them in a single SQL query I will have to do some very large window sorts that will spill out of memory (at least they will on my machine). Instead, it is faster to process each activity separately in a PL/SQL loop (see 4x_allswains2.sql).
I now have a table containing all of my ascents of Swain's Lane and I can see if I am getting faster or slower. I simply dumped the data into Excel with SQL Developer.
Unfortunately, I have discovered that I am not going faster!
No comments :
Post a Comment