add support for schema.org#263
Nice work Tom,
Write support
I get the impression you didn't check your implementation against https://validator.schema.org/, because it still has quite a few validation issues; see below.
Currently the Dataset type is not detected by https://validator.schema.org/
when using the validator, make sure to embed json in
I wonder if we should use some of this work inside pycsw/pygeoapi...
noticed this on distribution

should be '@type': 'schema:DataDownload'
format or encoding can be used for the mimetype
seems the validator assumes 'type': as '@type'
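To make the distribution comments above concrete, here is a minimal sketch of a JSON-LD Dataset with one DataDownload distribution, wrapped for pasting into the validator. The dataset name, the example.org URL, and the `build_dataset` / `wrap_for_validator` helpers are hypothetical. `contentUrl` and `encodingFormat` are schema.org properties, but this is not pygeometa's actual output, and the `<script type="application/ld+json">` wrapper is an assumption about what the validator's code-snippet input expects.

```python
import json


def build_dataset(name, download_url, media_type):
    """Return a minimal schema.org Dataset as a JSON-LD dict (illustrative only)."""
    return {
        # With this @context, type names need no 'schema:' prefix.
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "distribution": [
            {
                "@type": "DataDownload",
                "contentUrl": download_url,
                # The media type goes in encodingFormat.
                "encodingFormat": media_type,
            }
        ],
    }


def wrap_for_validator(doc):
    """Embed the JSON-LD document in a script tag (assumed validator input)."""
    body = json.dumps(doc, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


dataset = build_dataset("Example dataset", "https://example.org/data.csv", "text/csv")
snippet = wrap_for_validator(dataset)
print(snippet)
```

Note the explicit '@type' keys: plain 'type' only works when the context aliases it.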
Read support
I notice you also added read support. I tried it with
pygeometa metadata import schema-org.json --schema schema-org -v DEBUG
and got a
WARNING:pygeometa.core:Import failed: list indices must be integers or slices, not str
null
...
when debugging
from pygeometa.schemas.schema_org import SchemaOrgOutputSchema
sos = SchemaOrgOutputSchema()
f = open("./schema-org.json", "r")
f2 = sos.import_(f.read())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/geopython/lib/python3.10/site-packages/pygeometa-0.17.dev1-py3.10.egg/pygeometa/schemas/schema_org/__init__.py", line 116, in import_
geo = md['spatialCoverage']['geo']
TypeError: list indices must be integers or slices, not str
It would be nice if this error were reported by the command-line client.
schema-org.zip
The interesting part here is that RDF typically allows either a single item or a list of items as the content of an element. That brings us to the next topic: this implementation seems to expect a JSON-LD serialisation of RDF, which is indeed the most common form of schema.org. However, quite a few implementations of schema.org use RDFa or microdata. In theory one can also serialise schema.org as Turtle or RDF/XML. To support those cases, rdflib can be used to read the RDF and serialise it to JSON-LD before parsing.
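The single-item-or-list behaviour is what triggers the TypeError above. A minimal sketch of one way to tolerate both shapes follows. The `as_list` and `first_geo` helpers and the sample documents are hypothetical, not part of pygeometa, and the sketch only covers the `spatialCoverage` / `geo` path from the traceback.

```python
import json


def as_list(value):
    """Return value as a list: [] for None, unchanged for a list, else [value]."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def first_geo(md):
    """Return the first 'geo' object under 'spatialCoverage', or None if absent."""
    for coverage in as_list(md.get("spatialCoverage")):
        if isinstance(coverage, dict) and "geo" in coverage:
            geos = as_list(coverage["geo"])
            if geos:
                return geos[0]
    return None


# The same content, once as a single object and once as a one-item list.
single = json.loads(
    '{"spatialCoverage": {"geo": {"@type": "GeoShape", "box": "10 20 30 40"}}}'
)
listed = json.loads(
    '{"spatialCoverage": [{"geo": {"@type": "GeoShape", "box": "10 20 30 40"}}]}'
)

print(first_geo(single) == first_geo(listed))
```

Normalising at the point of access keeps the rest of the import logic unchanged, at the cost of silently ignoring any additional items in a list.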
After fixing the spatialCoverage handling, the next error:
File "/geopython/lib/python3.10/site-packages/pygeometa-0.17.dev1-py3.10.egg/pygeometa/schemas/schema_org/__init__.py", line 123, in import_
mcf['spatial']['datatype'] = 'vector'
KeyError: 'spatial'
It seems the datatype is set before spatial is initialized.
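One way to avoid this kind of KeyError is to create the section on first use. The sketch below assumes the MCF is a plain nested dict; `set_spatial_datatype` is a hypothetical helper and not necessarily how the PR resolves the issue.

```python
def set_spatial_datatype(mcf, datatype="vector"):
    """Set mcf['spatial']['datatype'], creating the 'spatial' section if missing."""
    # setdefault returns the existing section, or inserts and returns a new dict.
    spatial = mcf.setdefault("spatial", {})
    spatial["datatype"] = datatype
    return mcf


empty = set_spatial_datatype({})
existing = set_spatial_datatype({"spatial": {"geomtype": "polygon"}})
print(empty)
print(existing)
```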
Seconded, good work, but testing through https://validator.schema.org/ is critical (I wish there were an API available for validation, instead of manually testing through the validator).
The validator is not available as a service, but a SHACL-oriented test is available at https://github.com/google/schemarama/blob/main/core/test/shacl-test.js
* fix export to schem-org
* Update __init__.py
* Update __init__.py
* Update __init__.py
---------
Co-authored-by: Tom Kralidis <tomkralidis@gmail.com>
575a1f3 to
55be744
@jmckenna @pvgenuchten I dusted off this PR and pushed some updates: import: successful import from schema-org/JSON-LD (sample) to MCF, as well as examples (thanks @pvgenuchten) in:
export: successful export of
Fixes #231. Also adds early out for autodetection (first schema found).