
add support for schema.org #263

Merged
tomkralidis merged 11 commits into master from issue-231
Oct 18, 2025


@tomkralidis
Member

Fixes #231. Also adds early out for autodetection (first schema found).

@tomkralidis tomkralidis requested a review from pvgenuchten March 25, 2025 14:34
Contributor

@pvgenuchten pvgenuchten left a comment

Nice work Tom,

Write support

I get the impression you didn't check your implementation on https://validator.schema.org/, because it still has quite a few validation issues; see below.

Currently the dataset type is not detected by https://validator.schema.org/.
When using the validator, make sure to embed the JSON in

<script type="application/ld+json">{}</script>
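For pasting into the validator, the wrapping could be scripted along these lines. This is only a minimal sketch: `wrap_jsonld` is a hypothetical helper name and the sample record is made up; neither is part of pygeometa.

```python
import json


def wrap_jsonld(record: dict) -> str:
    """Wrap a JSON-LD record in the script element the validator expects."""
    body = json.dumps(record, indent=2)
    # Escape "</" so a string value cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical record, for illustration only
record = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example dataset",
}
snippet = wrap_jsonld(record)
print(snippet)
```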

I wonder if we should use some of this work inside pycsw/pygeoapi...

Noticed this on distribution (screenshot):
should be @type: 'schema:DataDownload'
format or encoding can be used for the MIME type
it seems the validator treats 'type' as '@type'
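A distribution entry along the lines suggested here might look like the sketch below. `make_distribution` is a hypothetical helper and the URL is a placeholder. `contentUrl` and `encodingFormat` are schema.org properties; the `@type` is written without a prefix on the assumption that the document's `@context` is https://schema.org/.

```python
import json


def make_distribution(url: str, mimetype: str) -> dict:
    """Build one schema.org distribution entry for a dataset."""
    return {
        "@type": "DataDownload",     # JSON-LD keyword "@type", not "type"
        "contentUrl": url,
        "encodingFormat": mimetype,  # carries the MIME type
    }


# Hypothetical dataset record, for illustration only
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example dataset",
    "distribution": [
        make_distribution("https://example.org/data.csv", "text/csv"),
    ],
}
print(json.dumps(dataset, indent=2))
```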

Read support

I noticed you also added read support. I tried it with

pygeometa metadata import schema-org.json --schema schema-org -v DEBUG

and got

WARNING:pygeometa.core:Import failed: list indices must be integers or slices, not str
null
...

When debugging:

from pygeometa.schemas.schema_org import SchemaOrgOutputSchema
sos = SchemaOrgOutputSchema()
f = open("./schema-org.json", "r")
f2 = sos.import_(f.read())

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/geopython/lib/python3.10/site-packages/pygeometa-0.17.dev1-py3.10.egg/pygeometa/schemas/schema_org/__init__.py", line 116, in import_
    geo = md['spatialCoverage']['geo']
TypeError: list indices must be integers or slices, not str

It would be nice if this error were reported by the command-line client.
schema-org.zip

The interesting part here is that RDF typically allows either a single item or a list of items as the content of an element. That brings us to the next topic: this implementation seems to expect a JSON-LD serialisation of RDF, which is indeed the most common form of schema.org. However, quite a few implementations of schema.org use RDFa or microdata. In theory one can also serialise schema.org as Turtle or RDF/XML. To support those cases, rdflib can be used to read the RDF and serialise it to JSON-LD before parsing.
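Both points can be sketched roughly as follows. The helper names `as_list`, `first_geo` and `to_jsonld` are hypothetical and not taken from the PR, and the sample records are made up. `to_jsonld` assumes rdflib 6 or later is installed; note that rdflib's JSON-LD output uses full IRIs as keys unless a context is supplied, so it would still need compacting before key lookups such as `md['spatialCoverage']`.

```python
import json


def as_list(value):
    """Return value as a list: [] for None, [value] for a single item."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def first_geo(md: dict):
    """Return the first 'geo' object under spatialCoverage, or None."""
    for place in as_list(md.get("spatialCoverage")):
        if isinstance(place, dict):  # skip plain-text place names
            for geo in as_list(place.get("geo")):
                return geo
    return None


def to_jsonld(data: str, fmt: str):
    """Parse RDF given as e.g. 'turtle' or 'xml' and return parsed JSON-LD.

    Assumption: rdflib >= 6 is available (imported lazily, optional).
    """
    from rdflib import Graph

    graph = Graph()
    graph.parse(data=data, format=fmt)
    return json.loads(graph.serialize(format="json-ld"))


# Hypothetical records: the same coverage as a single item and as a list
geo = {"@type": "GeoShape", "box": "-90 -180 90 180"}
single = {"spatialCoverage": {"@type": "Place", "geo": geo}}
listed = {"spatialCoverage": [{"@type": "Place", "geo": geo}]}
print(first_geo(single) == first_geo(listed))
```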

After fixing the spatialCoverage handling, the next error:

  File "/geopython/lib/python3.10/site-packages/pygeometa-0.17.dev1-py3.10.egg/pygeometa/schemas/schema_org/__init__.py", line 123, in import_
    mcf['spatial']['datatype'] = 'vector'
KeyError: 'spatial'

It seems the datatype is set before mcf['spatial'] is initialized.
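One way to avoid this kind of KeyError is to create the nested dictionary on first use. The following is a minimal sketch, not necessarily how the PR resolved it; `set_spatial_datatype` is a hypothetical helper and the starting `mcf` dictionary is made up.

```python
def set_spatial_datatype(mcf: dict, datatype: str) -> dict:
    """Set mcf['spatial']['datatype'], creating 'spatial' if it is absent."""
    # setdefault returns the existing dict, or inserts and returns a new one
    mcf.setdefault("spatial", {})["datatype"] = datatype
    return mcf


# Hypothetical partially built MCF with no 'spatial' section yet
mcf = {"metadata": {}, "identification": {}}
set_spatial_datatype(mcf, "vector")
print(mcf["spatial"])
```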

Comment thread pygeometa/schemas/schema_org/__init__.py Outdated
Comment thread pygeometa/schemas/schema_org/__init__.py Outdated
Comment thread pygeometa/schemas/schema_org/__init__.py Outdated
Comment thread pygeometa/schemas/schema_org/__init__.py
Comment thread pygeometa/schemas/schema_org/__init__.py Outdated
Comment thread pygeometa/schemas/schema_org/__init__.py Outdated
Comment thread pygeometa/schemas/schema_org/__init__.py Outdated
@jmckenna
Member

Seconded, good work, but testing through https://validator.schema.org/ is critical. I wish there were an API for validation, instead of having to test manually through the validator.

@pvgenuchten
Contributor

The validator is not available as a service, but a SHACL-oriented test is available at https://github.com/google/schemarama/blob/main/core/test/shacl-test.js

tomkralidis and others added 8 commits October 10, 2025 23:55
* fix export to schem-org

* Update __init__.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: Tom Kralidis <tomkralidis@gmail.com>
@tomkralidis
Member Author

@jmckenna @pvgenuchten I dusted off this PR and pushed some updates:

import: successful import from schema-org/JSON-LD (sample) to MCF, as well as examples (thanks @pvgenuchten) in:

export: successful export of sample.yml and validated against https://validator.schema.org/ (0 errors, 0 warnings).

Comment thread pygeometa/core.py
@tomkralidis tomkralidis merged commit b14c9a5 into master Oct 18, 2025
2 checks passed
@tomkralidis tomkralidis deleted the issue-231 branch October 18, 2025 04:08


Development

Successfully merging this pull request may close these issues.

add schema.org schema

3 participants