XAS specialization using subclasses by mretegan · Pull Request #7 · XraySpectroscopy/nexus_definitions

mretegan · 2026-02-17T12:36:19Z

Another possibility for having fields that are acquisition mode dependent is to use subclasses as suggested here: nexusformat#1352 (comment)

This is a reasonable alternative, as it is unlikely that we will be able to define acquisition modes that will be reused by other techniques (the alternative option proposed above and implemented here: #6).

The two options were discussed in the NIAC https://www.nexusformat.org/Telco_20260211.html

maurov · 2026-02-19T09:24:43Z

@mretegan @woutdenolf @newville

My understanding of the NIAC minutes of Telco 20260211 is that both options are OK. As I have said many times, my position is to go for NeXus base classes representing the experimental data collection modes, as I wrote in the famous shared Google document a long time ago. This solution has the strong advantage that the base classes can then be reused by other application definitions, like XMCD and other techniques. I do not understand why we are still hesitating on this. Please, let's move on.

As a first step, I propose implementing the basic modes that represent most of the XAS and other techniques' data:

NXtrans: transmission, valid for any technique and any wavelength measuring sample absorption;
NXtfy: total fluorescence yield;
NXpfy: partial fluorescence yield;
NXherfd: particular case of partial fluorescence yield with a high-resolution spectrometer.

As an alternative, for the fluorescence yield (in view of the electron yield or the optical yield), we may adopt the subclass approach:

NXfy base class:
- NXfy_total
- NXfy_partial
- NXfy_herfd

But I think this is just a complication and I would just go for the first approach of base classes for each experimental collection mode.

newville · 2026-02-19T15:35:53Z

@maurov @mretegan Thanks (and sorry for the delay). I'm OK with either approach. I see the modes as mostly informative, as they suggest (but do not necessarily require) changes in the processing and analysis. But those steps are not really fixed anyway, so mode is mostly a "type hint".

mretegan · 2026-02-19T15:40:30Z

Thank you both for your input. I think that to have a complete picture, it will make sense to finish both in parallel and to create a few HDF5 examples. I will start working on this tomorrow.

maurov · 2026-02-19T16:04:45Z

> @maurov @mretegan Thanks (and sorry for the delay). I'm OK with either approach. I see the modes as mostly informative, as they suggest (but do not necessarily require) changes in the processing and analysis. But those steps are not really fixed anyway, so mode is mostly a "type hint".

Hi @newville, thanks for your feedback. To me, the modes are more than just informative. They tell exactly what "mu" is and how it was measured. For example, Fe K-edge XAS "mu" of Fe2O3 measured in transmission is a different thing than Fe K-edge XAS "mu" of Fe2O3 measured in HERFD. Furthermore, the "minimum required metadata" for transmission are not the same as for HERFD.

maurov · 2026-02-19T16:10:03Z

> Thank you both for your input. I think that to have a complete picture, it will make sense to finish both in parallel and to create a few HDF5 examples. I will start working on this tomorrow.

@mretegan for me it is very difficult to read the .nxdl.xml files directly. Would it be possible to have an automatic build of the documentation on the ESRF gitlab server? Alternatively, could you link here two HDF5 files generated following the two approaches? By the way, I do not think that our decision should be based on HDF5 readability, as it is most likely our software tools, not humans, that will read the HDF5 file.

newville · 2026-02-19T16:45:01Z

@maurov I would be cautious about being overly strict here.

Yes, data measured in different modes are different, and processing/analysis may want to do different steps based on the mode. And the mode should be stated.

And, yes, HERFD really ought to state the analyzed energy (but that is also a trusted value), but is NeXus going to say that a file is invalid if it states mode="HERFD" but does not correctly spell "analyzed energy" in every group?
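
The trade-off between strict and lenient handling of mode-dependent metadata can be sketched in a few lines of Python. This is only an illustration: the mode names, the field names (`i0`, `itrans`, `analyzed_energy`) and the `REQUIRED_BY_MODE` table are hypothetical and are not taken from the NXxas definitions or from any NeXus validator.

```python
# Hypothetical per-mode metadata requirements; the field names are
# illustrative only and are not taken from the NXxas definitions.
REQUIRED_BY_MODE = {
    "transmission": {"i0", "itrans"},
    "herfd": {"i0", "analyzed_energy"},
}


def check_mode_metadata(mode, fields, strict=False):
    """Return the sorted list of fields missing for the stated mode.

    With strict=True a missing field raises ValueError; otherwise the
    caller gets the list back and can decide to warn and carry on.
    Unknown modes have no requirements, so they never fail.
    """
    required = REQUIRED_BY_MODE.get(mode.lower(), set())
    missing = sorted(required - set(fields))
    if strict and missing:
        raise ValueError(f"mode {mode!r} is missing: {', '.join(missing)}")
    return missing


# Lenient use: the file is still readable, the gap is merely reported.
print(check_mode_metadata("HERFD", ["i0", "energy", "intensity"]))
```

In the lenient form a reader can still use the spectrum while reporting the gap, whereas `strict=True` corresponds to declaring the file invalid.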

mretegan · 2026-02-20T08:03:12Z

> > Thank you both for your input. I think that to have a complete picture, it will make sense to finish both in parallel and to create a few HDF5 examples. I will start working on this tomorrow.
>
> @mretegan for me it is very difficult to read the .nxdl.xml files directly. Would it be possible to have an automatic build of the documentation on the ESRF gitlab server? Or link here two HDF5 files generated following the two approaches. By the way, I do not think that our decision should be based on HDF5 readability, as most likely are our software tools that will read the HDF5 file, not humans.

We have something set up, but it builds the main branch, and every time we want to switch, we need to update the CI file https://gitlab.esrf.fr/hdf5/nexus/nxxas.

You can also build locally. Go to the nexus_definitions folder and run (I use uv, but vanilla pip should work):

uv venv --python 3.12
source .venv/bin/activate
uv pip install -r requirements.txt 
make clean; make local
firefox build/manual/build/html/classes/applications/NXxas_new.html

newville · 2026-03-15T16:14:19Z

@mretegan @maurov Thanks. I don't really disagree with any of the changes here.

But getting bogged down on how to spell different types of emissions seems like both a killer of motivation and a detail that does not dramatically change the downstream use of the normalized mu data. Yes, it is helpful metadata, and can sometimes be important for comparing data. But detection modes can vary and evolve, and you may not know every possible data collection mode, and metadata can be messy, incomplete, or wrong. Still, an nxXAS group with normalized mu(E) is useful.

Please get this merged, and let us start using this. The danger that data collection and analysis applications will define their own HDF5 format and see no need to support this one is very real.

newville · 2026-03-18T01:46:20Z

@mretegan Why is Iref removed? For many people, the unbreakable coupling of an XAS measurement with a reference channel is vital. This really needs to be supported.

maurov · 2026-03-18T13:20:21Z

> @mretegan Why is Iref removed? For many people, the unbreakable coupling of an XAS measurement with a reference channel is vital. This really needs to be supported.

@newville the idea is to replace Iref with a ref subgroup that will itself be of NXxas type. In fact, it may happen that the "energy reference spectrum" is not measured simply as the "measured beam intensity after a reference foil (= Iref)", but is measured separately and in another mode or under other conditions. A typical case (even if it is less common) would be measuring a reference foil with an absorption edge close to the element/edge of interest in fluorescence mode and/or in a separate scan (e.g. for laboratory instruments).

newville · 2026-03-18T23:44:42Z

@mretegan Yes, of course, a separate spectrum can act as a reference signal.

Many people and many beamlines also often collect a reference spectrum in the same scan. This is more than an additional reference spectrum - it is unambiguously the same energy scan measuring multiple spectra simultaneously, and cannot be separated from the original spectrum.

And, yes, many beamlines at modern facilities have sufficiently stable energies and do not require this. But there are older spectra and older beamlines that do really need this.

newville · 2026-03-30T18:57:56Z

@mretegan Are NXatom and NXelement really the same thing? I think of Atom as one actual object, whereas Element is a category of Atoms. We do spectroscopies on elements, not really isolated individual atoms.

Maybe my concern is simply that NXatom (https://manual.nexusformat.org/classes/base_classes/NXatom.html) is hopelessly vague: "a set of atoms". That seems circular, possibly to the point of "what?". It appears to allow "ion" and maybe even a molecule. It has a thing called "position". So, it (they?) are somewhere. But it's a set.... hmmm.

OTOH, with X-ray spectroscopic methods, we really, really mean Element to be "all elemental atoms that have the same number of protons". We need to say "titanium" and mean exactly "22 protons", no more, no less.

It seems reasonable to have a class of Elements of the Periodic Table. That could include variables such as "ionization state", "isotope", and so forth. But there is a finite and enumerable list: maybe ~118 elements, each with maybe 4 ionization states, and maybe 10 isotopes.
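
The notion of an element as a finite, enumerable category identified only by its proton count could be sketched as below. This is a minimal illustration and not the NXelement definition: the `Element` class, the `element` helper and the three-entry table are hypothetical, and a real table would list every element (and could carry optional isotope or ionization-state fields).

```python
from dataclasses import dataclass

# A deliberately tiny excerpt of the periodic table; a real table would
# enumerate every element.
_ATOMIC_NUMBERS = {"Ti": 22, "Fe": 26, "Cu": 29}


@dataclass(frozen=True)
class Element:
    """A category of atoms identified only by its proton count."""

    symbol: str
    atomic_number: int


def element(symbol):
    """Look up an element by symbol; raise KeyError for unknown symbols."""
    return Element(symbol, _ATOMIC_NUMBERS[symbol])


print(element("Ti"))
```

Nothing in this structure carries a position, spin, or orbital configuration, which is the distinction from an individual atom being drawn here.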

It doesn't seem like NXatom is exactly that... but maybe I'm missing something ;).

maurov · 2026-06-12T08:08:25Z

Personally, I would push for keeping NXelement instead of NXatom. In fact, in XAS an absorbing element may represent multiple crystallographic sites or atoms in a cluster. From an XAS point of view, an "absorbing atom" is linked to specific atomic positions, while an element is more generic and does not require atomic positions.

FYI some references to NIAC discussions on element/atom.
NXatom was already there and the NIAC proposed to use it instead of our new NXelement
nexusformat#1619 (comment) https://github.com/nexusformat/wiki/blob/master/source/content/Telco_20260429.md
Suggestion to use atom instead of element as field name
nexusformat#1619 (comment) https://github.com/nexusformat/wiki/blob/master/source/content/Telco_20260610.md

As I mentioned, while I am more in favor of keeping element, we should also remember that this definition could be used to store theoretical data, and in that context, atom is more widely used.

newville · 2026-06-15T17:21:56Z

@mretegan

> > What does it mean to make Energy optional? How can one use such a spectrum?
>
> @newville Please tell me where you saw that? It might be an error.

Thanks, and sorry for the confusion. I saw b6fd81f

I wasn't fully sure what that was referring to....

> > Do I understand correctly that Transmission have a reference channel, but other subclasses do not? Why is that?
>
> For transmission, iref is measured at the same time, while it is not necessarily the case for fluorescence, for example. I remember @maurov making the point that in some cases the reference can be another edge altogether. This is why it is not consistently put in all subclasses.

For transmission, a reference spectrum is sometimes measured at the same time. A reference spectrum can also be measured at the same time as other modes, say by scattering some beam before the sample or measuring literally in parallel. Both are done, maybe infrequently, but not never. It seems easy enough to allow "iref" as optional for all subclasses.

> I agree that here the use of NXatom is ideal, but if we keep the field name element (which I am very much in favor of), it should not be that troubling. But that being said, I am fine either way.

I think NXatom is going to be really confusing. An atom has specific quantities (say, "position", "oxidation state", "spin", "orbital configuration", "isotope") that the category "element" does not. We definitely mean that none of those quantities are specified.

> NXatom was already there and the NIAC proposed to use it instead of our new NXelement

Using "atom" in place of "element" seems really odd to me. The entire point of NeXus is to give scientifically meaningful names to hierarchical groups. Communication is the goal. Just as the current "absorbed intensity" is hopelessly odd and confusing, using "atom" in place of "element" is poor communication.

> As I mentioned, while I am more in favor of keeping element, we should also remember that this definition could be used to store theoretical data, and in that context, atom is more widely used.

Sort of. An XAS spectrum, measured or calculated, represents the energy response of a large collection of atoms of the same Z (an "element"). Many (billions at least) photons are absorbed. Each is absorbed by exactly one atom, but there is no way to distinguish which atom in the illuminated volume does the absorbing. Each absorption is an isolated event (we're ignoring strong-field effects here), lasting femtoseconds.

Anyway, just because a single in-silico atom could be used as a model does not imply that this is how experimental data should be communicated.

mretegan · 2026-07-01T12:05:20Z

I would like to merge the current branch into our main as soon as possible. This branch existed to explore the different ways to specify the acquisition modes, so even if we still need to change the individual classes, it covers the initial purpose.

There are a few changes that you should be aware of, and I would suggest that you look in detail at the base class NXxas and at NXxas_trans. The others will be updated afterwards.

- The element uses the proposed NXelement class.
- Even though there was an iref present in NXxas_trans, there is now also a subentry called reference that allows specifying a reference as an entire spectrum. The documentation should be self-explanatory.
- The instrument and the monochromator are now recommended fields.
- It is now possible to specify a stack of XAS spectra without losing any convenience/simplicity when specifying a single one. Following the discussion last week, it was clear that the previous definition was not covering many cases where some parameter was varied in the experiment, and it made sense to have all those spectra together. With the updated class, it is possible to have time, position on the sample (think mapping experiments), temperature, magnetic field (XMCD/XMLD), pressure, etc. as a second dimension of the dataset, again, with full backwards compatibility when this is not needed.

The rendered page is here: https://nexus-definitions.readthedocs.io/en/xas-using-inheritance

If this looks okay to you, and considering that the transmission definition is the simplest, I would suggest opening a new MR to the NeXus definitions main repo containing only NXxas and NXxas_trans and having the two approved quickly by the NIAC.

mretegan · 2026-07-02T09:11:07Z

Here are some examples of how this would look for different cases:

dataRank = 1, no nP

entry:NXentry
  definition = "NXxas_trans"
  element:NXelement
    name = "Fe"
  edge:NXabsorption_edge
    name = "K"
  is_experimental = true
  energy:NX_FLOAT[nEnergy]
  intensity:NX_FLOAT[nEnergy]  
  sample:NXsample
    name = "Fe foil"
  instrument:NXinstrument
    i0:NXdetector
      data:NX_NUMBER[nEnergy]
    itrans:NXdetector
      data:NX_NUMBER[nEnergy]
    iref:NXdetector
      data:NX_NUMBER[nEnergy]
  data:NXdata
    @signal = "intensity"
    @axes = "energy"
    energy --> /entry/energy
    intensity --> /entry/intensity
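
For the rank-1 layout above, the raw channels allow a spectrum to be recomputed. The sketch below assumes the usual transmission relation, mu(E) proportional to ln(i0/itrans), and assumes the reference foil sits between the itrans and iref detectors, so that the foil spectrum is ln(itrans/iref). Whether the `intensity` field stores exactly this quantity is an assumption, and the function names are hypothetical.

```python
import math


def log_ratio(numerator, denominator):
    """Return ln(numerator / denominator) point by point."""
    if len(numerator) != len(denominator):
        raise ValueError("both channels must have the same length")
    return [math.log(a / b) for a, b in zip(numerator, denominator)]


def mu_sample(i0, itrans):
    """Sample absorption (thickness folded in): ln(i0 / itrans)."""
    return log_ratio(i0, itrans)


def mu_reference(itrans, iref):
    """Reference-foil absorption for a foil placed after the sample."""
    return log_ratio(itrans, iref)


# Made-up counts on a three-point energy grid.
i0 = [1000.0, 1000.0, 1000.0]
itrans = [500.0, 250.0, 125.0]
iref = [250.0, 125.0, 62.5]
print(mu_sample(i0, itrans))
print(mu_reference(itrans, iref))
```

Because all three channels share the same energy axis, the sample and reference spectra computed this way are aligned point by point.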

dataRank = 2, an operando time series experiment, for example

entry:NXentry
  definition = "NXxas_trans"
  element:NXelement
    name = "Fe"
  edge:NXabsorption_edge
    name = "K"
  is_experimental = true
  energy:NX_FLOAT[nEnergy]
  intensity:NX_FLOAT[nP, nEnergy]      # dataRank = 2
  sample:NXsample
    temperature:NX_FLOAT[nP]           # varies across the stack
  instrument:NXinstrument
    monochromator:NXmonochromator
      energy:NX_FLOAT[nEnergy]
    i0:NXdetector
      data:NX_NUMBER[nP, nEnergy]
    itrans:NXdetector
      data:NX_NUMBER[nP, nEnergy]
    iref:NXdetector
      data:NX_NUMBER[nP, nEnergy]
  data:NXdata
    @signal = "intensity"
    @axes = ["temperature", "energy"]
    temperature --> /entry/sample/temperature
      @AXISNAME_indices = 0
    energy --> /entry/energy
    intensity --> /entry/intensity
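
The shape convention of the stacked layout, and the way a single spectrum remains a valid special case, can be checked with a small helper. This is a sketch using plain Python lists in place of HDF5 datasets; `as_stack` and `check_stack` are hypothetical names and are not part of any NeXus tool.

```python
def as_stack(intensity):
    """Return intensity as a list of spectra, wrapping a single spectrum."""
    if intensity and not isinstance(intensity[0], (list, tuple)):
        return [list(intensity)]
    return [list(row) for row in intensity]


def check_stack(energy, intensity, scanned=None):
    """Validate the shapes and return (nP, nEnergy).

    energy: nEnergy values; intensity: [nEnergy] or [nP][nEnergy];
    scanned: optional nP values of the varied parameter (e.g. temperature).
    """
    stack = as_stack(intensity)
    for row in stack:
        if len(row) != len(energy):
            raise ValueError("each spectrum must have nEnergy points")
    if scanned is not None and len(scanned) != len(stack):
        raise ValueError("the scanned parameter must have nP values")
    return len(stack), len(energy)


energy = [7100.0, 7110.0, 7120.0]
# dataRank = 1: a single spectrum is treated as a stack with nP = 1.
print(check_stack(energy, [0.1, 0.5, 0.9]))
# dataRank = 2: two spectra recorded at two temperatures.
print(check_stack(energy, [[0.1, 0.5, 0.9], [0.2, 0.6, 1.0]], [300.0, 350.0]))
```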

Using reference instead of iref

entry:NXentry
  definition = "NXxas_trans"
  element:NXelement
    name = "Fe"
  edge:NXabsorption_edge
    name = "K"
  energy:NX_FLOAT[nEnergy]
  intensity:NX_FLOAT[nEnergy]
  instrument:NXinstrument
    i0:NXdetector
      data:NX_NUMBER[nEnergy]
    itrans:NXdetector
      data:NX_NUMBER[nEnergy]
    # no iref: reference is not a simultaneous channel on this energy axis
  reference:NXsubentry
    definition = "NXxas_trans"
    element:NXelement
      name = "Cu"                      # different element than the main entry
    edge:NXabsorption_edge
      name = "K"
    energy:NX_FLOAT[nEnergy_ref]
    intensity:NX_FLOAT[nEnergy_ref]
    instrument:NXinstrument
      i0:NXdetector
        data:NX_NUMBER[nEnergy_ref]
      itrans:NXdetector
        data:NX_NUMBER[nEnergy_ref]
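
A reader that has to cope with both ways of providing a reference could proceed as in the sketch below. It uses nested dicts as a stand-in for the HDF5 group tree, and the rule of preferring a simultaneous iref channel over a separate reference subentry is an assumption made for illustration, not something stated in the definition.

```python
def find_reference(entry):
    """Return ("iref", data), ("reference", subentry), or None.

    `entry` is a plain nested dict standing in for an HDF5 group tree.
    A simultaneous iref channel is preferred when both are present
    (an assumption of this sketch).
    """
    iref = entry.get("instrument", {}).get("iref")
    if iref is not None:
        return "iref", iref["data"]
    subentry = entry.get("reference")
    if subentry is not None:
        return "reference", subentry
    return None


# Entry shaped like the example above: no iref, separate Cu K reference.
entry = {
    "element": {"name": "Fe"},
    "instrument": {"i0": {"data": [1.0, 1.0]}, "itrans": {"data": [0.5, 0.4]}},
    "reference": {"element": {"name": "Cu"}, "energy": [8970.0, 8980.0]},
}
kind, ref = find_reference(entry)
print(kind, ref["element"]["name"])
```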

maurov

@mretegan I agree to merge

newville

Yes, thanks for all of this work and perseverance! I think this will be very useful for lots of kinds of XAS++ methods, and a good starting point for other spectroscopies.

emilianofonda · 2026-07-03T16:12:34Z

Thanks for introducing rank 2, this will be of great help for standardizing qexafs community exchange of data.
I agree.

Update NXxas and add child classes

24c6e81

mretegan force-pushed the xas-using-inheritance branch from a619317 to 24c6e81 March 15, 2026 22:27

mretegan added 4 commits March 17, 2026 13:10

Build docs

f5739b6

Add .readthedocs.yaml

849c728

Remove optional iref from NXxas_trans

d27514e

Add a NXcollection for generic raw data

35f2bd4

mretegan added 3 commits March 29, 2026 21:29

Move NXxas and related subclasses to contributed_definitions

8e3f58f

Substitute NXelement with the existing NXatom

ef01dd9

Rename element to absorber

e8733ad

mretegan added 9 commits March 31, 2026 09:16

Re-add reference detector for transmission

50c10ee

Remove optional NXcollection group from NXxas definition

1105792

Update Read the Docs configuration to use Ubuntu 24.04 and Python 3.14

ea40940

Restrict sphinx version to smaller than 8.2.0

f03e551

Add NXatom group to absorption edge and emission line

389d401

Replace the calculated boolean with spectrum_type

16b9f4a

Update NXxas_trans definition to include 'iref' in intensity field description

9f0b791

Add NXemission_lines

1cd1311

Remove unused symbols nTransitions and nChan from NXxas definition

2e1dbdc

mretegan added 14 commits June 23, 2026 10:08

Add NXnote to NXxas_herfd

14a24c4

Update 'depends_on' references

f9d950e

Add transformations field to sample

2974389

Requier the transformations field in the beamline_coordinate_system

790c337

Fix depends_on link

e9f848c

Update the documentation and make partial the crystal analyzers

cb294e8

Add NXcollection group for raw data

475e59f

Make the monochromator group recommended

f6119a6

Ensure full reprocessing is possible in NXxas_trans

0a5b615

Bring changes from add-nxemission-line-class branch

7628bcb

Add support for stacked spectra in NXxas and NXxas_trans definitions

9331c9f

Add reference subentry for independent spectra in NXxas_trans definition

37034cc

Add make clean command to Read the Docs

fbae449

Update the emission line group in NXxas_pfy

3ad89cf

mretegan added 2 commits July 1, 2026 14:06

Use NXelement

a9b3983

Remove optional time field from NXxas definition

5ec7580

mretegan changed the title from "[WIP] XAS specialization using subclasses" to "XAS specialization using subclasses" Jul 1, 2026

mretegan requested review from a team, maurov, newville and woutdenolf and removed request for a team July 2, 2026 13:31

maurov reviewed Jul 2, 2026

newville approved these changes Jul 2, 2026

mretegan merged commit 8d2f133 into main Jul 3, 2026
3 checks passed

7 participants