Expression incorrectly parsed in RDataFrame's Define() #22295

@bgrube

Check duplicate issues.

  • Checked for duplicates

Description

When the expression passed to Define() contains a call to a member function whose name matches an existing column name, the expression parser treats the member-function name as a reference to that column. The rewritten expression handed to the just-in-time compiler is therefore invalid (in the output below, `.phi()` becomes `.var0()`), and compilation fails.
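To make the suspected mechanism concrete, here is a minimal pure-Python sketch of a whole-word column substitution. It is not ROOT's actual parser code: the helper `substitute_columns` and the `var<N>` naming scheme are assumptions made for illustration, modelled only on the `var0` that appears in the generated code in the error output.

```python
import re


def substitute_columns(expression, columns):
    """Hypothetical helper: replace each whole-word column name with var<N>.

    This only mimics the behaviour observed in the error message;
    it is not taken from the ROOT sources.
    """
    for index, name in enumerate(columns):
        # \b matches at word boundaries, so the "phi" in ".phi(" also matches.
        pattern = r"\b" + re.escape(name) + r"\b"
        expression = re.sub(pattern, f"var{index}", expression)
    return expression


expr = "ROOT::Math::PxPyPzEVector(0, 0, 0, 0).phi()"

# No column name occurs in the expression: it is left untouched.
naive_ok = substitute_columns(expr, ["Foo"])
print(naive_ok)

# A column named "phi" exists: the member-function name is rewritten as well.
naive_bad = substitute_columns(expr, ["phi"])
print(naive_bad)
```

A substitution that only looks at word boundaries cannot tell the column `phi` apart from the member function `phi()`, which would explain why the first case in the reproducer works and the second does not.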

Reproducer

The following Python example illustrates the problem:

import ROOT

ROOT.gROOT.SetBatch(True)

print("this works")
df1 = ROOT.RDataFrame(10).Define("Foo", "1.0")
df1 = df1.Define("test", "ROOT::Math::PxPyPzEVector(0, 0, 0, 0).phi()")
df1.Display().Print()

print("this doesn't work")
df2 = ROOT.RDataFrame(10).Define("phi", "1.0")
df2 = df2.Define("test", "ROOT::Math::PxPyPzEVector(0, 0, 0, 0).phi()")  # <- error: the `phi()` member function is confused with the `phi` column
df2.Display().Print()

This produces the output:

this works
+-----+----------+----------+
| Row | Foo      | test     |
+-----+----------+----------+
| 0   | 1.000000 | 0.000000 |
+-----+----------+----------+
| 1   | 1.000000 | 0.000000 |
+-----+----------+----------+
| 2   | 1.000000 | 0.000000 |
+-----+----------+----------+
| 3   | 1.000000 | 0.000000 |
+-----+----------+----------+
| 4   | 1.000000 | 0.000000 |
+-----+----------+----------+
this doesn't work
input_line_52:2:76: error: no member named 'var0' in 'ROOT::Math::LorentzVector<ROOT::Math::PxPyPzE4D<double> >'
auto func2(const double var0){return ROOT::Math::PxPyPzEVector(0, 0, 0, 0).var0()
                                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
input_line_55:2:76: error: no member named 'var0' in 'ROOT::Math::LorentzVector<ROOT::Math::PxPyPzE4D<double> >'
auto func2(const double var0){return ROOT::Math::PxPyPzEVector(0, 0, 0, 0).var0()
                                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
Traceback (most recent call last):
  File "/Users/bgrube/Temp/foo/./testBugColumnFunction.py", line 15, in <module>
    df2 = df2.Define("test", "ROOT::Math::PxPyPzEVector(0, 0, 0, 0).phi()")  # <- error: the `phi()` member function is confused with the `phi` column
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/site-packages/ROOT/_pythonization/_rdf_pyz.py", line 514, in _PyDefine
    return rdf._OriginalDefine(col_name, callable_or_str)
           ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
cppyy.gbl.std.runtime_error: Template method resolution failed:
  ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void>::Define(string_view name, string_view expression) =>
    runtime_error:
RDataFrame: An error occurred during just-in-time compilation. The lines above might indicate the cause of the crash
 All RDF objects that have not run an event loop yet should be considered in an invalid state.

  ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void>::Define(string_view name, string_view expression) =>
    runtime_error:
RDataFrame: An error occurred during just-in-time compilation. The lines above might indicate the cause of the crash
 All RDF objects that have not run an event loop yet should be considered in an invalid state.

ROOT version

  • ROOT 6.32.20, built for linuxx8664gcc from tags/6-32-20@6-32-20 with c++ (GCC) 11.5.0 20240719 (Red Hat 11.5.0-11)
  • ROOT 6.38.04, built for macosxarm64 from tags/6-38-04@6-38-04 with Apple clang version 17.0.0 (clang-1700.0.13.5) std201703

Installation method

build from source, MacPorts

Operating system

Linux, MacOS

Additional context

No response
