Skip to content

Enable cfi filter usage for epubs#116

Open
wrCisco wants to merge 1 commit intojohnfactotum:mainfrom
wrCisco:proposals
Open

Enable cfi filter usage for epubs#116
wrCisco wants to merge 1 commit intojohnfactotum:mainfrom
wrCisco:proposals

Conversation

@wrCisco
Copy link
Copy Markdown
Contributor

@wrCisco wrCisco commented Mar 7, 2026

This pull request adds the possibility to pass an optional cfi parser filter to the resolveCFI methods in epub.js and to toElement and fromElements functions in epubcfi.js.

@johnfactotum
Copy link
Copy Markdown
Owner

What's the use case for this? Are you mutating the OPF?

@wrCisco
Copy link
Copy Markdown
Contributor Author

wrCisco commented Apr 6, 2026

The use case is just the one mentioned in your readme, in the "epub cfi" section:

[The CFI parser] has the ability [to] ignore nodes, which is needed if you want to inject your own nodes into the document without affecting CFIs.

In particular, I give the possibility to insert a horizontally scrollable container around block math elements (all the major browsers now support natively MathML Core, but only Firefox handles breaklines). I subclassed the View class in view.js and I use a filter in the getCFI and resolveCFI methods, but for that to work, the epub's resolveCFI method and the CFI.toElement function need to accept the filter also. The default behavior of the foliate-js library remains unchanged.

@johnfactotum
Copy link
Copy Markdown
Owner

Well, fromElements and toElements are really only intended for dealing with spine items, in the .opf file. Unless you're mutating the OPF there's no reason to include a filter here, which is that's why I asked.

Also toRange is used in view.js when the book does not implement resolveCFI. That one's missing the filter option, too.

@wrCisco
Copy link
Copy Markdown
Contributor Author

wrCisco commented Apr 10, 2026

Ok, in my proposal I changed fromElements and toElements for consistency's sake with the rest of it, but if their intended usage is for spine items only, the change is redundant, I can remove it.

Also toRange is used in view.js when the book does not implement resolveCFI. That one's missing the filter option, too.

In my project I subclassed the View class and I added a filter in getCFI and resolveCFI, so I have something like:

getCFI(index, range) {
    const baseCFI = this.book.sections[index].cfi ?? CFI.fake.fromIndex(index)
    if (!range) return baseCFI
    return CFI.joinIndir(baseCFI, CFI.fromRange(range, cfiFilter))
}
resolveCFI(cfi) {
    if (this.book.resolveCFI)
        return this.book.resolveCFI(cfi, cfiFilter)
    else {
        const parts = CFI.parse(cfi)
        const index = CFI.fake.toIndex((parts.parent ?? parts).shift())
        const anchor = doc => CFI.toRange(doc, parts, cfiFilter)
        return { index, anchor }
    }
}

I can change the PR adding cfiFilter as a class property (which would be undefined by default) and these modifications. I wanted to keep the changes at a minimum and avoid cluttering the class with a new property, so I did not include this part.

@johnfactotum
Copy link
Copy Markdown
Owner

I suppose there's nothing inherently wrong with adding a filter for when dealing with the OPF, but it just seems a bit unexpected to me. And although you're using the filter when using CFI.toElement, it's not used in CFI.fromElements.

Perhaps it'd be better to design an interface for CFI, such that epub.js doesn't directly depend on epubcfi.js, so you'd be able to easily use any implementation.

Another idea is to not use CFIs at all in epub.js, and instead just make it expose the OPF document. Though this is problematic because the CFIs of the sections have to be associated with the section objects.

@wrCisco
Copy link
Copy Markdown
Contributor Author

wrCisco commented Apr 17, 2026

I didn't use the filter in CFI.fromElements in epub.js because the only call to that function happens when the epub is firstly initialized, and the modifications that I make to the epub happen right after that (in the data event handler of the transformTarget), so I use the filter in resolveCFI to get the same elements after the changes (but the only needed use of the filter for my use case is in CFI.toRange: in CFI.toElement and in CFI.fromElements it's superfluous anyway).

A use case for which I use CFIs outside the OPF is to save and retrieve the exact positions of page breaks when the epub has a page list (if I use directly the ids of the elements referenced in the page list to build a clickable index, I get an error if the element has the applied style display: none. Also, I show on screen the exact location of the page breaks, but browsers have a hard time to pinpoint the location of invisible, empty or whitespace only elements, so I build a range that starts with the page item and contains at least one visible character, I serialize that range as CFI, and then I use it as destination for the clickable index and as annotation.value for the overlayer).

Perhaps it'd be better to design an interface for CFI, such that epub.js doesn't directly depend on epubcfi.js, so you'd be able to easily use any implementation.

Should the implementation to be used therefore be passed as argument to the EPUB constructor (or init) method? But shouldn't the filter still be passed in the call to resolveCFI, or am I wrong?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants