Skip to content

Convert to js#291

Merged
lahma merged 23 commits into
sebastienros:mainfrom
jogibear9988:convertToJs
Jul 20, 2022
Merged

Convert to js#291
lahma merged 23 commits into
sebastienros:mainfrom
jogibear9988:convertToJs

Conversation

@jogibear9988
Copy link
Copy Markdown
Contributor

@jogibear9988 jogibear9988 commented Jul 4, 2022

rebase of the branch #154

@jogibear9988
Copy link
Copy Markdown
Contributor Author

@adams85 now you can review

@jogibear9988
Copy link
Copy Markdown
Contributor Author

maybe now I could also be based on AstVisitor, but I don't know atm... I've only rebased what I've programed long time ago

@jogibear9988 jogibear9988 mentioned this pull request Jul 4, 2022
@adams85
Copy link
Copy Markdown
Collaborator

adams85 commented Jul 5, 2022

I took a quick glance, so just some random thoughts before going into the details:

  • maybe now I could also be based on AstVisitor, but I don't know atm...

    You definitely should. It wouldn't be wise to duplicate the visitation dispatch logic because that would mean another location to keep in mind when changes are made later on.

  • The name ToJavascriptConverter is pretty weird. I suggest AstToJavascriptConverter, JavascriptWriter or something like this.

  • Declaring a class for a single extension method (ToJavascriptConverterExtension) doesn't look good either. We already have a NodeExtensions class, I think that would be a perfect place for the ToJavascript method.

  • I would keep the public API as minimal as possible, so I'd definitely keep the visitor internal and would introduce some options type. Take a look at AstJson.Options for ideas.

  • It might also be nice to separate the writing logic from the visitor as it is done in the case of JsonWriter and AstToJsonConverter. But, anyway, we should provide some API which allows consumers to write into a TextWriter because that is the standard way when it comes to generating text in .NET. A StringBuilder can always be wrapped in a TextWriter (see StringWriter) but not the other way around.

Ok, I think this is enough for starters. 😄

@adams85
Copy link
Copy Markdown
Collaborator

adams85 commented Jul 5, 2022

I would keep the public API as minimal as possible, so I'd definitely keep the visitor internal and would introduce some options type. Take a look at AstJson.Options for ideas.

On second thoughts, we can't really keep the visitor internal as extensions (like JSX) should be supported too. So I think our best bet is to copy the pattern of AstJson and AstToJsonConverter (+ JsxAstJsonConverter) like

  • AstJson -> JavascriptWriter
  • AstJson.Options -> JavascriptWriter.Options
  • AstToJsonConverter -> AstToJavascriptConverter
    etc.

If we did this, JavascriptWriter could be a better place for the ToJavascript convenience method.

Copy link
Copy Markdown
Collaborator

@adams85 adams85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The feature certainly needs some more polish but there are no major problems apart from one thing: properly handling the cases where expressions need to be wrapped in parentheses. This seem pretty unreliable yet, so needs some more thought.

protected StringBuilder _sb = new StringBuilder();
private int _indentionLevel = 0;

private readonly List<Node> _parentStack = new List<Node>();

This comment was marked as resolved.

}
}

public class ToJavascriptConverter

This comment was marked as resolved.

Comment on lines +10 to +16
public static class ToJavascriptConverterExtension
{
public static string ToJavascript(this Node node, bool beautify = false)
{
return ToJavascriptConverter.ToJavascript(node, beautify);
}
}

This comment was marked as resolved.

VisitNodeList(program.Body, appendAtEnd: ";", addLineBreaks: true);
}

protected virtual void VisitUnknownNode(Node node)

This comment was marked as resolved.

return _parentStack[_parentStack.Count - 1 - offset];
}

public virtual void Visit(Node node)

This comment was marked as resolved.

}
}

protected virtual void VisitClassDeclaration(ClassDeclaration classDeclaration)

This comment was marked as duplicate.

Visit(function.Body);
}

protected virtual void VisitClassExpression(ClassExpression classExpression)

This comment was marked as duplicate.

{
Visit(breakStatement.Label);
}
Append("break");

This comment was marked as duplicate.

return _sb.ToString();
}

public bool IsAsync(Node node)

This comment was marked as resolved.

}
}

protected virtual void VisitImport(Import import)

This comment was marked as resolved.

@lahma
Copy link
Copy Markdown
Collaborator

lahma commented Jul 6, 2022

Thanks @adams85 for going through this with such a detail! Just that we are on same page how to proceed as I know @jogibear9988 might have his hands full with esprima-next too. Some alternatives at least:

  1. merge as-is, @adams85 's comments are a TODO list for someone willing to do cleanup - we will have technical debt in main
  2. @jogibear9988 has the time to improve and does a round with the code improving things that he sees fit, then merge
  3. someone improves this PR further like above, then merge
  4. mystery solution

It's a great feature so I'd hope we can agree on some plan to make sure we get it into main, whether "perfect" or incrementally improving.

@jogibear9988
Copy link
Copy Markdown
Contributor Author

Thanks for the great review.
I only rebased my old code, so I expected issues.
Don't know when I find time again to work on it (I'm not full with esprima.next, but my https://github.com/node-projects/web-component-designer needs many of my time atm. and also work :-) )

For me one critical point is, the visitor should be public, and the methods virtual, cause i wanted to overwrite the logic wich some ast nodes are written as javascript, cause we use TemplateLiteralStrings in our component framework, with html and css code, wich I also like to parse and minify.

I think if someone works on it, I'll be fine, if not, I'll look when I've time and will do.
If someone works on it, post here, so we don't do the work twice

@adams85
Copy link
Copy Markdown
Collaborator

adams85 commented Jul 6, 2022

Since I've already got into this and I happen to have some free time now, I could as well make an attempt on finishing the feature. If it's okay with you, I'd just need write permissions to the PR.

For me one critical point is, the visitor should be public, and the methods virtual, cause i wanted to overwrite the logic wich some ast nodes are written as javascript

You mean you want to subclass it in another assembly? If so, maybe we can handle that using an InternalsVisibleTo attribute.

@lahma
Copy link
Copy Markdown
Collaborator

lahma commented Jul 6, 2022

Since I've already got into this and I happen to have some free time now, I could as well make an attempt on finishing the feature. If it's okay with you, I'd just need write permissions to the PR.

As I'm just "a contributor with some merge capabilities", I'm afraid I can't give the permissions. But if @jogibear9988 is OK with it, maybe you can branch from this PR into another PR adding fixes on top. I have no idea how the attribution of changes will flow after that 😬

@jogibear9988
Copy link
Copy Markdown
Contributor Author

Since I've already got into this and I happen to have some free time now, I could as well make an attempt on finishing the feature. If it's okay with you, I'd just need write permissions to the PR.

don't know how to do this. I've added you to my fork, will this help?

For me one critical point is, the visitor should be public, and the methods virtual, cause i wanted to overwrite the logic wich some ast nodes are written as javascript

You mean you want to sublass it in another assembly? If so, maybe we can handle that using an InternalsVisibleTo attribute.

Don't like this approach. Why should esprima-net have an internals-visible-to my assembly? Why not make it public? Think the best way for people to modify the JS output of some parts would be to override the visitor for the node they wanted to change.
(Okay one more Info, if someone does not know: If you'd like to access internals of an assembly without "InternalVisibleTo" there is this nuget package: https://github.com/aelij/IgnoresAccessChecksToGenerator, but as I see, I'd like to see it public)

@jogibear9988
Copy link
Copy Markdown
Contributor Author

Since I've already got into this and I happen to have some free time now, I could as well make an attempt on finishing the feature. If it's okay with you, I'd just need write permissions to the PR.

As I'm just "a contributor with some merge capabilities", I'm afraid I can't give the permissions. But if @jogibear9988 is OK with it, maybe you can branch from this PR into another PR adding fixes on top. I have no idea how the attribution of changes will flow after that 😬

For me this would be full okay. You also can close, create a new one... as you like. But I also added him to my fork

@adams85
Copy link
Copy Markdown
Collaborator

adams85 commented Jul 6, 2022

@jogibear9988
Copy link
Copy Markdown
Contributor Author

this is checked
image

@adams85
Copy link
Copy Markdown
Collaborator

adams85 commented Jul 6, 2022

remote: Permission to jogibear9988/esprima-dotnet.git denied to adams85.
fatal: unable to access 'https://github.com/jogibear9988/esprima-dotnet/': The requested URL returned error: 403

Maybe it's because I'm not added as a maintainer here?

@jogibear9988
Copy link
Copy Markdown
Contributor Author

@adams85 I've added you to my fork, but you need to accept the invite

@jogibear9988
Copy link
Copy Markdown
Contributor Author

I could not add you here, I'm also not a maintainer here.

@adams85
Copy link
Copy Markdown
Collaborator

adams85 commented Jul 6, 2022

Ok, it works now. Thanks!

@adams85
Copy link
Copy Markdown
Collaborator

adams85 commented Jul 6, 2022

Don't like this approach. Why should esprima-net have an internals-visible-to my assembly? Why not make it public? Think the best way for people to modify the JS output of some parts would be to override the visitor for the node they wanted to change.

Generally, it's a good idea to minimize the API surface area so we can make modification more freely without breaking consumer code. However, you have a good point here, so the visitor will remain public then.

BTW, what about this issue? Can Identifier.Name ever be null? I suspect it can't. Do any of you know it for sure?

@jogibear9988
Copy link
Copy Markdown
Contributor Author

@adams85
Copy link
Copy Markdown
Collaborator

adams85 commented Jul 6, 2022

Thanks for the clarification. I'll fix this too.

@CharlieEriksen
Copy link
Copy Markdown
Contributor

Wow @jogibear9988, what a beautiful piece of work. I've found myself having to use js-beautify/prettifier in my processing pipeline. Both introduce significant performance issues and are not super reliable.

This will be epic. I'll try to take a spin on your branch over the weekend and see if it works as-is on my workloads. Fingers crossed!

@adams85
Copy link
Copy Markdown
Collaborator

adams85 commented Jul 8, 2022

@CharlieEriksen It's a nice work indeed but please not that this is still a WIP. There are known cases where the AST is transformed incorrectly (e.g. operator precedence handling is pretty unreliable currently). The public API is going to change too. Hopefully, in a few days we'll have a maturer version.

@adams85
Copy link
Copy Markdown
Collaborator

adams85 commented Jul 18, 2022

This was a harder nut to crack than it looked at first sight (or even at second) but finally I proudly present the improved AST to JS conversion! 🎉 (Truth be told, there was so many changes, fixes and improvements that I virtually ended up with a complete rewrite...)

@jogibear9988 Please re-review it, especially the public API, which is mainly defined in the AstToJavascript class now. I also overrided Node.ToString() and added a meaningful debugger display format as it's done in the Roslyn AST. Some further things worth mentioning:

  • The converter logic was split into two components: AstToJavascriptConverter and JavascriptTextWriter. The former defines the structure of the output by traversing the AST, the latter is responsible for code formatting. To make this work, AstToJavascriptConverter provides rich context and formatting hints for JavascriptTextWriter. (See JavascriptTextWriter.WriteContext and the flags enums). The main benefit of this design that it allows inheritors to implement other code styles/customize existing ones without changing the visitor part. It still leaves the door open for extensions like JSX, however such cases obviously require extending both the visitor and writer classes.
  • We have two formatters currently: the base JavascriptTextWriter, which produces minimal output, and the KnRJavascriptTextWriter subclass, which implements K&R-style formatting (also provides some knobs to fine tune the formatting rules - see the related Options class).

I also adjusted your existing test cases and added another set of tests. The complete test suite used for testing the parser was reused: the fixture files get parsed, converted back to text, then re-parsed, finally the two resulting ASTs are compared (only structure and node types for now). So we can say that the feature is pretty comprehensively tested, however getting parentheses rules right is hard af so I'm still not 100% sure that every corner case and quirk is covered. But I think it should be ok for an initial version.

@CharlieEriksen Now you may give it a go, the foundations are there for both minimizing and prettifying.

Copy link
Copy Markdown
Collaborator

@lahma lahma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, you really went the extra mile with this implementation, seems like powerful stuff. I left some comments, mostly about code style as I don't have much other to give in the JS writing area.


partial class AstToJavascriptConverter
{
#region Statements
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't think we use regions

protected TokenFlags LastTokenFlags { [MethodImpl(MethodImplOptions.AggressiveInlining)] get; [MethodImpl(MethodImplOptions.AggressiveInlining)] private set; }
protected bool WhiteSpaceWrittenSinceLastToken { [MethodImpl(MethodImplOptions.AggressiveInlining)] get; [MethodImpl(MethodImplOptions.AggressiveInlining)] private set; }

#region White-space
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: regions are not preferred in codebase


public JavascriptTextWriter(TextWriter writer, Options options)
{
_writer = writer ?? throw new ArgumentNullException(nameof(writer));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should prefer EsprimaExceptionHelper for consistency

Copy link
Copy Markdown
Collaborator

@adams85 adams85 Jul 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, EsprimaExceptionHelper isn't used consistently throughout the code base.

As a matter of fact, I fail to see why it exists at all (when throw expressions are a thing since C# 7.0). What do we gain from e.g. Description = description ?? ThrowArgumentNullException<string>(nameof(description)); over Description = description ?? throw new ArgumentNullException(nameof(description));? I mean it's not even shorter and you need to type out the generic parameter too. Plus it breaks static analysis. (Not even DoesNotReturn solves this completely as the compiler still complains when it's used in a switch case branch.)

So how about getting rid of EsprimaExceptionHelper instead? I'm willing to do that if you sign off on it.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for JIT mostly, JIT cannot inline methods that have throw in them (call stack would be wrong).

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, I learned something today again! Thanks for this valuable info. I'll revise throw usage and change it to the helper method where inlining justifies it.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, method must of course be quite small to be inline candidate (that's why some usages would be more for consistency). Also I think that the return of T for some throw helpers might not work when wanting the "throw helper pattern", they might need to be void, but not an issue for this PR, just a mental note.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I annotated all my methods with [MethodImpl(MethodImplOptions.AggressiveInlining)] which I think it's beneficial to inline. Don't really care whether JIT inlines the rest or not, so in those cases I just use plain throws. As I see, this pattern is followed in other parts of the code (e.g. in JavascriptParser). So, are we good or do you insist on using the throw helpers everywhere?

Copy link
Copy Markdown
Collaborator

@lahma lahma Jul 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need, we can re-evaluate if needed later.


if (options is null)
{
throw new ArgumentNullException(nameof(options));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should prefer EsprimaExceptionHelper for consistency

Comment thread src/Esprima/Utils/AstToJavascript.cs Outdated
{
if (writerFactory is null)
{
throw new ArgumentNullException(nameof(writerFactory));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should prefer `EsprimaExceptionHelper`` for consistency

case TokenType.Template:
break;
default:
throw new InvalidOperationException();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should prefer EsprimaExceptionHelper for consistency, argument out of range?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can't really be an argument out of range because we don't switch on an argument here but a class field. Maybe NotImplementedException is something to consider.


if (!string.IsNullOrWhiteSpace(indent))
{
throw new ArgumentException("Indent must be null or white-space.", nameof(indent));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should prefer EsprimaExceptionHelper for consistency

/// </summary>
public class KnRJavascriptTextWriter : JavascriptTextWriter
{
public new record class Options : JavascriptTextWriter.Options
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the new here feels bad, should this just be KnRJavascriptTextWriterOptions? In deal world could just pass suitable Options instance to JavascriptTextWriter that configures KnR style, but seems that it wouldn't be that easy to implement.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I myself would have gone with KnRJavascriptTextWriterOptions but I wanted to keep it consistent with existing code: AstJon has had this nested option class thing, I just followed the pattern. So I suggest leaving it as is (or we should change it everywhere but that would mean unnecessary BCs).

In deal world could just pass suitable Options instance to JavascriptTextWriter that configures KnR style, but seems that it wouldn't be that easy to implement.

We need to keep the design extensible (for other code styles or AST extensions), so we have to use the strategy design pattern here in one form or the other. The current design does that and I can't really think of a simpler one. If we had a single writer class which accepted multiple types of options (it must be multiple types since the possible options for all possible code styles are not known in advance), then the writer would need to make decisions based on the received option type and eventually it would implement the same strategy pattern we have now, just internally.


if (!string.IsNullOrWhiteSpace(extendedOptions.Indent))
{
throw new ArgumentException("Indent must be null or white-space.", nameof(extendedOptions));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EsprimaExceptionHelper

}
}

#region Arrays
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regions...

@CharlieEriksen
Copy link
Copy Markdown
Contributor

Awesome work folks!

Just for background on my testing, here's how my pipeline works in part:

  1. I have some test harnesses that go and browser some large JS application, like the AWS console
  2. For each JS file loaded by the browser, that's sent to my code
  3. It will parse the JS with esprima-dotnet, to ensure that the "baseline" is parseable, or detect if there's some code that makes it trip up.
  4. It then passes it onto the prettifier, which now first uses this code for prettifying, or fall back to js-beautify or prettier
  5. It will then take the prettified code, and parse the code again like in step 3 to make sure the prettifying didn't break anything

I've run it through a part of my normal test set so far. I ran into one file that failed step 5:
https://a.b.cdn.console.awsstatic.com/e195e08e418def022e7ca90f52a774ddd0b54868d7c7d4162697e865948d6c10/vendor-react.js

It seems to generate the following:
for (var e = vi, t = e.length, n = "value" in gi ? gi.value : gi.textContent, r = n.length, o = 0; o < t && e[o] === n[o]; o++)

prettier generates:

 for (
        var e = vi,
          t = e.length,
          n = ("value" in gi) ? gi.value : gi.textContent,
          r = n.length,
          o = 0;
        o < t && e[o] === n[o];
        o++
      );

It seems like it's not putting the conditional expression in parentheses, which means that the in statement is invalid.

@CharlieEriksen
Copy link
Copy Markdown
Contributor

I'm also noticing that with the default beauty mode enabled, you end up with code that initially get indented in a pretty nice way. But at some point, it seems to give up on indentation for the most part.
image

Copy link
Copy Markdown
Collaborator

@lahma lahma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me, and we can always fine-tune later if necessary.

@lahma lahma enabled auto-merge (squash) July 20, 2022 15:26
@lahma lahma merged commit 1cd493f into sebastienros:main Jul 20, 2022
@lahma
Copy link
Copy Markdown
Collaborator

lahma commented Jul 20, 2022

Great work again folks, great to see work iterated and improved based on feedback cycle 🚀

@lahma
Copy link
Copy Markdown
Collaborator

lahma commented Jul 20, 2022

I've published the latest main on NuGet (including this PR) as version 3.0.0-beta-3.

@adams85 adams85 mentioned this pull request Jul 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants