indutny
left a comment
I like the direction of this! Great work!
src/implementation/js/node/int.ts
| const ctx = this.compilation; | ||
| const index = ctx.stateField(this.ref.field); | ||
|
|
||
| out.push(`${index} = (${ctx.bufArg()}[${ctx.offArg()}] & 2 ** 7) * 0x1fffffe;`) |
Nitpick: Let's use 1 << 7 instead of 2 ** 7. Mixing binary and floating point operators looks fishy.
I'm not sure that this line does what you want it to do, but I understand that this is work in progress. We can discuss it later.
This is based on how Node.js Buffer also implements reading integers: https://github.com/nodejs/node/blob/f5512ff61ecb668c2f49b7c05d3227ef7aa5e85f/lib/internal/buffer.js#L416
I thought being consistent with core would make sense here? 🤔 I'm not sure what the pros or cons of switching to 1 << 7 would be. 🤷‍♂️
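For reference, here is a minimal sketch of the sign-extension trick under discussion, as I understand it from the linked Node.js code. The helper names are hypothetical and are not part of llparse. Note that the pattern ORs the original byte back in; whether the generated line in this diff does the equivalent elsewhere is not visible in this excerpt.

```typescript
// Hypothetical helper following the multiply-based pattern discussed above.
function readInt8(buf: Uint8Array, offset: number): number {
  const val = buf[offset];
  // (val & 2 ** 7) is 128 when the sign bit is set, otherwise 0.
  // 128 * 0x1fffffe === 0xffffff00, so OR-ing it in sets bits 8..31.
  return val | ((val & 2 ** 7) * 0x1fffffe);
}

// The same result using only integer (bitwise) operators.
function readInt8Shift(buf: Uint8Array, offset: number): number {
  return (buf[offset] << 24) >> 24;
}

const bytes = new Uint8Array([0x7f, 0x80, 0xff]);
console.log(readInt8(bytes, 0), readInt8(bytes, 1), readInt8(bytes, 2));
// prints "127 -128 -1"
```

Both variants agree for every byte value, so the choice between 2 ** 7 and 1 << 7 is mostly about readability here.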
We could also just inline the result of the multiplication / shifting by changing this to:
| out.push(`${index} = (${ctx.bufArg()}[${ctx.offArg()}] & 2 ** 7) * 0x1fffffe;`) | |
| out.push(`${index} = (${ctx.bufArg()}[${ctx.offArg()}] & ${2 ** 7}) * 0x1fffffe;`); |
What do you think?
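To make the difference concrete, here is a small sketch of what each variant would emit. The three string constants are hypothetical stand-ins for ctx.stateField(), ctx.bufArg() and ctx.offArg(); the real values depend on the compilation context.

```typescript
// Hypothetical stand-ins for ctx.stateField(), ctx.bufArg() and ctx.offArg().
const index = 'state.value';
const buf = 'buf';
const off = 'off';

// The expression is emitted verbatim and left for the JS engine to evaluate
// (engines typically constant-fold it).
const verbatim = `${index} = (${buf}[${off}] & 2 ** 7) * 0x1fffffe;`;

// The expression is evaluated once at generation time; only 128 is emitted.
const inlined = `${index} = (${buf}[${off}] & ${2 ** 7}) * 0x1fffffe;`;

console.log(verbatim); // state.value = (buf[off] & 2 ** 7) * 0x1fffffe;
console.log(inlined);  // state.value = (buf[off] & 128) * 0x1fffffe;
```

With the inlined form the generated parser contains no ** operator at all, which sidesteps the operator-mixing concern in the emitted code.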
| } | ||
|
|
||
| case 1: { | ||
| out.push(`${index} += ${ctx.bufArg()}[${ctx.offArg()}] * 2 ** 8;`); |
Nitpick: same here, let's use << for binary operations.
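One thing that may be worth checking before switching every case to <<: in JavaScript the shift operators work on 32-bit signed integers, so the two spellings agree for 8- and 16-bit shifts but diverge at 24 bits when the top byte is 0x80 or larger. A minimal sketch with an illustrative byte value:

```typescript
const byte = 0xff;

// Identical results while the value stays below 2 ** 31.
const shifted16 = byte << 16;         // 16711680
const multiplied16 = byte * 2 ** 16;  // 16711680

// Shifting into bit 31 wraps to a negative int32; multiplying does not.
const shifted24 = byte << 24;         // -16777216
const multiplied24 = byte * 2 ** 24;  // 4278190080

// An unsigned right shift by zero recovers the unsigned value.
const recovered = (byte << 24) >>> 0; // 4278190080

console.log(shifted16 === multiplied16, shifted24 === multiplied24);
// prints "true false"
```

So << looks safe for the * 2 ** 8 and * 2 ** 16 lines, while the * 2 ** 24 line of an unsigned read would need either the multiplication or a trailing >>> 0.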
| } | ||
|
|
||
| case 2: { | ||
| out.push(`${index} += ${ctx.bufArg()}[${ctx.offArg()}] * 2 ** 16;`); |
| } | ||
|
|
||
| case 1: { | ||
| out.push(`${index} += ${ctx.bufArg()}[${ctx.offArg()}] * 2 ** 8;`); |
| } | ||
|
|
||
| case 1: { | ||
| out.push(`${index} += ${ctx.bufArg()}[${ctx.offArg()}] * 2 ** 8;`); |
| } | ||
|
|
||
| case 3: { | ||
| out.push(`${index} += ${ctx.bufArg()}[${ctx.offArg()}] * 2 ** 24;`); |
|
|
||
| switch (this.ref.byteOffset) { | ||
| case 0: { | ||
| out.push(`${index} = ${ctx.bufArg()}[${ctx.offArg()}] * 2 ** 8;`); |
|
|
||
| switch (this.ref.byteOffset) { | ||
| case 0: { | ||
| out.push(`${index} = ${ctx.bufArg()}[${ctx.offArg()}] * 2 ** 16;`); |
| } | ||
|
|
||
| case 1: { | ||
| out.push(`${index} += ${ctx.bufArg()}[${ctx.offArg()}] * 2 ** 8;`); |
| } | ||
| } | ||
|
|
||
| private readUInt24BE(out: string[]) { |
Sounds like it could be generalized a bit. There is no runtime savings from having separate methods for 16, 24, etc bits. These functions are executed only at compile time, right?
Yes. My thinking was to keep these separated for readability - having all the logic in a single huge if/else if/else statement made this really hard to understand. 🤷‍♂️ Let's see what this looks like once all the other operations are in place too.
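As one possible shape for the generalization suggested above, here is a rough sketch of a single compile-time emitter for unsigned big-endian reads of any byte length. The function name and parameters are hypothetical and not llparse's API; the field, buf and off strings stand in for what ctx.stateField(), ctx.bufArg() and ctx.offArg() would return.

```typescript
// Hypothetical generalization: emits the line that consumes one byte of an
// unsigned big-endian integer that is `byteLength` bytes wide.
function emitReadUIntBE(
  field: string,
  buf: string,
  off: string,
  byteOffset: number,
  byteLength: number,
): string {
  if (byteOffset < 0 || byteOffset >= byteLength) {
    throw new RangeError(`byteOffset ${byteOffset} out of range`);
  }
  // The first byte assigns the field; later bytes accumulate into it.
  const op = byteOffset === 0 ? '=' : '+=';
  // Big-endian: the first byte is the most significant one.
  const shift = 8 * (byteLength - 1 - byteOffset);
  const scale = shift === 0 ? '' : ` * 2 ** ${shift}`;
  return `${field} ${op} ${buf}[${off}]${scale};`;
}

// Reproduces the per-case lines of the 24-bit reader in the diff.
for (let i = 0; i < 3; i++) {
  console.log(emitReadUIntBE('f', 'buf', 'off', i, 3));
}
```

This keeps each width out of a large if/else chain because the width is just a parameter, though the signed and little-endian variants would still need their own handling.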

@indutny I see that properties don't carry any sign information - is that a purposeful design decision in llparse? Does that mean that property access in C needs to be cast appropriately to not get the unsigned value? Am I missing something? 🤔

@arthurschreiber this is a design decision that was historically motivated by bitcode output. C has to cast fields if a signed access is required.

@arthurschreiber Sorry for asking this 5 years later, but I thought this was an interesting idea, since I have wanted to write a SOCKS5 protocol parser at some point. Would it be smart to try re-implementing this concept on top of the newer updates that have been made to llparse?

Feel free to take this for a spin and see if you can get it into a mergeable state. For my purpose (parsing binary data from a stream in JavaScript), I've switched to a VM-inspired parsing approach instead. I can see if I can dig up the code if you're interested?

@arthurschreiber When I open the pull request again, I will be sure to ask you if I have questions; otherwise I should be fine. I actually attempted to simulate this library in Python a couple of years back, so I should have no problem re-learning what I learned previously.

add integer parsing based off nodejs/llparse#32 And Fix Pausing
This adds a very rough, unfinished implementation for parsing little- and big-endian integers, based on my question here: #31
See also these 2 PRs:
Int node to parse intBE, intLE, uIntBE and uIntLE values. llparse-builder#1
Int nodes. llparse-frontend#1
This only includes the code to generate JS output (for now), and has no tests at all. 🤷‍♂️ I'll try to flesh this out if I can find some more time.
Example input
Example output