Enable separate testing of encoding for both “pre-parsed” and “parsed” output #49
Open
sideshowbarker wants to merge 1 commit into master
This change adds a new EncodingPrescanTester class, which tests the output of running only the meta prescan on test data; that is, cases where the expected result is limited to what can be determined by checking only the first 1024 bytes of the input stream.
The changes in the PR branch add support for correctly testing both (1) cases where the expected result is the character encoding determined after fully parsing the test data, and (2) cases where the expected result is the output of the encoding sniffing algorithm, in particular the “prescan a byte stream to determine its encoding” algorithm (aka “meta scan”).
The support is implemented by making the internal sniffing limit settable.
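To illustrate why a settable sniffing limit lets the two kinds of expectation be tested separately, here is a minimal Python sketch. It is not the parser's actual code: the `prescan` function, the `sniffing_limit` parameter, and the regex-based matching are all hypothetical simplifications, and the real prescan algorithm handles comments, attributes, and `http-equiv` far more carefully.

```python
import re

# Hypothetical, heavily simplified stand-in for the meta prescan:
# it only looks for charset=... inside a <meta> start tag.
_META_CHARSET = re.compile(
    rb"<meta[^>]*?charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE
)

# Number of bytes the prescan examines by default (the 1024 bytes mentioned above).
DEFAULT_SNIFFING_LIMIT = 1024


def prescan(data: bytes, sniffing_limit: int = DEFAULT_SNIFFING_LIMIT):
    """Return the charset label found in the first sniffing_limit bytes, or None."""
    match = _META_CHARSET.search(data[:sniffing_limit])
    return match.group(1).decode("ascii").lower() if match else None


# One input declares its encoding early; the other only after the first 1024 bytes.
early = b'<meta charset="UTF-8"><title>t</title>'
late = b"<!-- " + b"x" * 1100 + b" -->" + b'<meta charset="windows-1252">'

prescan_early = prescan(early)                       # found within the limit
prescan_late = prescan(late)                         # declaration lies past the limit
full_late = prescan(late, sniffing_limit=len(late))  # limit raised to cover the input
```

With the default limit the late declaration is invisible to the prescan, so a "pre-parsed" expectation and a "parsed" expectation for the same input can legitimately differ. Raising the limit is only how this sketch approximates the fully parsed result.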
Relates to #47
Relates to html5lib/html5lib-tests#130