parsing - Will rewriting a multipurpose log file parser to use formal grammars improve maintainability?
TL;DR: If I have built a multipurpose parser with separate code for each format, would it work better in the long run to use a single piece of parser code plus an ANTLR, PyParsing, or similar grammar to specify each format?
Context: At my job I have many benchmark log files from ~50 different benchmarks. Some are XML, some HTML, some CSV, and many more are proprietary formats without any documentation. To save my colleagues and me from entering this data by hand, I wrote a parsing tool that handles all the formats we deal with regularly through the same interface. The design, however, is not very clean.
I wrote it in Python and created a Parser class. Each file format is handled by an implementation that provides its own code for the parser's read() method. I like the idea of having only one parser definition that uses a grammar to understand each format, but I have never done that before.
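For readers who want to picture the design described above, here is a minimal sketch of one interface with a separate read() implementation per format. The class names and the "name = value" format are hypothetical, made up for illustration; they are not the actual tool.

```python
import csv
import io
from abc import ABC, abstractmethod


class Parser(ABC):
    """Common interface: every format returns a list of result dicts."""

    @abstractmethod
    def read(self, text):
        """Parse the full text of one log file."""


class CsvParser(Parser):
    def read(self, text):
        # csv.DictReader takes the first row as the field names.
        return [dict(row) for row in csv.DictReader(io.StringIO(text))]


class KeyValueParser(Parser):
    """Hypothetical proprietary format: one 'name = value' pair per line."""

    def read(self, text):
        record = {}
        for line in text.splitlines():
            if "=" in line:
                key, value = line.split("=", 1)
                record[key.strip()] = value.strip()
        return [record]
```

Each new format means another subclass with hand-written logic, which is the maintenance cost the question is asking about.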
Is it worth my time, and will the tool be easier for newcomers to work with once I finish refactoring?
I cannot answer your question with 100% certainty, but I can give you an opinion.
The choice between a proper grammar and a regex "parser" often comes down to how uniform the input is.
If the input is very uniform and you already know a language that works well with strings, such as Python or Perl, then I would keep the current code.
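As a rough illustration of the uniform case, a single regular expression is often all that is needed. The line layout below (name, run count, time in seconds) is an assumed example format, not one of the real benchmark logs.

```python
import re

# Assumed layout: "<name> <runs> <seconds>s", one result per line.
LINE = re.compile(r"^(?P<name>\w+)\s+(?P<runs>\d+)\s+(?P<seconds>\d+\.\d+)s$")


def parse_uniform(text):
    """Parse every line with the same pattern; fail loudly on anything else."""
    results = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match is None:
            raise ValueError(f"unexpected line: {line!r}")
        results.append({
            "name": match["name"],
            "runs": int(match["runs"]),
            "seconds": float(match["seconds"]),
        })
    return results
```

This stays readable only as long as every line really does follow the one pattern.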
On the other hand, I find that parser generators, like ANTLR, really shine when there are errors and inconsistencies in the input. The reason is that a formal grammar lets you focus on matching in a specific context without having to walk the input stream manually.
In addition, if there is an error in the input stream, I find it is often easier to deal with using ANTLR than with regex. The reason is that when several alternatives are available, ANTLR has built-in functionality for choosing the right path, including backtracking through predicates.
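To make "choosing among alternatives with rollback" concrete, here is a small hand-rolled sketch in plain Python. It is not ANTLR's API or implementation; the helper names (token, seq, choice) and the two line formats are hypothetical and only show the general idea of trying one alternative and falling back to the next from the same position.

```python
import re


def token(pattern):
    """Build a parser that matches `pattern` at the current position."""
    regex = re.compile(pattern)

    def parse(text, pos):
        match = regex.match(text, pos)
        return (match.group(0), match.end()) if match else None

    return parse


def seq(*parsers):
    """Match all parsers in order, or fail as a whole."""

    def parse(text, pos):
        values = []
        for parser in parsers:
            result = parser(text, pos)
            if result is None:
                return None  # caller still holds the old position: rollback
            value, pos = result
            values.append(value)
        return values, pos

    return parse


def choice(*parsers):
    """Try each alternative from the same starting position."""

    def parse(text, pos):
        for parser in parsers:
            result = parser(text, pos)
            if result is not None:
                return result
        return None

    return parse


name = token(r"\w+")
ws = token(r"\s+")
number = token(r"\d+(?:\.\d+)?")
timed = seq(name, ws, number, token(r"s\b"))            # e.g. "sort 1.5s"
counted = seq(name, ws, number, ws, token(r"ops\b"))    # e.g. "hash 200 ops"
entry = choice(timed, counted)
```

With this sketch, entry("hash 200 ops", 0) first tries the timed alternative, fails after the number, and then succeeds with the counted alternative starting again from position 0. A parser generator gives you this kind of behavior from the grammar itself instead of from hand-written control flow.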
Having said all this, there is a lot to be said for working code. I find that if I want to rewrite something, I try to make a good case for how the rewrite will benefit the user of the product.