.net - Parsing Documents with a DSL

May 15, 2014

I am trying to come up with a way to process about one million formal documents (for argument's sake, thesis documents). They are not all standardized, but they are close: each has titles, sections, paragraphs, and so on. There are subtle differences that can cause trouble, though. For example, a title is labeled "title" in English but "titre" in French.

The best way I can think of to handle this is to create an EBNF grammar that lists all the possible combinations, for example title := Title | Titre.
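To make the idea concrete, here is a rough sketch of that kind of rule as a keyword table compiled into case-insensitive patterns. It is in Python rather than C#, purely for brevity. The names `HEADING_KEYWORDS`, `build_rule`, and `classify_line` are made up for illustration, and the assumption that a heading keyword starts the line (optionally followed by a colon) is mine, not something all the documents guarantee.

```python
import re

# Hypothetical keyword table: each grammar rule lists the spellings it accepts,
# mirroring an EBNF alternative such as  title := "Title" | "Titre".
HEADING_KEYWORDS = {
    "title": ["title", "titre"],
    "section": ["section"],
}


def build_rule(alternatives):
    """Compile `a | b | ...` into a case-insensitive pattern anchored at line start."""
    body = "|".join(re.escape(word) for word in alternatives)
    # \b stops "Titles" from matching "title"; the colon after the keyword is optional.
    return re.compile(rf"^\s*(?:{body})\b\s*:?\s*(?P<text>.*)$", re.IGNORECASE)


RULES = {name: build_rule(words) for name, words in HEADING_KEYWORDS.items()}


def classify_line(line):
    """Return (rule_name, remaining_text) for the first matching rule, else None."""
    for name, pattern in RULES.items():
        match = pattern.match(line)
        if match:
            return name, match.group("text").strip()
    return None


if __name__ == "__main__":
    for sample in ["Titre: Analyse des données", "TITLE  My Thesis", "Introduction"]:
        print(repr(sample), "->", classify_line(sample))
```

A table like this is easy to extend with more languages, but it only recognizes single lines; it says nothing about nesting sections and paragraphs, which is where a real grammar tool would have to take over.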

I am not too worried about coming up with the EBNF. My main concern is how to do the parsing itself. I have looked at ANTLR, OSLO, Irony, and many others, but I do not have the expertise to decide whether any of them is a good fit for my task.

So, my questions to you:

Which DSL tool would you recommend for parsing documents on this scale?
Which DSL tool is the most accurate at parsing? We would have to define rules for uppercase and lowercase, and also handle Roman numerals and numbering in foreign languages (French).
Is there a process or algorithm other than a DSL that I am not considering? (Writing a parser from scratch is an option, but I want to get results quickly.)
Has anyone tried training a learning algorithm (genetic algorithm or neural network) to parse through a DSL and add intelligence to it?
Would you use these DSL tools in a production environment?
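On the Roman numeral point, a hand-written fallback is easy enough to sketch, which is partly why I wonder whether a DSL is needed for that piece at all. The following is only an illustration in Python (not my C# target). `roman_to_int` and `parse_section_number` are hypothetical helper names, and it assumes section labels are either plain Arabic digits or well-formed Roman numerals up to 3999, in any letter case.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Strict pattern for well-formed numerals in the range 1..3999.
ROMAN_PATTERN = re.compile(
    r"^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$"
)


def roman_to_int(text):
    """Convert a Roman numeral (any letter case) to an int, or return None if malformed."""
    numeral = text.strip().upper()
    # The pattern also matches the empty string, so reject that explicitly.
    if not numeral or not ROMAN_PATTERN.match(numeral):
        return None
    total = 0
    for index, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        following = ROMAN_VALUES.get(numeral[index + 1 : index + 2], 0)
        # A smaller value before a larger one is subtracted (IV = 4, XC = 90).
        total += -value if following > value else value
    return total


def parse_section_number(token):
    """Parse a section label such as '3', '3.', 'iv' or 'XII.' into an int, else None."""
    token = token.strip().rstrip(".")
    if token.isascii() and token.isdigit():
        return int(token)
    return roman_to_int(token)


if __name__ == "__main__":
    for label in ["3.", "iv", "XII.", "IIII", "chapter"]:
        print(repr(label), "->", parse_section_number(label))
```

This covers case and Roman numerals only. French ordinals and spelled-out numbers would need their own lookup, which this sketch does not attempt.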

My development platform of choice is C#. Ideally, I would like to integrate the DSL tool into our code so that it works with our current app.


I came across a tool called ... It is not exactly what I need, but its source code is available for me to look at, and it also generates what is needed.
