First of all: Hello everybody!
Let me first tell you a little about this new application called 'Topicalizer'. The basic idea was to create a tool that lets users analyse texts with regard to a range of linguistic aspects, including type/token frequency, lexical density, sentence and paragraph structure, readability, word frequencies, collocations and possible keywords.
Furthermore, Topicalizer can automatically generate an abstract of a text, giving the user some insight into what the text might be about.
A text can be provided either as plain text or by specifying the URL where it can be found. If the text is embedded in valid HTML, XHTML or XML, Topicalizer can parse and then analyse it as well. However, the parser does not yet cope well with invalid markup, so please use the URL option with some care!
Moreover, the plain text option will usually yield more accurate results anyway, since plain text does not contain the design and navigation elements such as menus, headers and footers that normally occur in (X)HTML pages. These elements pose no problem if the text is long enough for them to be statistically irrelevant. If the text is rather short, however, they may well have an impact on the results.
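To illustrate why navigation elements matter, here is a minimal Python sketch of how the running text of a page could be separated from menus, headers and footers before analysis. This is not Topicalizer's actual parser; the class name `MainTextExtractor`, the function `extract_main_text` and the list of skipped tags are assumptions made purely for illustration, and many real pages mark up their menus with other elements.

```python
from html.parser import HTMLParser


class MainTextExtractor(HTMLParser):
    """Collects text, ignoring anything inside typical page-furniture tags."""

    # Hypothetical choice of tags to skip; real pages often use other markup.
    SKIPPED = {"nav", "header", "footer", "script", "style"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while we are inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside every skipped element.
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_main_text(html):
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = (
    "<html><body><nav><a href='/'>Home</a></nav>"
    "<p>Actual article text.</p>"
    "<footer>Imprint</footer></body></html>"
)
print(extract_main_text(page))
```

On a short page like this one, the words 'Home' and 'Imprint' would otherwise make up half of all tokens, which is exactly the kind of distortion described above.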
By specifying the language of the text, you tell Topicalizer to use language-specific settings such as stop words, syllable structure etc., which makes the results much more relevant.
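To make a few of these measures concrete, here is a rough Python sketch of how type/token ratio, lexical density and word frequencies could be computed with a language-specific stop word list. It is not Topicalizer's implementation: the tiny `STOP_WORDS` set, the simple tokenizer and the function `analyse` are assumptions for illustration, and treating every non-stop word as a content word is only an approximation of lexical density.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny English stop word list.
STOP_WORDS = {"the", "a", "an", "and", "of", "on", "is", "in", "to"}


def tokenize(text):
    # Very simple tokenizer: lower-cased runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def analyse(text, stop_words=STOP_WORDS):
    tokens = tokenize(text)
    types = set(tokens)
    # Approximation: every token that is not a stop word is a content word.
    content = [t for t in tokens if t not in stop_words]
    total = len(tokens)
    return {
        "tokens": total,
        "types": len(types),
        "type_token_ratio": len(types) / total if total else 0.0,
        "lexical_density": len(content) / total if total else 0.0,
        "top_words": Counter(content).most_common(3),
    }


result = analyse("The cat sat on the mat and the dog sat on the rug")
print(result)
```

Without the stop word list, 'the' would be the most frequent word in this sample; with it, 'sat' comes out on top, which is closer to what a reader would consider relevant.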
Another interesting feature of Topicalizer is its API (http://www.topicalizer.com/api/), which can be used to embed the functionality of Topicalizer in any kind of application that has access to the Internet.
Now you might ask what this software could possibly be used for. The most obvious purpose is to offer web authors a tool with which they can optimise their texts for complexity, readability and, last but not least, search engines.
Apart from the analysis features, Topicalizer automatically suggests keywords and creates an abstract for a given text, and can therefore be used for tagging texts with semantic information.
Enjoy this software and stay tuned, as there are more features still to come.
Björn Wilmsmann, developer and operator of Topicalizer