Innovative solutions for content and information mining

In today’s highly competitive and fast-paced world of Information Technology, in-depth knowledge of your business environment is the key to competitive advantage. Because it is largely based on information exchange and sharing, the current economy requires highly specialized knowledge management skills at all organizational levels.

Meeting the needs for knowledge detection from unstructured information

With the rapid expansion of corporate intranets and the introduction of more powerful computers, the need for tools and solutions that provide simpler and more intuitive access to information has grown steadily since the mid-nineties. However, the efforts spent on textual data sources, which represent 80% of all relevant information for a company, have produced only partial results. This is due both to qualitative problems, related to the real difficulty of automating information retrieval and management, and to quantitative problems, represented by the ever-growing volume of documents. Business search engines, by answering queries with long lists of mainly irrelevant documents, have made information retrieval more complex instead of simplifying it.

Improving business efficiency with advanced search capabilities

In general, poor knowledge management practices can negatively affect a company’s productivity and jeopardize its market position. The need to integrate structured data with unstructured “hidden” information, which is worthless until it becomes easily accessible, makes corporate governance even more complicated. Much of the relevant information can only be detected by reading between the lines. E-mail messages, telephone conversations or Web pages, once converted into usable knowledge, can represent an inexhaustible and always up-to-date data source on which companies can base their decision-making processes. In spite of this, about 50% of the time spent on information handling activities is dedicated to searching and looking up documents, 10% is related to unsuccessful searching, while another 20% is taken up by total or partial text rewriting.

Text Mining and Semantics: technology and strategies for knowledge management best practices

Text Mining and Semantics address the Information Overload that results from information management activities. They help manage a wide range of texts automatically, by classifying them according to thematic categories and by extracting the most relevant information from them. Text Mining methodologies involve Linguistic and Statistical Analysis.

Linguistic Analysis is used to process the morphological, syntactic, logic-functional and semantic structure of a text and to identify its key elements. Firstly, morpho-syntactic analysis classifies each word from a grammatical point of view, so as to reduce the number of concepts describing a text. For instance, in a text concerning politics or business, proper names of people, places and organizations can be used to easily identify a specific thematic category, while adjectives in an e-mail or a blog message can indicate whether a product or a service is evaluated positively or negatively. Secondly, logic-functional analysis helps identify who is doing what, how, when and where. Finally, semantic analysis interprets the underlying meaning of each word. Through appropriate data projections, it is possible to estimate, with a good degree of approximation, the customer satisfaction level in relation to specific business initiatives or campaigns. If a company name or brand is placed in the “word space” and assessed against criteria such as “good”, “beautiful”, “young” and “reassuring”, it becomes possible to discover the real attitude of actual or potential users towards the products or services offered by that company or brand (Semiometry and Brand Analysis).
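The "word space" projection described above can be illustrated with a minimal sketch. It assumes that each word is already represented as a numeric vector and that closeness is measured with cosine similarity; the text does not specify either choice. The three-dimensional vectors, the brand name "acme" and the helper names `cosine` and `brand_profile` are hypothetical and chosen only for illustration. A real system would derive the vectors from a large text corpus.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical, hand-made word space (assumption: not real data).
WORD_SPACE = {
    "good":       [0.9, 0.1, 0.2],
    "beautiful":  [0.7, 0.6, 0.1],
    "young":      [0.1, 0.9, 0.3],
    "reassuring": [0.8, 0.0, 0.6],
    "acme":       [0.8, 0.2, 0.5],  # hypothetical brand
}

def brand_profile(brand, criteria, space=WORD_SPACE):
    """Score a brand against each criterion word in the word space."""
    return {c: round(cosine(space[brand], space[c]), 3) for c in criteria}

profile = brand_profile("acme", ["good", "beautiful", "young", "reassuring"])
for criterion, score in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{criterion:<11} {score}")
```

With these toy vectors the brand sits closer to "reassuring" than to "young". On real data, a ranking of this kind is what would be read as the perceived attitude towards the brand.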

Statistical Analysis is used to assign documents both to predefined and customized thematic categories (Categorization) and to classification schemes that are not known in advance (Clustering). In the case of Categorization, an article can automatically be associated with thematic categories such as politics, business, sports or arts, while an e-mail message can automatically be redirected to the relevant company department. In the case of Clustering, texts are classified through spontaneous aggregation. For instance, complaints and suggestions related to a company’s products and services are aggregated in various ways, so that the company can take advantage of new and original perspectives and trends.
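The difference between Categorization and Clustering can be made concrete with a small sketch. It assumes a bag-of-words representation and cosine similarity, which are common choices but are not stated in the text. The seed vocabularies, the similarity threshold and the function names `categorize` and `cluster` are hypothetical. Categorization compares a document with categories defined in advance, while the clustering routine here is a simple single-pass aggregation that creates groups as it goes.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Lower-case word counts for a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical seed vocabulary per predefined category (assumption).
CATEGORY_SEEDS = {
    "politics": "election parliament minister vote government",
    "business": "market shares profit company merger",
    "sports":   "match team goal championship coach",
}

def categorize(text, seeds=CATEGORY_SEEDS):
    """Categorization: pick the most similar predefined category.
    If no seed word matches, the first category is returned."""
    doc = bag_of_words(text)
    return max(seeds, key=lambda name: cosine(doc, bag_of_words(seeds[name])))

def cluster(texts, threshold=0.3):
    """Clustering: group texts without predefined categories.
    Each text joins the most similar existing group if the similarity
    reaches the threshold; otherwise it starts a new group."""
    clusters = []  # list of (centroid word counts, member texts)
    for text in texts:
        doc = bag_of_words(text)
        best = max(clusters, key=lambda c: cosine(doc, c[0]), default=None)
        if best is not None and cosine(doc, best[0]) >= threshold:
            best[0].update(doc)   # fold the text into the group centroid
            best[1].append(text)
        else:
            clusters.append((doc, [text]))
    return [members for _, members in clusters]

print(categorize("The minister lost the vote in parliament"))
feedback = [
    "delivery was late again",
    "late delivery of my order",
    "the invoice total is wrong",
    "wrong total on the invoice",
]
for group in cluster(feedback):
    print(group)
```

In this toy run the customer messages fall into a delivery group and an invoicing group, although neither topic was declared beforehand. The threshold controls how readily texts are merged and would need tuning on real data.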