Web Mining Features and Functions
Web mining is the integration of information gathered by traditional data mining methodologies and techniques with information gathered over the World Wide Web.
- Web content mining extracts patterns from online information, such as HTML files, images, or e-mails
- Web structure mining analyzes the link structure of the web to identify preferable documents
- Web usage mining analyzes user interactions, which are recorded whenever requests for resources are received
- Concise notation based on familiar ANSI standard SQL, including joins, grouping, sorting, and set operations
- Full-featured IDE includes syntax highlighting, graphical execution, real-time result delivery, and network monitoring
- Extraction techniques for unstructured, semi-structured, and structured data
- Reads and writes common formats and sources such as HTML, XML, PDF, DOC, CSV, TSV, images, and databases
- Integration options include Java, .NET, ActiveX, and C++; queries can also be exposed as Web services
- Facilities for error trapping and reporting
- XML support, including extraction of data using XPath and transformation of input or output using XSLT
- Transparent support for Web functionality such as scripts, forms, cookies, user agents, frames, tables, and authentication
- Development environment for organizations to create their own Web mining solutions
- Parallel deployment engine extracts information at high speed
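
The three kinds of web mining named at the top of this list can be illustrated with a short, generic sketch. It uses only the Python standard library and does not represent the product described above or its SQL-based notation. The page contents, the log lines, and the helper names (`extract_links`, `inlink_counts`, `requests_per_path`) are all hypothetical and invented for illustration; real server logs carry more fields than the simplified "METHOD PATH" lines assumed here.

```python
from collections import Counter
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Content mining: collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the link targets found in an HTML string, in document order."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


def inlink_counts(pages):
    """Structure mining: count how many distinct pages link to each target."""
    counts = Counter()
    for source, html in pages.items():
        # set() so that a page linking twice to the same target counts once.
        for target in set(extract_links(html)):
            if target != source:
                counts[target] += 1
    return counts


def requests_per_path(log_lines):
    """Usage mining: count requests per path from 'METHOD PATH' log lines."""
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) >= 2:
            counts[parts[1]] += 1
    return counts


# Hypothetical toy data, invented for illustration only.
PAGES = {
    "/a.html": '<a href="/b.html">B</a> <a href="/c.html">C</a>',
    "/b.html": '<a href="/c.html">C</a>',
    "/c.html": "<p>No links here.</p>",
}
LOG = ["GET /a.html", "GET /c.html", "GET /c.html"]

if __name__ == "__main__":
    print(extract_links(PAGES["/a.html"]))
    print(inlink_counts(PAGES).most_common(1))
    print(requests_per_path(LOG).most_common(1))
```

In-link counting is only one simple way to rank documents by link structure, chosen here because it fits in a few lines; it is an assumption of this sketch, not a technique the list above attributes to any tool.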
Data, Text, and Web Mining Features and Functions