UNIT 5 | Data Warehousing and Data Mining Notes | AKTU Notes



    1. Aggregation

    Aggregation is the process of summarizing detailed data into higher-level information. It is commonly used in data warehouses to perform calculations such as sum, average, count, maximum, and minimum.

    Example: Daily sales data aggregated into monthly sales.

    In simple terms: Aggregation means summarizing detailed data to produce high-level information.
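
    The daily-to-monthly example can be sketched in a few lines of Python. This is only an illustration: the sales figures and the helper name aggregate_monthly are made up for the example and do not come from any particular warehouse product.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales records: (sale date, amount)
daily_sales = [
    (date(2024, 1, 5), 120.0),
    (date(2024, 1, 20), 80.0),
    (date(2024, 2, 3), 200.0),
    (date(2024, 2, 14), 50.0),
    (date(2024, 2, 28), 150.0),
]


def aggregate_monthly(records):
    """Summarize daily records into per-month sum, count, average, max, and min."""
    by_month = defaultdict(list)
    for day, amount in records:
        by_month[(day.year, day.month)].append(amount)
    return {
        month: {
            "sum": sum(amounts),
            "count": len(amounts),
            "avg": sum(amounts) / len(amounts),
            "max": max(amounts),
            "min": min(amounts),
        }
        for month, amounts in by_month.items()
    }


monthly = aggregate_monthly(daily_sales)
for month, stats in sorted(monthly.items()):
    print(month, stats)
```

    Each (year, month) key holds the five summary values named above, so five daily rows collapse into two monthly rows.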


    2. Historical Information

    Historical information refers to data stored over long periods of time. Data warehouses keep historical data to analyze trends and patterns over time.

    Example: Sales data of the last 5 years.

    In simple terms: Historical data is data that was collected in the past and is stored for analysis.


    3. Query Facility

    The query facility allows users to retrieve and analyze data stored in the data warehouse using query languages like SQL.

    Users can:

    • Search data
    • Filter records
    • Generate reports
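
    As a rough sketch of these three tasks, the example below uses Python's built-in sqlite3 module as a stand-in for a warehouse. The sales table, its columns, and the sample rows are assumptions made for illustration only.

```python
import sqlite3

# In-memory SQLite database standing in for a warehouse (illustrative only)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "Laptop", 1000.0),
        ("North", "Phone", 500.0),
        ("South", "Laptop", 700.0),
        ("South", "Phone", 300.0),
    ],
)

# Search and filter: only the records for one region
north_rows = conn.execute(
    "SELECT product, amount FROM sales WHERE region = ? ORDER BY product",
    ("North",),
).fetchall()

# Report: total sales per region
report = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

print(north_rows)
print(report)
```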

    4. OLAP Functions and Tools

    OLAP (Online Analytical Processing) is used for multidimensional data analysis in data warehouses.

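    • Roll-Up: Aggregates data to a higher level of a dimension hierarchy (for example, from city to country)
    • Drill-Down: The reverse of roll-up; moves from summary data to more detailed data
    • Slice: Fixes a single value on one dimension to produce a sub-cube (for example, sales for Q1 only)
    • Dice: Selects specific values on two or more dimensions to produce a smaller sub-cube
    • Pivot: Rotates the axes of the cube to give a different view of the same data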
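
    The five operations can be tried on a tiny "cube" held as a list of fact rows. This is a minimal sketch in plain Python, not how a real OLAP tool is implemented. The dimension names, sample figures, and function names (roll_up, slice_cube, dice_cube, pivot) are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (quarter, region, product, sales)
facts = [
    ("Q1", "North", "Laptop", 100),
    ("Q1", "South", "Laptop", 80),
    ("Q1", "North", "Phone", 60),
    ("Q2", "North", "Laptop", 120),
    ("Q2", "South", "Phone", 90),
]
DIMS = ("quarter", "region", "product")


def roll_up(rows, keep):
    """Sum sales over every dimension that is not listed in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[3]
    return dict(totals)


def slice_cube(rows, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]


def dice_cube(rows, allowed):
    """Dice: keep rows whose values are allowed on each listed dimension."""
    return [
        r for r in rows
        if all(r[DIMS.index(d)] in values for d, values in allowed.items())
    ]


def pivot(totals):
    """Pivot: swap the two axes of a two-dimensional summary."""
    return {(b, a): v for (a, b), v in totals.items()}


by_quarter = roll_up(facts, ["quarter"])                   # roll-up: coarse view
by_quarter_region = roll_up(facts, ["quarter", "region"])  # drill-down: finer view
q1_only = slice_cube(facts, "quarter", "Q1")
north_laptops = dice_cube(facts, {"region": {"North"}, "product": {"Laptop"}})
region_by_quarter = pivot(by_quarter_region)

print(by_quarter)
print(by_quarter_region)
```

    Drill-down is shown here simply as a roll-up that keeps one more dimension, which is the usual way to think of the two as opposites.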

    5. OLAP Servers

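    OLAP servers process analytical queries and present warehouse data to users as multidimensional structures to support business intelligence analysis. The three main types are ROLAP, MOLAP, and HOLAP, which differ in how the data is physically stored.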


    6. ROLAP (Relational OLAP)

    ROLAP stores data in relational databases and performs OLAP operations using SQL queries.

    Advantages:

    • Scalable
    • Works with large datasets

    7. MOLAP (Multidimensional OLAP)

    MOLAP stores data in multidimensional cube structures for fast query performance.

    Advantages:

    • Very fast query response
    • Efficient aggregation

    8. HOLAP (Hybrid OLAP)

    HOLAP combines the features of ROLAP and MOLAP. Detailed data is stored in relational databases and aggregated data is stored in multidimensional cubes.
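
    The hybrid idea can be sketched as follows, assuming a SQLite table as the relational (ROLAP-style) store and a Python dictionary as a stand-in for the precomputed (MOLAP-style) cube. The table, data, and function names are hypothetical, and real HOLAP servers are far more elaborate.

```python
import sqlite3

# Relational side: detailed rows live in a table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "Laptop", 1000.0),
        ("North", "Laptop", 200.0),
        ("South", "Phone", 300.0),
    ],
)

# Multidimensional side: aggregates are computed once and kept in a
# dictionary keyed by (region, product), standing in for cube cells
cube = {
    (region, product): total
    for region, product, total in conn.execute(
        "SELECT region, product, SUM(amount) FROM sales GROUP BY region, product"
    )
}


def total_sales(region, product):
    """Summary query: answered from the precomputed cube, no table scan."""
    return cube.get((region, product), 0.0)


def detail_rows(region, product):
    """Detail query: falls through to the relational store."""
    return conn.execute(
        "SELECT amount FROM sales WHERE region = ? AND product = ? ORDER BY amount",
        (region, product),
    ).fetchall()


print(total_sales("North", "Laptop"))
print(detail_rows("North", "Laptop"))
```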


    9. Data Mining Interface

    A data mining interface provides tools and graphical interfaces that allow users to interact with data mining systems.

    Examples:

    • Dashboards
    • Visualization tools
    • Interactive reports

    10. Security

    Security ensures that only authorized users can access or modify data in the data warehouse.

    • User authentication
    • Access control
    • Data encryption

    11. Backup and Recovery

    Backup and recovery mechanisms protect data warehouse systems from data loss.

    • Regular data backups
    • Disaster recovery systems
    • Data restoration

    12. Tuning Data Warehouse

    Tuning improves the performance of a data warehouse by optimizing queries, indexes, and storage methods.

    Techniques include:

    • Index optimization
    • Query optimization
    • Partitioning
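
    Index optimization can be observed directly in SQLite, used here only as a convenient stand-in for a warehouse engine. The sketch asks SQLite for its query plan before and after an index is created. The table and index names are assumptions, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 100.0), ("South", 50.0), ("North", 25.0)],
)

QUERY = "SELECT SUM(amount) FROM sales WHERE region = ?"


def query_plan(connection):
    """Return SQLite's plan description for QUERY as a single string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("North",)).fetchall()
    # The last column of each plan row holds the human-readable detail
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn)  # expected to be a full table scan
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
plan_after = query_plan(conn)   # expected to mention idx_sales_region

north_total = conn.execute(QUERY, ("North",)).fetchone()[0]

print(plan_before)
print(plan_after)
```

    The query result is the same in both cases; only the access path changes, which is the point of tuning.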

    13. Testing Data Warehouse

    Testing ensures that the data warehouse works correctly and provides accurate results.

    Types of testing:

    • Data quality testing
    • Performance testing
    • Integration testing
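
    A data quality test can be as simple as a function that scans loaded rows and reports problems. The sketch below checks for duplicate keys, missing values, and negative amounts. The column names order_id and amount, and the three rules, are assumptions chosen for the example.

```python
def check_data_quality(rows):
    """Return a list of human-readable problems found in the given rows."""
    problems = []
    seen_ids = set()
    for row in rows:
        order_id = row["order_id"]
        if order_id in seen_ids:
            problems.append(f"duplicate order_id {order_id}")
        seen_ids.add(order_id)
        if row["amount"] is None:
            problems.append(f"missing amount for order_id {order_id}")
        elif row["amount"] < 0:
            problems.append(f"negative amount for order_id {order_id}")
    return problems


# Hypothetical rows loaded into the warehouse
loaded_rows = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": None},
    {"order_id": 2, "amount": 5.0},
    {"order_id": 3, "amount": -1.0},
]

issues = check_data_quality(loaded_rows)
for issue in issues:
    print(issue)
```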

    14. Warehousing Applications

    Data warehousing is used in many industries for decision-making and data analysis.

    • Banking and Finance
    • Healthcare
    • E-commerce
    • Telecommunications
    • Retail

    15. Web Mining

    Web mining is the process of extracting useful information from web data such as web pages, web logs, and user behavior.

    Types of Web Mining:

    • Web Content Mining
    • Web Structure Mining
    • Web Usage Mining
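
    Web usage mining, for instance, often starts from server logs. The sketch below counts page visits and page-to-page transitions per user. The log format ("user path" on each line) and the sample entries are simplified assumptions, not a real server log format.

```python
from collections import Counter

# Hypothetical, simplified log lines: "<user> <path>", in time order
log_lines = [
    "u1 /home",
    "u1 /products",
    "u2 /home",
    "u2 /products",
    "u2 /cart",
    "u3 /home",
]


def parse(lines):
    """Yield (user, path) pairs from the simplified log lines."""
    for line in lines:
        user, path = line.split()
        yield user, path


def page_counts(lines):
    """Count how often each page was visited."""
    return Counter(path for _, path in parse(lines))


def transition_counts(lines):
    """Count moves from one page to the next by the same user."""
    last_page = {}
    transitions = Counter()
    for user, path in parse(lines):
        if user in last_page:
            transitions[(last_page[user], path)] += 1
        last_page[user] = path
    return transitions


visits = page_counts(log_lines)
moves = transition_counts(log_lines)
print(visits.most_common())
print(moves.most_common())
```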

    16. Spatial Mining

    Spatial data mining discovers patterns and relationships in spatial data such as maps and geographic data.

    Example: Finding geographic clusters, such as areas on a map where many customers or store locations are concentrated.
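
    One very simple way to look for such a pattern is to divide the map into grid cells and count the points that fall in each cell. This is only a sketch: the coordinates are made up, and real spatial mining uses more capable methods such as density-based clustering.

```python
from collections import Counter

# Hypothetical (x, y) locations, for example customer addresses on a map
points = [(0.2, 0.3), (0.8, 0.1), (0.5, 0.9), (3.1, 4.2), (3.7, 4.9), (9.0, 1.0)]


def densest_cell(pts, cell_size=1.0):
    """Bucket points into square grid cells and return (cell, count) for the fullest one."""
    cells = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in pts
    )
    return cells.most_common(1)[0]


hotspot, count = densest_cell(points)
print(hotspot, count)
```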


    17. Temporal Mining

    Temporal mining analyzes data that changes over time.

    Example: Stock market trends or weather patterns.
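
    A basic way to expose a trend in time-ordered data is a moving average, which smooths out short-term fluctuations. The sketch below uses made-up prices and a window of three readings. The function name moving_average is chosen for this example.

```python
def moving_average(values, window):
    """Return the average of each run of `window` consecutive values."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical daily closing prices
prices = [10, 12, 11, 15, 18, 17]

trend = moving_average(prices, 3)
# The smoothed series is rising if every value exceeds the one before it
is_rising = all(later > earlier for earlier, later in zip(trend, trend[1:]))

print(trend)
print(is_rising)
```

    The raw prices dip twice (12 to 11 and 18 to 17), but the smoothed series rises at every step, which is the kind of pattern temporal mining tries to surface.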


    Conclusion

    Data visualization and analytical tools in data warehousing help organizations analyze large datasets effectively. Technologies such as OLAP, data mining interfaces, and web mining support better decision-making and business intelligence.
