CCT225H5 Lecture Notes - Lecture 8: Data Mining, Data Profiling, Unstructured Data
Document Summary
Virtualization can target an operating system, a server, a storage device, or network resources. Virtualizing data makes storing it more efficient and cost-effective.

The business focus areas of big data are data mining, data analysis, and data visualization.

1. Data: the foundation for data-directed decision making.
2. Discovery: the process of identifying new patterns, trends, and insights.

Netflix: data mining techniques.

Estimation analysis: determines values for an unknown continuous variable behavior or estimated future value, based on historical data.

Cluster analysis: identifies similarities and differences among data sets. A cluster analysis of a customer database groups similar attributes together to discover segments or clusters, and then examines the attributes and values that define each segment.

Optimization: a statistical process that finds the way to make a design, system, or decision as effective as possible. This means finding the values of the controllable variables that determine the maximal result.

Example: predict the winners of a marathon based on gender, height, weight, and hours of training.

Time-series information is time-stamped information collected at a particular frequency.
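
As a rough illustration of the cluster analysis described above, the following Python sketch groups a small customer data set into segments and then reports the attribute values that define each one. The customer records, the two attributes (annual spend and visits per month), the choice of k-means as the grouping method, and the function name k_means are all assumptions made for this example; the notes do not name a specific algorithm.

```python
from math import dist

# Hypothetical customers: (annual spend in dollars, store visits per month).
customers = [
    (200, 1), (250, 2), (220, 1),        # low spend, few visits
    (5000, 12), (5200, 10), (4800, 11),  # high spend, frequent visits
]

def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (centroids, clusters)."""
    # Assumption: seed centroids with evenly spaced points so the run is deterministic.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        # Assign every customer to the nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if the cluster is empty).
        centroids = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

centroids, clusters = k_means(customers, k=2)
for centroid, members in zip(centroids, clusters):
    # The centroid holds the average attribute values that define the segment.
    print(f"segment centre {centroid}: {len(members)} customers")
```

The attributes here are left unscaled, so spend dominates the distance calculation; with real data the attributes would normally be put on comparable scales first.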