CS235 Lecture Notes - Lecture 10: Factiva, Digital Footprint, Unstructured Data
Big Data: The Future of Research Methods? (guest lecture - Kira Williams)
● What are the purpose, logical form, pragmatic requirements, criteria of adequacy, and
role of generality of explanation?
● Describe, explain, predict
● Two types of explanatory questions: why necessary and how possible
● Three modes of causality
○ Covering law models (conclusion must be true if premises are true)
○ Inductive statistical models (conclusion may be false even if premises are true;
events are correlated but not necessarily caused by each other)
○ Statistical relevance models
● Problem for inductive explanations - is the empirical fact in fact explanatory?
● Problem for deductive explanations - does the explanatory hypothesis have empirical
support?
● Regularity - when one thing happens, another happens
● Inductive regularity: causal relations consist only in patterns of regular associations
between things; one thing is typically associated with another
● In statistical analysis, must be able to explain why the numbers turn out the way they
do (to interpret results and relate them back to the real world)
● Data involve a domain of items, events, individuals or other objects, including properties
of these items and their states
● Role of statistical explanation - empirical evaluation of causal hypotheses, preliminary
method for probing a complex range of communication phenomena for regularities
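The point in the bullets above that events can be correlated without one causing the other can be illustrated with a small simulation. This is only a sketch under stated assumptions: the variables `common`, `a`, and `b`, the noise levels, and the sample size are all hypothetical and not from the lecture. Two measurements are both driven by a shared cause; they correlate strongly, but the association disappears once the shared cause is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Hypothetical data: a and b each depend on `common`, never on each other."""
    rng = random.Random(seed)
    common = [rng.gauss(0, 1) for _ in range(n)]
    a = [c + rng.gauss(0, 0.5) for c in common]
    b = [c + rng.gauss(0, 0.5) for c in common]
    return common, a, b


common, a, b = simulate()

# Regular association: a and b move together.
raw = pearson(a, b)

# Subtract the shared cause (its coefficient is 1 by construction here),
# leaving only the independent noise in each variable.
resid_a = [x - c for x, c in zip(a, common)]
resid_b = [y - c for y, c in zip(b, common)]
controlled = pearson(resid_a, resid_b)

print(f"correlation of a and b: {raw:.2f}")
print(f"after removing the common cause: {controlled:.2f}")
```

In real research the shared cause is usually not known in advance, which is why a statistical regularity is treated as a preliminary probe rather than as proof of a causal hypothesis.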
Big Data
● Term describing the increasing complexity of data - traditional quantitative methods
become inadequate to handle it - focused on unstructured data, which can be difficult to
make sense of (e.g. Facebook, Twitter)
● Different from normal data:
○ Volume - watches/records rather than samples
○ Velocity - real time
○ Machine learning - pattern detection
○ Digital footprint - reduces cost of data
● Four characteristics
○ Volume, variety, reliability, validity
● Examples
○ Government - surveillance
○ Retail - customer data and product design
○ Science - climate simulation
○ Sports - improve training
○ Technology - indexing searches
● Benefits: more data available, faster, greater variety, linking of systems, new approaches
to problems, more work placed on computers - reduces human error and the skills needed
(Factiva is an example from communication studies)
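As a rough illustration of how unstructured text such as social media posts can be turned into something countable, the sketch below tallies word frequencies across a few posts. The posts, the stopword list, and the function names `tokenize` and `term_counts` are all hypothetical and made up for this example; real pattern detection on big data would use far larger collections and more careful text processing.

```python
import re
from collections import Counter

# Hypothetical posts standing in for unstructured social media text.
posts = [
    "Loving the new phone, battery lasts all day!",
    "Battery died again... not happy with this phone",
    "Great camera, great battery, would buy again",
]

# Small, hand-picked stopword list (an assumption for this example).
STOPWORDS = {"the", "with", "this", "all", "not", "would", "again"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def term_counts(texts):
    """Count how often each remaining word appears across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


counts = term_counts(posts)
top = counts.most_common(3)
print(top)
```

The resulting counts are structured data that traditional quantitative methods can work with, at the cost of discarding context such as word order and tone.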