PSYA4: Research Methods


The major features of science

Falsification:  (hypothesis testing) generate testable predictions in form of hypothesis - can be tested for validity. Can only be proven through seeking disproof. Failure to find support = requires modification. (Admits possibility of being disproved)

  • Popper: start with null hypothesis (seek disproof), when reasonably certain of validity reject null and accept alternative hypothesis.

Objectivity: not affected by expectations of researcher - systematic collection of measurable data (controlled conditions). Without this - no way of being certain data is valid.

Replication: careful recording and standardisation of procedures = can be repeated by other researchers. Validity affirmed by exact replication and same outcomes.

Control and manipulation of variables: other variables controlled (IV and DV can be checked - causal relationship and not due to EVs)

Empirical methods: info. gained through direct observation/experiment not reasoned argument/unfounded beliefs - only way we know things are true

Theory construction: make sense of/use facts - understand and predict natural phenomena


The scientific process

Induction (bottom-up): reasoning from the particular to the general

  • Observation (of specific) > testable hypothesis > study to test > conclusions > propose theory (about general population - assume examples are representative) 

Deduction (top-down): reasoning from general to particular

  • Observation (general population) > propose theory (from known facts) > testable hypothesis > study to test > conclusions/confirmation (about specific)

Popper: Hypothetico-deductive model of science: a deductive method is the best (proposing theory and seeking evidence to support/contradict) - can seek falsification (shows tested properly). Should actively search for ways to disprove theory: too easy = altered and retested, too difficult = good theory.

Kuhn: Popper's model is an idealised view of science. In the real world science progresses differently. Even when a theory is falsified, scientists cling to it = a paradigm (pattern/model)



Can psychology claim to be scientific?

Scientific research is desirable: enables psychology to produce verifiable knowledge about behaviour - distinct from commonsense 'armchair' psychology

Psychology uses the scientific method: most models can be falsified, well-controlled experiments. BUT Miller: simply using the scientific method may be no more than 'dressing up' (using technical language but not engaged in real scientific research = pseudoscience)

Lacks objectivity and control: object of study reacts to the researcher (experimenter bias and demand characteristics) - compromises validity. BUT similar problems for hard sciences (e.g. not possible to measure a subatomic particle without altering its 'behaviour')

Are the goals of science appropriate for psychology?

Laing: inappropriate to view a person experiencing distress as a physical-chemical system gone wrong (e.g. schizophrenia). Should be treated as an individual case (idiographic approach). Science uses a nomothetic approach (make generalisations)

Psychological approaches to treating mental illness based on sci. principles - modest success



Qualitative research advocated by some (more subjective) - still 'scientific' (can be validated): data can be collected from such methods and triangulated (compared with findings from other methods) = objective

Reductionist: complex phenomena reduced to simple variables to study causal relationships.

  • Dev. of theories: Occam's razor/canon of parsimony (simpler theory preferred)

Determinist in its search for causal relationships

Reductionism and determinism = mixed blessing (difficult to pick out patterns or reach conclusions without reductionism + difficult to gain insights into important factors without determinism)


Validating new knowledge - peer review

(Assessment of scientific work by others who are experts in the same field)

  • Essential part of research process: any flawed or fraudulent work detected and results ignored - ensures high quality

Main purposes (the Parliamentary Office of Science and Technology):

1. Allocation of research funding (enables decisions as to which research is likely to be worthwhile - government and charitable bodies, e.g. Medical Research Council (MRC))

2. Publication of research in scientific/scholarly journals and books (preventing incorrect/faulty data entering public domain - burden of proof now with believers)

3. Assessing research rating of university departments (future funding depends on rating)

Peer review and the internet: (more egalitarian system - 'peer' = 'everyone')

  • New solutions needed to maintain quality (availability of sci. info.)
  • Mostly policed by 'wisdom of crowds' (readers decide validity), some journals ask readers to rate

Validating new knowledge - peer review cont'd

Examples of fraudulent research:

The Cyril Burt affair: suspicious consistency in evidence of inherited intelligence (.771 correlation)

Professor Marc Hauser: unable to provide evidence for his conclusions (research of cotton-top tamarin monkeys and cognitive abilities)

  • Without peer review - don't know what is opinion and speculation rather than real fact
  • Smith et al: slow, expensive, subjective, prone to bias, useless at detecting fraud
  • Isn't always possible to find an appropriate expert to review report or proposal - reviewer may not understand it and pass it
  • Anonymity - can be honest and objective OR settle old scores/bury rival research (competitive world, friends and enemies). Some journals now favour open reviewing.
  • Publication bias: journals prefer to publish +ve results (improve standing - important implications) - avoid straight replications (as bad as newspapers - eye-catching stories)
  • Results in preference for research that goes with existing theory (science resistant to large shifts in opinion - changes slow, peer review slows it down)
  • Once study published - results remain in public view even if shown to be fraudulent/result of poor practice, e.g. some still used in debate in parliament

Research methods and concepts: experiments

Experiments (involve IV and DV - causal relationship. Other variables controlled)

Lab (IV manipulated, controlled, artificial env): high control = high internal validity and can identify causal r'ship (effect of IV on DV), standardised procedures (reliability), low ecological validity + mundane realism (external validity), DCs

Field (IV manipulated, controlled, more natural env): causal r'ship can be demonstrated (possible control of EVs), greater ecological validity (real-life), reduced experimenter effect and DCs (not aware being studied), control more difficult (reduced internal validity), DCs (way IV operationalised - exp. hypothesis), diff. to replicate (reliability), ethical issues (informed consent, deception)

Natural (natural env, existing naturally-occurring IVs - not manipulated, not randomly allocated): only way to study certain behaviours (unethical to manipulate), where impossible/diff. to manipulate IVs, high ecological validity, low control (cannot allocate - indiv. differences - reduced internal validity), cannot prove causal r'ship


Experimental design

Several levels of the IV - whether P's tested on all or just one level depends on design:

Repeated measures: all P's receive all levels of the IV/do all conditions

  • Fewer P's needed (saves time and money) AND no effect of participant variables/indiv. differences (same P's both conditions)
  • Increased likelihood of DCs (see exp. task more than once) - Dealing with it: single-blind design
  • Order effects (practice/less anxious/boredom) - Dealing with it: counterbalancing

Independent groups: different p's do each of the conditions, performance compared

  • No order effects (diff. p's) AND less likely to guess aims (less DCs)
  • Participant variables (indiv.differences) - Dealing with it: random allocation
  • Twice as many P's (costly, time-consuming)

Matched pairs: 2 groups, matched based on key variables (relevant), one in each condition

  • Reduced likelihood of DCs AND no order effects AND reduced effect of participant variables
  • May not control all participant variables (others could be important) - Dealing with it: restrict matching variables
  • Matching = time-consuming and difficult (need to start with large group to ensure can match) - Dealing with it: pilot study (consider key variables)

Dealing with problems of experimental designs

1. Counterbalancing: reduce order effects

  • Each condition tested first or second in equal amounts (ABBA/ABAB)

2. Random allocation: (allocating to experimental groups/conditions based on random techniques) - reduce effects of participant variables

3. Single-blind technique: (P not aware of aims/condition they are receiving - cover story) - reduces DCs

4. Double-blind technique: (neither P nor researcher aware of condition indiv. p's receiving) - reduces investigator effects and DCs
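
Random allocation and counterbalancing can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the participant IDs, the two condition labels "A" and "B", and the function names are hypothetical and not part of the specification.

```python
import random


def random_allocation(participants, conditions=("A", "B"), seed=None):
    """Independent groups: shuffle P's, then deal them out to conditions in turn."""
    rng = random.Random(seed)  # seed only so the example can be repeated
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups


def counterbalance(participants):
    """Repeated measures: half do A then B, the other half do B then A."""
    orders = [("A", "B"), ("B", "A")]
    return {person: orders[index % 2] for index, person in enumerate(participants)}


# Hypothetical participant IDs, for illustration only.
participants = [f"P{i}" for i in range(1, 9)]
groups = random_allocation(participants, seed=1)
orders = counterbalance(participants)
```

With 8 P's each group holds 4 people, and each condition is done first by 4 people, so order effects should balance out across the sample.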


Non-experimental methods: observation

  • Naturalistic: natural environment, IV naturally occurring, not manipulated, unstructured (uncontrolled) high EV, less affected by DCs (more valid than interviews/questionnaires), can study where variables cannot be manipulated, low control, observer bias (low inter-observer rel.), misinterpretations, covert = unethical, difficult to arrange and costly
  • Controlled: controlled conditions (IV manipulated - P's know they're being studied/lab), structured (controlled) high control, can manipulate to observe effects, less likely affected by extraneous variables, DCs, observer bias, investigator and participant effects, low EV


  • Disclosed/overt (increases likelihood of DCs) vs. non-disclosed/covert (unethical, e.g. one-way mirrors)
  • Participant (less likely to be objective) vs. non-participant
  • Structured/focused (systems to organise observations of specific behaviour) vs. unstructured/non-focused (records all relevant behaviours in as much detail as possible, no system if behaviour unpredictable, continuous observation - too much to record, may not be most important, usually most visible)
  • Direct (researcher is the observer) vs. indirect (observations of people through artefacts they produce) - content analysis

Non-exp: Observations cont'd (content analysis)

Example of indirect observation: (observing people not directly but through the artefacts they produce)

  • Sampling method (frequency of material and time - event/time, which channels etc.)
  • Method of recording data: behavioural categories/coding systems
  • Method of representing data: quantitative/qualitative

Inter-observer reliability important


High EV (observations of what people do - real communications, current and relevant)

Can be replicated (sources retained/accessed by others) = reliability

Observer bias (reduces objectivity and validity)

Cultural bias (interpretations affected by language and culture)


Non-exp.: Observations cont'd - structured


  • Research aims: area to study (may also apply to unstructured)
  • Observational systems
    • Behavioural categories: operationalisation of variables - behaviour divided into subset of behaviours = specific and measurable:
      • Behaviour checklist
      • Coding system (indiv. behaviours given codes - easily recorded)
      • Rating scale (list - observers rate each behaviour/characteristic)
      • Should be: objective (record explicit actions), cover all possible elements and don't disregard any, mutually exclusive (not overlap)
    • Sampling procedures: (when continuous obs. not possible)
      • Event sampling: (counting no. of times certain behaviour occurs = quantitative)
      • Time sampling: (recording behaviours in given time frame/certain intervals = qualitative)
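
The two sampling procedures above can be compared on the same record of behaviour. A minimal sketch, assuming a made-up observation log of (seconds from start, behaviour code) pairs; the codes "hit", "share" and "talk" and both function names are hypothetical.

```python
# Hypothetical observation log: (seconds from start, behaviour code).
log = [(3, "hit"), (12, "share"), (14, "hit"), (31, "talk"), (47, "share"), (58, "hit")]


def event_sampling(log, target):
    """Count how many times one target behaviour occurs in the whole session."""
    return sum(1 for _, code in log if code == target)


def time_sampling(log, interval, duration):
    """For each time window, list which behaviours were seen in that window."""
    windows = []
    for start in range(0, duration, interval):
        seen = sorted({code for t, code in log if start <= t < start + interval})
        windows.append((start, seen))
    return windows


hits = event_sampling(log, "hit")      # tally of one behaviour
windows = time_sampling(log, 30, 60)   # two 30-second windows
```

Event sampling gives a single tally per behaviour, while time sampling gives a record of what was happening in each interval.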

Evaluating observational techniques:

Flawed coding system affects reliability and validity (behaviours belong in more than one category/none)


  • External: high EV (natural behaviour) BUT naturalness doesn't mean EV AND may lack PV
  • Internal: observer bias (investigator effects) AND DCs
  • Improving validity:
    • Variety of settings and Ps
    • More than one observer - balance out bias


  • Inter-observer: (dividing total agreements by total no. of observations) - result of +.80 or more = good
  • Record events and watch again
  • Pilot study - check behavioural categories and make amendments
  • Improving reliability: training in use of coding system/behaviour checklist (also improves validity)
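
The inter-observer calculation above (total agreements divided by total no. of observations) can be worked through as follows. A sketch only: the two observers' records are invented and the function name is hypothetical.

```python
def inter_observer_agreement(record_a, record_b):
    """Proportion of observations on which two observers recorded the same code."""
    if len(record_a) != len(record_b):
        raise ValueError("Both observers must record the same number of observations")
    agreements = sum(a == b for a, b in zip(record_a, record_b))
    return agreements / len(record_a)


# Invented records: each entry is the category an observer ticked for one observation.
observer_1 = ["hit", "share", "talk", "hit", "share", "talk", "hit", "hit", "share", "talk"]
observer_2 = ["hit", "share", "talk", "hit", "talk", "talk", "hit", "hit", "share", "talk"]

score = inter_observer_agreement(observer_1, observer_2)
is_good = score >= 0.80  # threshold taken from the notes above
```

Here the observers disagree on one observation out of ten, so agreement is 9/10, which passes the .80 threshold.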

More valid than e.g. self-report (what people say they do different from what they do) + naturalistic = more realistic picture of spontaneous behaviour + means of conducting preliminary investigations (new areas) = hypotheses

Ethical committees (to approve observational designs), informed consent, public places


Non-experimental: self-report

Writing good questions:

  • Clarity (understandable - no ambiguity)
  • Bias (no leading questions, social desirability bias)
  • Analysis (designed with this in mind - answers need to be summarised so conclusions can be drawn)

Types of questions:

  • Closed: limited range of possible answers. Quantitative data, forced to select answers that don't represent their thoughts/behaviour
  • Open: potentially infinite set of answers (freedom for longer, detailed answers). Unexpected answers, rich detail, new insights, qualitative = diff. to summarise and find patterns (can be turned into quantitative through categories)

Good questionnaires/interviews:

  • Sequence for questions (start with easy ones, leave difficult ones that may make P's anxious/defensive until respondent is relaxed)
  • Sampling technique
  • Piloting (tested on small group - refine and amend questions)

Non-experimental: self-report - questionnaires

Self-report method using written questions. Structured.

  • Can collect same info. from a large number of people quickly and easily (time- and cost-efficient) - large amount of data - can repeat
  • Easy to analyse to get quantitative data from closed questions
  • Easy to replicate (check reliability)
  • Respondents may be more truthful compared to being interviewed face-to-face (esp. personal/confidential info.)
  • Can access what people actually think (not relying on guessing) = valid
  • Do not require specialist administrators
  • Response bias (e.g. always choosing certain answer/answer not truthful - leading Qs/SDB) BUT leading Qs less of a problem than in unstructured interviews
  • No flexibility (can't add new Qs to collect useful, unexpected data)
  • Return rates may be low and samples biased (only certain types of people answer questionnaires - literate, willing to spend time)

Non-experimental: self-report - interviews

Questions given by an interviewer (face-to-face/over the phone/via computer)

  • For some, questionnaires are difficult (e.g. children or people with writing difficulties)
  • Interviewers can deal with ambiguous questions/uncertainty - explain
  • Interviewer bias - way Qs asked, e.g. emphasis of words/verbal or visual cues (reduce validity - affect answer/interpretations) - SDB
  • May feel less comfortable about revealing personal information
  • Low inter-rater reliability (different interviewers ask Qs differently/same interviewer asks differently in different situations)

Structured:(pre-determined, fixed questions)

  • Easy to analyse - quantitative AND easy to repeat (standardised Qs)
  • Limited by fixed questions

Unstructured: (questions asked based on interviewee's responses) and Semi-structured: (combination of structured and unstructured - pre-determined Qs, further Qs asked in response) = clinical interview.

  • More detailed information AND can access info. not revealed by fixed questions (experienced interviewer can elicit more extensive information - gentle questioning techniques)
  • Interviewer bias (more than in structured) - new Qs may lack objectivity AND well-trained interviewer needed (reliable results) = expensive AND more difficult to analyse data (unpredictable)

Non-experimental: validity

of self-report and psychological tests (IQ/personality tests, mood/attitude/aptitude scales)

  • SDB (participant effect)
  • Interviewer bias
  • Leading questions (poorly designed - meaningless)
  • Content validity (measuring what they intend to)

Assessing validity:

  • External:
    • Concurrent validity (comparing existing test with one you're interested in - signif. +ve correlation = concurrent validity)
    • Face validity (looks like it measures what it intends to - obviously related to topic)
    • Predictive validity (correlating results with later example of behaviour being tested - +ve correlation = validity)
  • Internal:
    • Construct validity (extent to which performance measures identified underlying construct e.g. theoretical views)
    • Lie scale (questions that detect truthfulness - lying on high proportion of lie scale items = may not be truthful)

Low internal validity: items need to be revised - produce better match between scores on new test and established one

Ceiling effect (all questions easy - do well), Floor effect (all Qs difficult - do badly)


Non-experimental: correlational analysis

  • Investigate link between two co-variables (measured variables)
  • Less strong than causal r'ship
  • Explore link prior to experimental research/impractical or unethical to use experiment to manipulate variables
  • Illustrated using scattergraphs:
    • Negative correlation = as one increases the other decreases (perfect = -1.0)
    • Positive correlation = both increase together (perfect = +1.0)
    • No correlation: no relationship (CC = 0)
    • Correlation coefficient (CC): describes closeness of correlation (closer to +1 or -1 = stronger, closer to 0 = weaker)
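
To see how a correlation coefficient behaves, here is a minimal sketch that computes one by hand. It assumes Pearson's r (the notes do not say which coefficient is meant, and exam work may use Spearman's rho instead); the data and the function name are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of co-variable scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Need two lists of the same length with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when one co-variable never changes")
    return covariation / (spread_x * spread_y)


# Made-up co-variables: hours of revision and two possible sets of scores.
hours = [1, 2, 3, 4, 5]
perfect_positive = pearson_r(hours, [2, 4, 6, 8, 10])   # both rise together
perfect_negative = pearson_r(hours, [10, 8, 6, 4, 2])   # one rises, other falls
```

The first pair of lists gives a coefficient of about +1 and the second about -1, matching the perfect correlations listed above. Neither result says anything about cause and effect.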

Curvilinear correlations: curved relationship (e.g. performance lowered when anxiety too high/low, best when moderate) = Yerkes-Dodson effect

Correlational hypothesis: no IV or DV - states expected relationship between variables (e.g. a and b are positively correlated)

When unethical/impractical to manipulate variables - make use of existing data AND correlation signif. = further investigation justified AND insignif. = causal r'ship ruled out  AND procedures can be repeated - confirm

Often misinterpreted - cause and effect assumed (not possible) AND unknown intervening variables explain link AND lack internal and external validity (method used/sample)


Non-experimental: case studies

Detailed study of one person/group/event (often unique/unusual, can be normal) over short/long period of time

Data collected from case being studied/others involved - likely to be qualitative (can be quant.)

How they are conducted: social (observation, interviews, questionnaires)/cognitive(ability tests, brain scans)

  • Rich, in-depth data - detailed info. (more than can be obtained/would be overlooked through experiments)
  • Rare cases - opp. to study situations that it would be unethical to produce
  • Complex interaction of many factors (not held constant) - holistic approach (humanistic psychologists)
  • Focus on unique individuals/situations - may not generalise
  • Variables cannot be controlled (causal relationships cannot be investigated)
  • Take a long time for collection and analysis (money and effort)
  • Necessary to use recollection of past events - may be unreliable (memories)
  • May lack objectivity (get to know case - theoretical biases)
  • Ethical issues: confidentiality and anonymity

Non-experimental: other methods

Investigations: doesn't fit into any of the categories

The multi-method approach: combination of different techniques and methods

Role play: P's take on a certain role and behaviour observed as if everyday life (form of controlled-observation), e.g. Zimbardo's Stanford Prison Experiment (Otherwise impractical/unethical BUT would they really act how they would in real life (personal principles vs. social norms))

Cross-cultural studies: compare behaviours in different cultures (whether cultural practices affect behaviour) - kind of natural experiment (universality BUT observer bias AND communication difficulties (indigenous researchers) AND imposed etic AND group may not be representative of culture)

Meta-analysis: combines results of several studies addressing similar aims/hypotheses (effect size used as DV) (more reliable conclusions BUT research designs in diff.studies may vary - not comparable = not valid)

Longitudinal studies: long time, long-term effects (control of participant variables BUT attrition - some drop out = biased sample/too small AND likely to become aware of aims (like repeated measures) AND time-consuming and costly AND cohort effect - unique characteristics)

Cross-sectional/snap-shot study: one group compared to another (e.g. young to old) - investigate influence of IV (relatively quick BUT groups may differ in more ways than behaviour being researched - participant variables (like independent groups) AND cohort effect)


Ethical issues: human participants (BPS guidelines)

  • Conflict: what the researcher wants and rights of P's (what is acceptable)
  • Socially sensitive research (potential social consequences/implications - discriminatory social policies)

1. Respect (dignity and worth)

  • Informed consent (comprehensive info. - nature and purpose of study and their role)
    • P's: what they're letting themselves in for AND to make informed decision
    • R's: reduce meaningfulness (affect behaviour)
  • Deception (not told true aims/misled/some info withheld - only acceptable where needed to protect integrity of research and if disclosed at earliest opportunity)
    • P's: prevents from truly giving informed consent AND honesty important
    • R's: some harmless (debriefing) AND embarrassment/harm AND diff. withholding and lying
  • Privacy (right to control info. about selves, confidentiality and anonymity should be respected)
    • P's: right to decide who knows/doesn't know about personal info.
    • R's: May be difficult AND affect behaviour

Ethical issues: human P's cont'd

Confidentiality/anonymity (communication of personal info.)

  • P's: Data Protection Act: legal right (no-one should be able to connect them with what they do)
  • R's: May not be possible (unique features)

RTW (right to withdraw - know they can leave at any time, feel comfortable doing so, and withdraw data)

  • P's: Esp. if some info. withheld at start/didn't understand AND may feel will spoil exp. AND if paid or rewarded
  • R's: May bias findings (may be more confident/intelligent)

2. Competence (maintain high standards)

3. Responsibility

  • Protection from harm (-ve physical/psychological effects)
    • P's: expect to leave in same state as at start AND reasonable - expose to everyday risks
    • R's: may not be possible to estimate all possible -ve effects before

4. Integrity (honest and accurate - possible limitations, instances of misconduct)


Ethical issues: non-human animals

Animals (Scientific Procedures) Act (1986)

Scientific arguments (reasons for use): fascinating (may benefit), greater control and objectivity, unethical to use humans, common physiology and evol. past (not the same), simpler behaviour (less influenced by cogn. factors), shorter lifespan (easier to observe dev. processes)

BUT influence of emotion, social context, cognition on human behaviour AND stress in labs (findings meaningless)

Moral justification/ethical arguments:

  • Sentient beings (respond to pain - not same as conscious awareness, some non-primates have self-awareness AND some humans, e.g. brain-damaged, lack sentience, difficult to judge pain and emotion)
  • Speciesism (Singer - no different from racial/gender discrimination, Gray: we have a special duty of care to humans = speciesism not equivalent to racism)
  • Animal rights:
    • Utilitarian view: whatever produces the greatest good for the greatest number = ethically acceptable (animal research alleviates pain and suffering = justifiable)
    • Regan: no circumstances is it acceptable (right - respect) (rights depend on responsibilities in societies - animals do not have these) - better distinguish: rights and obligations (humanely, aware of animal sentience)

Ethical issues: non-human animals cont'd

Existing constraints:

  • Laws: (Animals (Scientific Procedures) Act, 1986) - can only take place at licensed labs w. licensed researchers on licensed projects. Granted if:
    • Potential results important enough to justify use
    • Research cannot be done using non-animal methods
    • Min. number used
    • Any discomfort or suffering kept to a minimum (anaesthetics/painkillers)
  • Guidelines: (BPS)
    • Confinement, restraint, stress and harm (minimised)
    • Species (different species considered - pain and discomfort and procedures considered)
    • Number of animals used (smallest possible)
    • Caging and social env. (appropriate - avoid distress)
    • Deprivation (food and water)
    • Wild animals (disturbance minimised - endangered only used in conservation research)
  • The 3 Rs (House of Lords): Reduction (number), Replacement (alt. methods), Refinement (improved techniques to reduce stress)

Dealing with ethical issues

(Informed consent, deception, RTW, confidentiality, protection from harm, anonymity, privacy)

1. Ethical guidelines: (code of ethics)

  • UK: BPS - monitors behaviour of professional groups, maintains ethical standards - publication of code of ethics (USA: American Psychological Association, APA) - absolves indiv. researcher of responsibility
  • Scientific value: designed, reviewed, conducted - ensures quality, integrity and contribution (dev. of knowledge and understanding) - poor design can cause harm
  • On informing p's: at least one pilot study (informing and debriefing on naive person - lower literacy level)
  • Who can give consent: means appropriate to age and competence

2. Ethical committees/institutional review board:

  • Approve study before it begins (look at poss. ethical issues and how they are planned to be dealt with - value vs. costs - sometimes reasonable)

3. Cost-benefit analysis: balancing consequences against potential to produce meaningful findings (enhance human lives) - decisions subjective, costs not always apparent until afterwards, diff. to quantify costs and benefits (e.g. personal distress)

Special techniques: debriefing (opp. to assess effects, offer counselling, explain true aims, offer RTW, find out more about P's - cannot turn the clock back - self-reassurance), presumptive consent (whether others think it's acceptable - what people say they would/wouldn't do is different from experiencing it)


Internal validity - Extraneous variables

VALIDITY: 1. identify the EV, 2. explain impact, 3. assess seriousness/likelihood of impact

INSIDE the study: degree to which observed effect due to experimental manipulation not other factors (E/CVs) (tests what intended to) - Confounding variable = specific EV that affects the DV
Threats: participant awareness, experimental control, observational coding system flawed, operationalisation of variables

1. Participant effects/reactivity: p's react to cues in the exp. situation - may affect validity - aware that behaviour being studied, change behaviour as a result (actively seek cues - want to help OR unsure how to behave)

  • Demand characteristics (features of experiment elicit PEs) - diff. situations diff. behaviours. Convey exp. hypothesis. The Hawthorne effect (extra attention), social desirability bias(socially acceptable)
  • Dealing with participant effects:
    • Single blind design, double blind design, experimental realism

2. Participant variables: characteristics of indiv. p's that may infl. outcome (only if indep. groups design used)

  • Age, intelligence, motivation and experience (p's in one condition may have certain sim. characteristics - do well)
  • Gender (psychologically different due to socialisation - women more compliant - only EV in certain circumstances - important to control where reason to expect it would matter)
  • Irrelevant participant variables
  • Dealing with participant variables: repeated measures/matched pairs design, larger sample, random allocation

Internal validity - Extraneous variables cont'd

3. Investigator/experimenter effects: cues other than the IV from an investigator/experimenter - encourages certain behaviour - lead to fulfilment of expectations = E/CV

  • Experimenter/investigator bias: (effects of experimenter's expectations on P's behaviour)
    • Experimenter (conducts), investigator (designs/all-purpose)
  • Direct effects: unintentional cues, personal attributes, leading Qs. Due to INTERACTION.
  • Indirect effects: investigator exp. design effect (operationalisation of variables), investigator loose procedure effect (not clearly specifying standardised instructions/procedures - results can be influenced by experimenter), investigator/experimenter fudging effect (inventing extra data), experimenter personal attributes effect (liking men over women). Due to DESIGNING study.
  • Dealing with: double blind design

4. Situational variables: features of research situation - may infl. behaviour

  • Time of day, temp, noise (any env. variable at time of testing that affects performance)
  • Order effects (practice, boredom, fatigue)
  • Investigator effects (any cues that encourage certain behaviour = fulfilment of expectations - e.g. way they ask a question/respond/instructions)
  • Demand characteristics (features of exp. situation that elicit participant effects)
  • Dealing with: counterbalancing/using independent groups design, control env. factors, standardised procedures (incl. instructions), double blind design

External validity

Ability to generalise findings from experiment to other situations/people. Whether research has:

  • Population validity, historical validity, ecological validity

Affected by the REPRESENTATIVENESS of the sample

Ecological validity: extent to which results of a research study can be generalised to real-life

  • Mundane realism (whether it mirrors real-world/everyday experiences - tasks realistic for p's)
  • Generalisability (can be generalised from env. conditions created to others)

Not all research conducted in a natural environment is automatically ecologically valid/high in MR (depends on the circumstances inv. and the task)

Not all lab studies are automatically ecologically invalid

Depends on the mundane realism of the task involved



Reliability

  • Experimental research: Ability to repeat and obtain same results = replication. All conditions must be the same (change in result may be due to changed conditions).
  • Observational techniques: observations should be consistent (2 or more observers produce same record) - coding system/behaviour checklist
    • Inter-observer reliability (total agreements/total no. of observations) 0.80 + = good.
  • Self-report:
    • External: consistency of results over several diff. occasions - measuring same thing w. same method (inter-interviewer reliability)
    • Internal: consistent in itself (all Qs measuring same thing)
  • Improving/assessing reliability:
    • Experimental studies: standardised procedures - control of EVs
    • Observational studies: inter-observer reliability, behavioural categories
    • Self-report: inter-interviewer reliability
      • Internal: split-half method (random selection of half of test items, scores for both halves compared using CCs, should be consistent)
      • External: test-retest (repeat same e.g. questionnaire with same respondent on 2 occasions) - interval long enough so they cannot remember answers, not too long (thoughts etc. change)
    • Interviewers need better training (poor = lack consistency in way they ask Qs), test items should not be ambiguous
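
The split-half method above can be sketched as: randomly split the items into two halves, total each respondent's score on each half, then correlate the two sets of totals. This is only an illustration. The questionnaire responses are invented, the function names are hypothetical, and Pearson's r is assumed as the correlation coefficient.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of scores."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


def split_half(responses, seed=None):
    """responses: one list of item scores per respondent (same items for all)."""
    item_numbers = list(range(len(responses[0])))
    random.Random(seed).shuffle(item_numbers)  # random selection of half the items
    middle = len(item_numbers) // 2
    half_a, half_b = item_numbers[:middle], item_numbers[middle:]
    totals_a = [sum(person[i] for i in half_a) for person in responses]
    totals_b = [sum(person[i] for i in half_b) for person in responses]
    return pearson_r(totals_a, totals_b)


# Invented data: 5 respondents answering 4 items on a 1-5 scale.
responses = [[4, 5, 4, 5], [2, 1, 2, 2], [3, 3, 4, 3], [5, 5, 5, 4], [1, 2, 1, 1]]
consistency = split_half(responses, seed=0)
```

A coefficient close to +1 suggests the two halves are measuring the same thing. Test-retest can reuse `pearson_r` on the same respondents' total scores from two occasions.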

Selection of participants (sampling methods)

Target population > representative group of P's (sample) > generalise results

  • Opportunity: selecting those who are willing and available at the time and place. Easiest and quickest method (use P's you can find) BUT biased (some of the population excluded) and experimenter's unconscious bias = unrepresentative.
  • Volunteer/self-selection: asking for volunteers who select themselves to participate. Access to a variety of people, quick and easy, good way to get a specialised sample (purposive sampling) BUT biased (certain type of people - keen, helpful, inquisitive = volunteer bias)
  • Random: every member of the target population has an equal chance of selection. Potentially unbiased BUT the sample drawn may still be unrepresentative by chance (e.g. mostly certain types of people), takes more time and effort (need list of members of TP, identify sample, contact those identified)
  • Systematic: selecting every nth person (predetermined system). Largely unbiased (objective system used), BUT not truly random unless the first person is chosen using a random method and every nth person is then selected after that
  • Stratified: selected according to frequency in population. Subgroups (strata) identified - p's obtained from each in proportion to occurrence in TP. Selection = random. More representative than other sampling techniques (proportional representation of each subgroup), BUT very time-consuming (strata identification and calculation - need details of TP)
  • Quota: same as stratified BUT sample not randomly selected from categories (selection done by another method, e.g. opportunity sampling) Each quota may be biased due to the non-random sampling method - only have access to certain sections of the TP
  • Snowball: start with one or two people - they put researcher in contact with others suitable for research. Useful in research where difficult to identify/contact P's, simple and cheap, BUT prone to bias - may know each other well (similar) and little idea of true distribution of population
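
Three of the sampling methods above (random, systematic, stratified) can be sketched in Python. This is a rough illustration, not a standard procedure: the function names, the 100-person target population and the two strata are all assumptions.

```python
import random


def random_sample(population, n, seed=None):
    """Every member of the target population has an equal chance of selection."""
    return random.Random(seed).sample(population, n)


def systematic_sample(population, n, seed=None):
    """Random starting point, then every nth (here: every `step`th) person."""
    step = len(population) // n
    if step < 1:
        raise ValueError("Sample cannot be larger than the population")
    start = random.Random(seed).randrange(step)
    return [population[start + i * step] for i in range(n)]


def stratified_sample(strata, n, seed=None):
    """Random selection from each stratum in proportion to its size in the TP.

    Rounding means the total can differ slightly from n for awkward proportions.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        quota = round(n * len(members) / total)
        sample.extend(rng.sample(members, quota))
    return sample


# Hypothetical target population: 60 females and 40 males
population = [f"P{i}" for i in range(1, 101)]
strata = {"female": population[:60], "male": population[60:]}

print(random_sample(population, 10, seed=1))
print(systematic_sample(population, 10, seed=1))
print(stratified_sample(strata, 10, seed=1))  # 6 females and 4 males
```

Opportunity, volunteer, quota and snowball sampling depend on who happens to be available or comes forward, so they are not shown as code.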

Probability and significance

  • Probability (p): numerical measure of likelihood/chance certain events will occur (pattern due to chance or is there a real effect?)
  • Significance: statistical term - set of research findings sufficiently strong for research hypothesis under test to be accepted (small difference expected = random variation)
  • Null hypothesis (H0): no correlation/difference; Alternative hypothesis (H1): there is a correlation/difference

Significance level: (level of probability (p) at which it has been agreed to reject the null hypothesis)

  • 100% certainty (not due to chance) is unrealistic - acceptable cut-off point set at p ≤ 0.05 (95% level): a 5% or smaller probability that results like these would have occurred by chance if the null hypothesis were true
  • Some research: stricter level of significance may be needed (e.g. untried drugs/replicating previous study) - level could be set at p ≤ 0.01 or p ≤ 0.001

* Greater number of p's - lower value needed for significance
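
As a worked illustration of probability and the 5% level, the sketch below works out how likely a result is to arise by chance alone. The scenario is an assumption made for the example (a participant guessing on ten two-choice trials), as are the function names; it is not one of the inferential tests covered in these notes.

```python
from math import comb


def probability_by_chance(correct, trials, p_guess=0.5):
    """Probability of at least `correct` right answers out of `trials` by guessing."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


def is_significant(p, level=0.05):
    """True when the probability is at or below the chosen significance level."""
    return p <= level


p_nine = probability_by_chance(9, 10)    # 11/1024, about 0.011
p_eight = probability_by_chance(8, 10)   # 56/1024, about 0.055

print(f"9 or more correct: p = {p_nine:.3f}, significant at 0.05: {is_significant(p_nine)}")
print(f"8 or more correct: p = {p_eight:.3f}, significant at 0.05: {is_significant(p_eight)}")
print(f"9 or more correct at the stricter 0.01 level: {is_significant(p_nine, 0.01)}")
```

Nine or more correct passes the 5% cut-off but not the stricter 1% level, while eight or more just misses the 5% cut-off.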


Probability and significance cont'd

Type 1 and type 2 errors:

One reason for use of 5% significance level: good compromise between being too lenient (risking a type 1 error) and too stringent (risking a type 2 error)

  • Type 1: difference/relationship accepted as real when it is not - a true null hypothesis is wrongly rejected (more likely when signif. level set too high/lenient, e.g. 10%) = false positive
  • Type 2: a real difference/r'ship in the TP is dismissed as insignificant - a false null hypothesis is wrongly accepted (more likely when signif. level set too low/stringent, e.g. 1%) = false negative

Reducing chance of making type 1 error or type 2 error: increase sample size (bigger=less likely)

Increased risk of type 1 errors = reduced chance of type 2 errors (and vice versa)


Inferential tests

Drawing conclusions: help to draw inferences about TP based on sample tested - allows it to be inferred that pattern likely/unlikely to be due to chance - make sense of findings to explain human behaviour (what does it show about human behaviour in general)

  • Descriptive statistics (can be used when considering concl. - summary of data: general patterns and trends BUT cannot assume sample is same as population)

Observed values:

  • Each I.T. inv. taking collected data and doing calculations - produce single number (test statistic/calculated value/observed value)
    • rho (Spearman's Rank Order Correlation Coefficient)
    • U (Mann-Whitney test)
    • T (Wilcoxon Signed Ranks Test)
    • x^2 (Chi-Squared Test)
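
As an example of how an observed value is produced, here is a minimal sketch of Spearman's rho using the usual formula rho = 1 - 6Σd² / (n(n² - 1)), where d is the difference between each pair of ranks. It assumes there are no tied scores (ties need averaged ranks, which are not handled here); the function names and the data are invented.

```python
def rank(values):
    """Rank scores from 1 (smallest). Assumes no tied values."""
    ordered = sorted(values)
    if len(set(ordered)) != len(ordered):
        raise ValueError("Tied scores are not handled in this sketch")
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Observed value of rho for paired scores x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("Need at least two pairs of scores")
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank(x), rank(y)))
    return 1 - (6 * d_squared) / (n * (n**2 - 1))


# Hypothetical co-variables for six participants
hours_revised = [2, 5, 1, 8, 4, 7]
test_score = [50, 62, 45, 70, 65, 80]

rho = spearman_rho(hours_revised, test_score)
print(f"rho = {rho:.2f}")  # strong positive correlation
```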

Inferential tests cont'd

Critical values: (O.V. compared to a C.V. to test significance - must reach it to reject H0).
4 pieces of info. needed to find the appropriate C.V.:

1. Degrees of freedom (df): usually the number of p's (N) -- except Chi-squared = (rows-1)x(columns-1)
2. One-tailed (directional hypoth.)/two-tailed (non-directional hypoth.) test
3. Significance level selected: usually p</=0.05 (5%)
4. Whether O.V. needs to be greater or equal to/less than or equal to C.V. for signif. to be shown (written under table)
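
The chi-squared degrees of freedom rule in point 1 can be shown alongside the observed value itself. The sketch below uses the standard formula Σ(O - E)² / E, with each expected frequency E = row total × column total / overall total. The function name and the 2x2 table of frequencies are made up, and it assumes no row or column total is zero.

```python
def chi_squared(observed):
    """Return (observed value, degrees of freedom, expected frequencies).

    `observed` is a contingency table given as a list of rows of counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    statistic = 0.0
    expected = []
    for r, row in enumerate(observed):
        expected_row = []
        for c, o in enumerate(row):
            e = row_totals[r] * col_totals[c] / total
            expected_row.append(e)
            statistic += (o - e) ** 2 / e
        expected.append(expected_row)

    df = (len(observed) - 1) * (len(observed[0]) - 1)  # (rows-1) x (columns-1)
    return statistic, df, expected


# Hypothetical frequencies: rows = condition A/B, columns = helped/did not help
table = [[15, 5],
         [5, 15]]

chi, df, expected = chi_squared(table)
print(f"chi-squared = {chi:.1f}, df = {df}")
print("Expected frequencies:", expected)
```

The expected frequencies are worth printing because the test is treated as unreliable when any of them falls below 5.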

Levels of measurement: (NOIR)

  • Nominal: data in separate categories (e.g. grouping people as tall, medium or short) and counting how many are in each = frequency data
  • Ordinal: ordered in some way (e.g. in order of size/ranking) - 'difference' between each is not the same
  • Interval: measured using units of equal intervals (e.g. height in cm/counting correct answers) = more precise
  • Ratio: same as interval but ratio scale has a true zero point (e.g. Celsius temp. scale is NOT ratio, 24-hr time and age ARE) - data can be compared as multiples of each other

* Rating scales (e.g. Likert): not true interval (diff. between points not the same - arbitrarily determined intervals)/ordinal = plastic interval scale


Inferential tests cont'd: choosing + justifying

  • Spearman's rho test: test of correlation/association, related data (pairs of data from one person/thing - RM/MP), ordinal(/interval) data
  • Chi-squared: test of difference (conditions)/association (variables), independent groups/unrelated data, nominal data (frequencies - not in %)
    • Unreliable when expected frequencies (how often an event may be presumed to occur on average in a given number of trials) are below 5 in any cell (need at least 20 p's for a 2x2 contingency table)
  • Wilcoxon T test: difference (two sets of data), related data (repeated measures/matched pair), ordinal(/interval) data
  • Mann-Whitney U test: difference (2 sets of data), independent groups, ordinal/interval data
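
For the Mann-Whitney test, the observed value U can be found by counting, for each score in one group, how many scores in the other group it beats (a tie counts as half), then taking the smaller of the two totals. The sketch below follows that counting approach; the function names and scores are invented for illustration.

```python
def mann_whitney_u(group_a, group_b):
    """Observed value of U for two independent groups of scores."""

    def wins(first, second):
        # Count pairs where a score in `first` exceeds a score in `second`
        total = 0.0
        for x in first:
            for y in second:
                if x > y:
                    total += 1
                elif x == y:
                    total += 0.5
        return total

    u_a = wins(group_a, group_b)
    u_b = wins(group_b, group_a)
    return min(u_a, u_b)  # the smaller value is compared with the C.V.


# Hypothetical scores from two independent groups
condition_a = [12, 15, 9, 18, 20]
condition_b = [7, 8, 11, 10, 14]

u = mann_whitney_u(condition_a, condition_b)
print(f"U = {u}")
```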

* Adapt justifications to suit the particular research study

1. Identify level of measurement (referencing data)
2. Test of correlation or difference (justify)
3. If difference: independent groups or repeated measures (justify)
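
The three decisions above can be written as a small lookup. This is a sketch covering only the four tests in these notes; the function name and the argument labels ("nominal", "difference", "independent" etc.) are assumptions chosen for the example.

```python
def choose_test(level, purpose, design=None):
    """Pick an inferential test from level of measurement, purpose and design.

    level:   "nominal", "ordinal" or "interval"
    purpose: "correlation" or "difference"
    design:  "independent" or "related" (repeated measures/matched pairs)
    """
    if level == "nominal":
        # Frequency data in categories, independent groups
        return "Chi-squared"
    if level not in ("ordinal", "interval"):
        raise ValueError(f"Unknown level of measurement: {level}")
    if purpose == "correlation":
        return "Spearman's rho"
    if purpose == "difference":
        if design == "independent":
            return "Mann-Whitney U"
        if design == "related":
            return "Wilcoxon T"
    raise ValueError("Need purpose and, for a test of difference, the design")


print(choose_test("ordinal", "correlation"))
print(choose_test("interval", "difference", "independent"))
print(choose_test("ordinal", "difference", "related"))
print(choose_test("nominal", "difference", "independent"))
```

The justification still needs to be written with reference to the actual data and design of the study in question.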

Stating conclusions (features):

  • O.V. (x) is greater/less than the C.V. (y) for p ____, one-tailed/two-tailed test. We can reject/accept null hypothesis (restate hypothesis you are accepting)
    • Spearman and chi-square: O.V. must be greater than/=C.V.; Wilcoxon and Mann-Whitney: less than/= C.V.
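
The direction of the comparison for each test can be captured in a short helper that also drafts the concluding sentence. This is a sketch only: the function names are invented, and the critical values passed in below are placeholders standing in for values read from the relevant table.

```python
# Direction of comparison for each test, as stated in the notes
NEEDS_GREATER_OR_EQUAL = {"spearman", "chi-squared"}
NEEDS_LESS_OR_EQUAL = {"wilcoxon", "mann-whitney"}


def is_significant(test, observed, critical):
    """Compare the observed value with the critical value for the given test."""
    if test in NEEDS_GREATER_OR_EQUAL:
        # abs() so that the sign of a negative rho is ignored (an assumption here)
        return abs(observed) >= critical
    if test in NEEDS_LESS_OR_EQUAL:
        return observed <= critical
    raise ValueError(f"Unknown test: {test}")


def state_conclusion(test, observed, critical, level, tails):
    """Draft the concluding sentence; the hypothesis still needs restating."""
    significant = is_significant(test, observed, critical)
    if test in NEEDS_GREATER_OR_EQUAL:
        comparison = "greater than or equal to" if significant else "less than"
    else:
        comparison = "less than or equal to" if significant else "greater than"
    decision = "reject" if significant else "accept"
    return (
        f"The observed value ({observed}) is {comparison} the critical value "
        f"({critical}) for p <= {level}, {tails}-tailed test, "
        f"so we {decision} the null hypothesis."
    )


# Placeholder critical values, as if looked up in a table
print(state_conclusion("chi-squared", 10.0, 3.84, 0.05, "two"))
print(state_conclusion("mann-whitney", 4, 2, 0.05, "two"))
```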

Analysis and presentation: descr. stats

  • Graphs display data in a pictorial fashion (easily understandable alternative to numerical data)
  • Should be titled and each axis labelled: vertical (y-axis) - represents the DV (height approx. 3/4 of the width of the x-axis), horizontal (x-axis) - represents the IV
  • Bar charts:
    • Nominal data (categories)
    • Columns on x-axis (all same width and separated by space) - discrete data: limited set of distinct values
  • Histograms:
    • No spaces between columns (continuous data - unlimited possible values = interval/ordinal)
    • Construct equal-sized categories for data (e.g. 0-9, 10-19...)
    • Column width is equal
  • Frequency polygons:
    • Continuous (interval or ordinal data) - like histogram
    • Line used to join the mid-point of each class interval
    • Allows two or more frequencies to be compared on same graph
  • Scattergraphs:
    • Allow representation of correlation between two co-variables
    • Can display +ve, -ve or no correlations
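
Before a histogram or frequency polygon can be drawn, scores have to be grouped into equal-sized categories. The sketch below counts how many scores fall into each class interval; it assumes whole-number scores, and the function name and data are invented.

```python
from collections import Counter


def class_interval_frequencies(scores, width=10):
    """Count whole-number scores in equal-sized class intervals (0-9, 10-19...)."""
    counts = Counter((score // width) * width for score in scores)
    lowest, highest = min(counts), max(counts)
    # Include empty intervals so the columns stay continuous
    return {
        f"{low}-{low + width - 1}": counts.get(low, 0)
        for low in range(lowest, highest + width, width)
    }


# Hypothetical test scores
scores = [3, 7, 12, 15, 18, 21, 25, 29, 34, 8]

for interval, frequency in class_interval_frequencies(scores).items():
    print(f"{interval:>6}: {'#' * frequency}")
```

Each row of the printout corresponds to one column of the histogram; a frequency polygon would join the mid-points of the same intervals.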

Descriptive statistics

Measures of central tendency: (inform about central/middle values - average - for data set)

  • Mean (add all values and divide by the number of values) makes use of the values of all of the data, BUT can be distorted by extreme values (anomalies) = unrepresentative of data as a whole, cannot be used for nominal data
  • Median (mid. value in ordered list) not affected by extreme scores, doesn't reflect all of the values (not as sensitive as mean)
  • Mode useful for nominal data, not useful with ordinal or when several modes
  • *Symmetrically distributed data (use any measure), skewed (use median)

Measures of dispersion: (inform about spread of data - how far data is from midpoint)

  • Range easy to calculate, direct info. with little calculation, may be affected by extreme values, doesn't take into account the size of the set of values
  • Standard deviation (measure of variation in data set and spread around the mean) more precise (all values taken into account), affected by extreme values that increase SD = less representative of data set
    • Smaller SD = data clustered more closely around the mean (mean more representative of the data set)
    • Calculate mean + see how close data is to the mean
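
The measures above can be calculated with Python's built-in statistics module. The sketch below uses made-up scores to show how one extreme value pulls the mean, range and SD while leaving the median almost untouched; the function name is an assumption, and the sample SD (dividing by n - 1) is used.

```python
import statistics


def describe(scores):
    """Measures of central tendency and dispersion for a list of scores."""
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "modes": statistics.multimode(scores),
        "range": max(scores) - min(scores),
        "sd": statistics.stdev(scores),  # sample SD (divides by n - 1)
    }


# Hypothetical scores; 35 is an extreme value (anomaly)
with_extreme = describe([4, 5, 5, 6, 7, 8, 35])
without_extreme = describe([4, 5, 5, 6, 7, 8])

for label, result in (("With 35", with_extreme), ("Without 35", without_extreme)):
    print(label)
    for measure, value in result.items():
        print(f"  {measure}: {value}")
```

With the extreme score the mean rises from about 5.8 to 10 although the median only moves from 5.5 to 6, which is why the median is preferred for skewed data.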

Comparing qualitative and quantitative data


Qualitative data:

  • Represents true complexities of human behaviour AND gains access to thoughts and feelings
  • More difficult to detect patterns, summarise and draw conclusions AND subjective analysis - personal expectations, beliefs (quantitative methods only seem to be objective - may be equally biased)


Quantitative data:

  • Easier to analyse (numerical), summarised readily (measures of CT and dispersion, use of graphs) AND produces neat conclusions
  • Oversimplifies reality and human experiences (statistically signif. but humanly insignif.)

Qualitative research design and analysis

  • Broad aims (view phen. as a whole, record 'reality' from perspective of p's)
  • Research question (focused questions, but hypotheses avoided - may bias observations)
  • Methods: unstructured/semi-structured interview (but also structured), direct observation (naturalistic, PO and disclosed = ethnographic - study of cultural group in natural context, sustained period), indirect observation (artefacts), case studies, diaries (p's and r's keep record), focus groups (shared interest etc.)
  • Sampling: more than one person/event, smaller samples than for quantitative (5-15), diversity, purposive sampling (who would be appropriate - when limited number suitable)
  • Recording data: full record - all material and peripheral activity (recording, transcribed, paralinguistics, non-linguistic, record of own feelings etc.)

Denies existence of one real world (trad. approach in psych.) - view phen. from perspectives of those who experience it

Subjective bias = inevitable (must be recognised instead of being minimised/removed - dealt with by being acknowledged as part of the research itself = improves scientific nature)

  • Reflexivity: extent to which process of research reflects researcher's values and thoughts (qual. analysis depends on r's perceptions as well as those of people who are experiencing the phenomena)

Demonstrating validity through using triangulation (compare results from variety of different studies of same thing/person using diff. methodologies)


Qual. research design and analysis

Thematic analysis (themes/categories identified before starting research, responses organised accordingly)

  • Difficult to summarise -  summarised by identifying repeated themes
  • Iterative approach: very lengthy process (painstaking and iterative - every item carefully considered, data gone through repeatedly)
  • Main intentions:
    • Impose some kind of order
    • Ensure this represents p's perspective
    • Bottom-up approach: order emerged from data not preconceptions
    • Summarise data - many pages/hours of data can be reduced
    • Enable themes to be identified + general conclusions drawn
  • Inductive/'bottom up' approach (most qual. analysis aims to use this) - categories/themes emerge based on data - may lead to new theories = 'emergent theory'
  • Deductive/'top down' (less common) - preset categories/themes generated from previous studies/theories - see if data consistent with previous views

Qualitative analysis cont'd

Thematic analysis: general principles

1. Read and reread transcript (understand meaning etc.), 2. Break data into meaningful units (can independently convey meaning), 3. Label/code each unit (initial categories, 'top down' - provided by existing theories), 4. Combine simple codes into larger categories/themes, 5. Check emergent categories (does new set of data fit?), 6. Final report (discuss and use quotes etc. - illustrate emergent categories), 7. Conclusions drawn (may incl. new theories)

Quantitative outcomes:

Results can be reduced to quantitative form, e.g. content analysis

Categories commonly selected and frequency of events in each category counted

Analysis = qualitative (meanings retained)

Thematic analysis: categories may emerge from the data or from existing expectations
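
Reducing coded qualitative data to quantitative form, as in content analysis, amounts to counting how often each category occurs. A minimal sketch, assuming the transcript has already been broken into units and coded; the category labels below are invented.

```python
from collections import Counter

# Hypothetical codes assigned to six meaningful units from a transcript
coded_units = ["aggression", "sharing", "aggression", "play", "sharing", "aggression"]

frequencies = Counter(coded_units)

# Most frequent category first
for category, count in frequencies.most_common():
    print(f"{category}: {count}")
```

The counts summarise the data, but quotes from the transcript are still needed to retain the meanings behind each category.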


Designing a study

Experiment/study using correlational analysis:

1. Aims/questions (incl. TP), 2. IV and DV, 3. Kind of experiment and design, 4. Hypothesis, 5. Selection of P's, 6. Ethical issues, 7. Problems with validity, 8. Standardised procedures, 9. Pilot study, 10. Consider statistics to use


Questionnaire/interview:

1. Aims/questions, 2. Design of questionnaire/interview, 3. Pilot study, 4. Selection of P's, 5. Ethical issues, 6. Problems with validity, 7. Problems with reliability, 8. SPs, 9. Statistics to use

Observation/content analysis:

1. Aims/questions, 2. Kinds of observation, 3. Materials (behaviour checklist etc), 4. Pilot, 5. Selection of p's/observations, 6. Ethical issues, 7. Problems with validity, 8. Reliability, 9. SPs, 10. Statistics to use


Reporting a study

  • Title (readers know what report is about, decide whether or not to read it - say as much as possible about content in as few words as possible)
  • Abstract (summary - covering aims, hypothesis, method, results, conclusions - incl. implications - can decide whether full report is worth reading)
  • Introduction and aims/hypothesis: review past research and findings (reasons for study), general to focused, decide on aims/hypotheses, directional/non-directional
  • Method: (detailed description of procedures - enough info. so can replicate)
    • Design and materials (R/V, Experiments: research design/Questionnaires/interviews: type (incl.Q type)/Obs. types), P's (suitable sampling technique): size, composition, details, if IGs how assigned, Ethical issues and how to deal with them, Procedure (replicability, standardised instructions, testing env., order of events..)
  • Results: incl. descr. statistics (tables, graphs, measures of CT and disp.) and inferential stats (justified, OV, signif. level) - statement of whether null hypoth. rejected/accepted
  • Discussion: (interpret results, consider implications for future research, suggesting RWAs)
    • Summarise results (in case reader skipped/to remind), r'ship to previous research, consideration of method (criticisms/improvements), implications for psych. theory/RWAs
  • References: full detail of journal articles/books
  • Appendices: incl. detailed info. that would be distracting in main body (e.g. stats, questionnaires used) - enables reader not to read them if they choose