- What is the distribution of disorder types across the events?
- How many events occurred in each location mentioned in the dataset?
- What is the breakdown of event types and sub-event types?
- Are there any patterns in the geographical distribution of events (using latitude and longitude)?
- What is the most common actor type involved in these events?
- Is there a correlation between the type of event and the crowd size (where reported)?
- How do the events differ in terms of geo_precision, and what might this indicate about data reliability?
- What is the distribution of events across different source scales (National vs. Subnational)?
- Are there any trends in the fatalities reported across different event types?
- How do the associated actors vary across different types of protests or demonstrations?
- What insights can be drawn from the time_precision column in relation to the events reported?
- Is there any correlation between the location of events and the type of source reporting them?
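Several of the questions above are frequency counts over one or more categorical columns. A minimal sketch of that kind of tally follows, using only the Python standard library. The column names `event_type`, `sub_event_type` and `source_scale` are assumptions taken from the wording of the questions, `count_by` is a hypothetical helper, and the three sample rows are made up for illustration rather than drawn from the dataset.

```python
from collections import Counter


def count_by(rows, *columns):
    """Count rows per unique combination of the given columns."""
    return Counter(tuple(row[c] for c in columns) for row in rows)


# Made-up sample standing in for rows read with csv.DictReader.
rows = [
    {"event_type": "Protests", "sub_event_type": "Peaceful protest",
     "source_scale": "National"},
    {"event_type": "Protests", "sub_event_type": "Peaceful protest",
     "source_scale": "Subnational"},
    {"event_type": "Riots", "sub_event_type": "Mob violence",
     "source_scale": "National"},
]

event_counts = count_by(rows, "event_type")
pair_counts = count_by(rows, "event_type", "sub_event_type")
scale_counts = count_by(rows, "source_scale")

# Print each (event type, sub-event type) pair with its count, largest first.
for key, n in pair_counts.most_common():
    print(" / ".join(key), n)
```

The same helper would apply to any other categorical column in the file, such as the disorder type or location, provided the column names match the actual header.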
1. Dataset Overview
– Shape: 65,535 events (rows) with 31 features (columns)
– First 5 rows: Protest/rally events with 0 fatalities, dated December 2024
– Key Insight: Early entries suggest many non-violent protest events
2. Fatalities Analysis
Basic Statistics:
– Mean: 0.048 fatalities/event
– Median & Mode: 0 fatalities
– Range: 0-35 fatalities
– Std Dev: 0.386 (small in absolute terms, but about 8 times the mean)
– Skewness: 29.14 (extreme right skew)
– Kurtosis: 1767 (extreme peakedness with heavy tail)
Interpretation:
– At least 75% of events have 0 fatalities (Q3=0)
– The outlier count below suggests about 96.7% of events have 0 fatalities, so well over 95% have ≤1 fatality
– Extreme outliers exist (max=35 deaths)
– Distribution is non-normal (Shapiro-Wilk p<0.001, reported as 0.000)
Outliers:
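– 2,183 events (3.3%) are flagged as outliers; since the flagged values start at 1, this appears to cover every event with at least one fatality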
– Outliers range 1-35 fatalities (mean=1.45)
– Indicates rare but severe violent incidents
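The statistics in this section can be reproduced along the following lines. This is a sketch under assumptions, not the original computation: the outlier rule is assumed to be the usual 1.5 × IQR fence (which would explain why outliers start at 1 when Q1 = Q3 = 0), the skewness is the plain population formula and may differ slightly from a bias-corrected library value, and the sample values are made up.

```python
import statistics


def summarize(values):
    """Return basic statistics plus an IQR-based outlier count."""
    n = len(values)
    mean = statistics.fmean(values)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    # Upper fence of the 1.5 * IQR rule; with q1 == q3 == 0 the fence is 0,
    # so every event with at least one fatality is flagged.
    upper_fence = q3 + 1.5 * (q3 - q1)
    # Second and third central moments for the population skewness.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return {
        "mean": mean,
        "median": median,
        "q3": q3,
        "std_dev": statistics.stdev(values),
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else float("nan"),
        "outliers": sum(1 for v in values if v > upper_fence),
    }


# Made-up, zero-heavy sample: 18 events without fatalities, two with.
fatalities = [0] * 18 + [1, 3]
summary = summarize(fatalities)
print(summary)
```

With a zero-inflated column like this one, the fence collapses to zero, so the "outlier" count is really a count of lethal events.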
3. Event-Type Analysis
By Event Type:
1. Battles: Most deadly (mean=0.93/event)
2. Violence vs Civilians: Second deadliest (mean=0.46)
3. Riots: Most frequent violent event (6,818 cases) but low lethality
By Sub-Event Type:
1. Armed Clashes: 1,173 fatalities
2. Attacks: 1,036 fatalities
3. Mob Violence: 700 fatalities
Key Insight: Organized violence (battles/attacks) is deadlier than spontaneous violence
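The per-type figures above combine three quantities: event count, total fatalities, and mean fatalities per event. A small sketch of that grouping is given here. The helper `fatalities_by` is hypothetical, the column names `event_type` and `fatalities` are assumed, and the rows are invented for illustration.

```python
from collections import defaultdict


def fatalities_by(rows, column):
    """Group rows by `column` and summarize the fatalities in each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(int(row["fatalities"]))
    return {
        key: {"events": len(v), "total": sum(v), "mean": sum(v) / len(v)}
        for key, v in groups.items()
    }


# Made-up rows; fatalities are strings, as csv.DictReader would return them.
rows = [
    {"event_type": "Battles", "fatalities": "2"},
    {"event_type": "Battles", "fatalities": "0"},
    {"event_type": "Riots", "fatalities": "0"},
    {"event_type": "Riots", "fatalities": "1"},
    {"event_type": "Riots", "fatalities": "0"},
    {"event_type": "Riots", "fatalities": "0"},
]

by_type = fatalities_by(rows, "event_type")

# Rank event types by mean fatalities per event, deadliest first.
ranked = sorted(by_type.items(), key=lambda kv: kv[1]["mean"], reverse=True)
for name, stats in ranked:
    print(name, stats)
```

Reporting count, total and mean together helps separate frequent but low-lethality types from rare but deadly ones.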
4. Temporal Patterns
– Yearly Analysis: Data shows 2024 entries only (partial year data)
– Monthly Analysis: Time series plot (not shown) would require full-year data
5. Spatial Patterns
– Top Locations: Plot shows specific hotspots (exact locations not listed)
– Geo Analysis: Latitude/longitude data available for mapping clusters
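One simple way to look for clusters in the latitude/longitude columns is to bin events into grid cells and count events per cell. The sketch below assumes columns named `latitude` and `longitude`; `grid_counts` and the cell size are illustrative choices, and the coordinates are made up.

```python
import math
from collections import Counter


def grid_counts(rows, cell_deg=1.0):
    """Count events per grid cell of `cell_deg` degrees on each side."""
    counts = Counter()
    for row in rows:
        lat = float(row["latitude"])
        lon = float(row["longitude"])
        # floor() keeps negative coordinates in the correct cell.
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


# Made-up coordinates for illustration only.
rows = [
    {"latitude": "28.61", "longitude": "77.21"},
    {"latitude": "28.70", "longitude": "77.10"},
    {"latitude": "19.07", "longitude": "72.88"},
]

cells = grid_counts(rows, cell_deg=1.0)
for cell, n in cells.most_common(5):
    print(cell, n)
```

Degree-based cells are not equal in area (they shrink toward the poles), so this is a rough first pass rather than a substitute for proper spatial clustering.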
6. Correlation Analysis
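– Matrix shows relationships between fatalities and:
  – Year (temporal correlation; with only 2024 entries the year is constant, so this coefficient is undefined)
  – Latitude/Longitude (spatial patterns)
– Exact correlations are not shown, so the strength of these relationships cannot be assessed here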
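Two points are worth checking when computing these correlations. A column with a single value (such as the year, if the data cover 2024 only) has zero variance, so its correlation with anything is undefined. For a heavily skewed variable like fatalities, a rank-based (Spearman) coefficient is often a safer choice than Pearson. The sketch below implements both in plain Python; the function names and sample values are illustrative assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns None when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # zero variance: the coefficient is undefined
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values for illustration only.
year = [2024, 2024, 2024, 2024, 2024]
latitude = [10.0, 20.0, 30.0, 40.0, 50.0]
fatalities = [0, 0, 1, 0, 5]

print("year vs fatalities:", pearson(year, fatalities))
print("latitude vs fatalities (Spearman):", spearman(latitude, fatalities))
```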
7. Data Quality Notes
– No missing values in fatalities column
– High precision: 0-mortality events well-documented
– Source Scale: Mix of national/subnational sources
Key Conclusions
1. Conflict Nature:
– Mostly non-lethal protests (51,409 protest events)
– Occasional high-casualty outbreaks
2. Violence Profile:
– Battles → Highest per-event lethality
– Riots → Most frequent violence type
– Sexual violence is present but accounts for few deaths (20 fatalities)
3. Data Characteristics:
– Zero-inflated distribution
– Requires non-parametric statistical methods
– Outliers represent critical security events
4. Research Implications:
– Focus on armed clashes for casualty prevention
– Low protest fatalities are consistent with effective protest management, though the data alone do not establish the cause
– Spatial analysis needed for hotspot identification
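As an illustration of the non-parametric methods mentioned under Data Characteristics, a permutation test can compare mean fatalities between two event types without assuming normality. The sketch below is one possible approach, not the method used in the analysis; `permutation_test`, the resample count, and the sample values are all illustrative assumptions.

```python
import random


def mean_gap(a, b):
    """Absolute difference between the means of two samples."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


def permutation_test(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation test on the difference in means.

    Returns an estimated p-value: the share of random relabelings whose
    mean gap is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean_gap(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        if mean_gap(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Made-up fatality counts for two event types.
battles = [0, 1, 2, 0, 3, 1, 0, 2]
protests = [0] * 20

p_value = permutation_test(battles, protests)
print("estimated p-value:", p_value)
```

A small p-value here would suggest that the gap in mean fatalities is unlikely to arise from random labeling alone. With tens of thousands of events, the resample count and runtime would need to be balanced.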