Final Project
CS 7250: Information Visualization: Theory and Applications
Dr. Lace Padilla
Group Members:
Satyam Shrivastava
Pritish Arora
(Due) Dec 12, 2023
Visualization is not merely a tool for aesthetic representation; it is a medium through which we can distill intricate historical narratives into compelling stories that resonate with a broad audience.
In embarking on our data visualization journey, we have chosen to delve into the profound and impactful domain of the Vietnam War. Our commitment to this project stems not only from its historical significance but also from the inherent power of visualizations to shed light on complex issues and stimulate meaningful conversations.
In the subsequent sections, we detail the dataset we have chosen and why we chose it, our goals for the project, the questions that guide our exploration, the tools we employ, and our vision for presenting answers through visualizations.
In our exploration of the Vietnam War, we've chosen to focus specifically on the bombings that occurred during this intense period in history. The intensity and scale of the bombings in Vietnam have left an indelible mark on the landscape and the collective memory of the people. By delving into this aspect, we aim to unravel the layers of historical data, providing a comprehensive view of the strategic and operational dynamics that shaped this conflict.
This dataset of Vietnam War bombing records comes from the Theater History of Operations (THOR), a compilation of historic aerial bombings spanning World War I through Vietnam. With about 4.7 million rows, each detailing a bombing run, THOR is a valuable resource that has not only aided in locating unexploded ordnance in Southeast Asia but has also contributed to refining Air Force combat tactics. Despite the inherent challenges in the data, such as duplicated sorties and non-standardized mission/operation naming, THOR presents an unparalleled opportunity to analyze and visualize the patterns and impact of Vietnam War bombings.
Data Dictionary can be found here
The Vietnam War bombings dataset holds immense historical significance, offering a window into a pivotal period marked by conflict and complex military operations. The importance of this dataset extends beyond its historical value and its understanding is crucial for several reasons:
Through our exploration of the Vietnam War bombings, we aim to bridge the gap between historical data and contemporary understanding, showcasing the best visualization practices.
In this section, we embark on a journey of exploration into the Vietnam War bombings dataset, outlining key questions that aim to provide both a broad overview and a deep understanding of the historical events captured within. This dual approach, encompassing breadth and depth, serves as an exploratory data analysis (EDA) and lays the foundation for framing a compelling visualization story.
Our initial set of questions aims to cast a wide net, capturing the overarching patterns and characteristics of the Vietnam War bombings:
Building upon the insights gained from the initial breadth of analysis, our follow-up questions aim to delve deeper into specific aspects, unraveling the intricacies of the Vietnam War bombings:
This dual approach, encompassing both breadth and depth of analysis, lays the groundwork for our visualization story. By systematically addressing these questions, we aim to not only uncover historical trends but also to shape a narrative that resonates with the nuances and complexities of the Vietnam War bombings. The visualization story that emerges from this process will serve as a powerful tool for education, reflection, and, ultimately, meaningful conversation.
In this analysis, we employ Python as the primary programming language, along with its associated libraries for data processing, visualization, and analysis. The Jupyter Notebook environment serves as our interactive workspace, seamlessly integrating code, visualizations, and explanatory text.
For data manipulation and preprocessing, we rely on Pandas, which handles data loading, cleaning, and transformation. For visualization, we use Matplotlib and Plotly to create static and interactive charts and maps. These libraries help us present our insights clearly.
These advanced visualization tools, coupled with the concepts learned throughout this course, will elevate our storytelling capabilities, offering a richer and more immersive experience for our audience.
# Import the required libraries
import pandas as pd
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', 100)
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
pio.renderers.default = "notebook_connected"
from raceplotly.plots import barplot
import warnings
warnings.filterwarnings("ignore")
# ANSI escape codes used to style printed notebook output
class color:
    PURPLE = '\033[95m'
    CYAN = '\033[96m'
    DARKCYAN = '\033[36m'
    BLUE = '\033[94m'
    GREEN = '\033[92m'
    YELLOW = '\033[93m'
    RED = '\033[91m'
    BOLD = '\033[1m'
    UNDERLINE = '\033[4m'
    END = '\033[0m'
Overview:
The data set comprises three main files: Bombing Operations (fact table), Aircraft Glossary (dimension table), and Weapons Glossary (dimension table). The Bombing Operations file contains about 4.7 million rows (4,670,416), each detailing a bombing run with information such as operation details, aircraft used, weapons deployed, and target coordinates.
Data Issues:
Data Quality & Transformations:
Data Quality Summary:
# Load the datasets from CSV files - make sure data folder containing CSV files is in the same folder as Notebook
# Bombing Operations
Bomb_Ops_df = pd.read_csv('./thor-vietnam-war-data/thor_data_vietnam.csv',
encoding='ISO-8859-1', low_memory=False)
# Aircraft Glossary
Air_Gloss_df = pd.read_csv('./thor-vietnam-war-data/THOR_VIET_AIRCRAFT_GLOSS.csv',
encoding='ISO-8859-1')
# Weapons Glossary
Wpn_Gloss_df = pd.read_csv('./thor-vietnam-war-data/THOR_VIET_WEAPON_GLOSS.csv',
encoding='ISO-8859-1')
# Standardize column names to lowercase for Bomb_Ops_df
Bomb_Ops_df.columns = Bomb_Ops_df.columns.str.lower()
# Standardize column names to lowercase for Air_Gloss_df
Air_Gloss_df.columns = Air_Gloss_df.columns.str.lower()
# Standardize column names to lowercase for Wpn_Gloss_df
Wpn_Gloss_df.columns = Wpn_Gloss_df.columns.str.lower()
# Check the data dimensions - number of rows and columns
print(f'Bomb_Ops_df: {Bomb_Ops_df.shape[0]} rows, and {Bomb_Ops_df.shape[1]} columns')
print(f'Air_Gloss_df: {Air_Gloss_df.shape[0]} rows, and {Air_Gloss_df.shape[1]} columns')
print(f'Wpn_Gloss_df: {Wpn_Gloss_df.shape[0]} rows, and {Wpn_Gloss_df.shape[1]} columns')
Bomb_Ops_df: 4670416 rows, and 47 columns
Air_Gloss_df: 104 rows, and 8 columns
Wpn_Gloss_df: 294 rows, and 6 columns
# Rename specific columns
Bomb_Ops_df.rename(columns={'tgtlatdd_ddd_wgs84': 'tgt_latitude',
'tgtlonddd_ddd_wgs84': 'tgt_longitude'},
inplace=True)
# Replace values in msndate column - 19700229 is not a valid date - 1970 was not a leap year
Bomb_Ops_df.loc[Bomb_Ops_df['msndate'] == "19700229", 'msndate'] = "19700228"
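The invalid date patched above was a single known value. A minimal sketch of how other impossible calendar dates could be located is shown below: parse with `errors='coerce'` and list whatever fails. It runs on a toy Series, and it assumes the raw `msndate` values are `YYYYMMDD` strings, which the literal `19700229` suggests but which is not confirmed here.

```python
import pandas as pd

# Toy stand-in for the raw msndate column (assumed to hold YYYYMMDD strings)
raw_dates = pd.Series(["19700228", "19700229", "19720229", "19711305"])

# Impossible or unparseable dates become NaT instead of raising an error
parsed = pd.to_datetime(raw_dates, format="%Y%m%d", errors="coerce")

# Raw values that failed to parse and need manual review
invalid_dates = raw_dates[parsed.isna()].tolist()
print(invalid_dates)
```

Listing the failures first makes it possible to decide case by case how each value should be corrected, rather than letting them silently turn into missing dates later.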
Bomb_Ops_df.head(10)
| | thor_data_viet_id | countryflyingmission | milservice | msndate | sourceid | sourcerecord | valid_aircraft_root | takeofflocation | tgt_latitude | tgt_longitude | tgttype | numweaponsdelivered | timeontarget | weapontype | weapontypeclass | weapontypeweight | aircraft_original | aircraft_root | airforcegroup | airforcesqdn | callsign | flthours | mfunc | mfunc_desc | missionid | numofacft | operationsupported | periodofday | unit | tgtcloudcover | tgtcontrol | tgtcountry | tgtid | tgtorigcoords | tgtorigcoordsformat | tgtweather | additionalinfo | geozone | id | mfunc_desc_class | numweaponsjettisoned | numweaponsreturned | releasealtitude | releasefltspeed | resultsbda | timeofftarget | weaponsloadedweight |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 351 | UNITED STATES OF AMERICA | USAF | 1971-06-05 | 647464 | SEADAB | EC-47 | TAN SON NHUT | NaN | NaN | NaN | 0 | 1005.0 | NaN | NaN | 0 | EC47 | EC47 | NaN | NaN | STEEL 5 | 70 | 34.0 | RADIO DIRECT FINDER | 2624 | 1 | NaN | D | 360TEW | NaN | NaN | CAMBODIA | NaN | NaN | NaN | NaN | UNIT: 360TEW - CALLSIGN: STEEL 5 | NaN | 27135863 | NONKINETIC | -1 | -1 | NaN | NaN | NaN | 1005.0 | 0 |
1 | 2 | UNITED STATES OF AMERICA | USAF | 1972-12-26 | 642778 | SEADAB | EC-47 | NAKHON PHANOM | NaN | NaN | NaN | 0 | 530.0 | NaN | NaN | 0 | EC47 | EC47 | NaN | NaN | BARON 6 | 0 | 74.0 | EXTRACTION (GPES) | 2909 | 1 | NaN | D | 361TEW | NaN | NaN | SOUTH VIETNAM | NaN | NaN | NaN | NaN | UNIT: 361TEW - CALLSIGN: BARON 6 | NaN | 27131177 | NONKINETIC | -1 | -1 | NaN | NaN | NaN | 530.0 | 0 |
2 | 3 | UNITED STATES OF AMERICA | USAF | 1973-07-28 | 642779 | SEADAB | RF-4 | UDORN AB | NaN | NaN | NaN | 0 | 730.0 | NaN | NaN | 0 | RF4 | RF4 | NaN | NaN | ATLANTA | 30 | 18.0 | VISUAL RECCE | 3059 | 1 | NaN | D | 432TRW | NaN | NaN | LAOS | NaN | NaN | NaN | NaN | UNIT: 432TRW - CALLSIGN: ATLANTA | NaN | 27131178 | NONKINETIC | -1 | -1 | NaN | NaN | NaN | 730.0 | 0 |
3 | 4 | UNITED STATES OF AMERICA | USAF | 1970-02-02 | 642780 | SEADAB | A-1 | NAKHON PHANOM | 16.902500 | 106.014166 | TRUCKS | 2 | 1415.0 | BLU27 FIRE BOMB (750) | NaN | 750 | A1 | A1 | NaN | NaN | FF32 | 68 | 1.0 | STRIKE | 1047 | 2 | NaN | N | 56SOW | NaN | NaN | LAOS | NaN | 165409N1060051E | DDMMSSN DDDMMSSE | NaN | UNIT: 56SOW - CALLSIGN: FF32 | XE | 27131179 | KINETIC | -1 | -1 | NaN | NaN | SECONDARY FIRE | 1415.0 | 17400 |
4 | 5 | VIETNAM (SOUTH) | VNAF | 1970-10-08 | 642781 | SEADAB | A-37 | DANANG | 14.945555 | 108.257222 | BASE CAMP AREA | 0 | 1240.0 | NaN | NaN | 0 | A37 | A37 | NaN | NaN | TIGER 41 | 28 | 5.0 | CLOSE AIR SUPPORT | B542 | 2 | NaN | D | 516FS | NaN | NaN | SOUTH VIETNAM | NaN | 145644N1081526E | DDMMSSN DDDMMSSE | NaN | UNIT: 516FS - CALLSIGN: TIGER 41 | ZB | 27131180 | KINETIC | -1 | -1 | NaN | NaN | RNO WEATHER | 1240.0 | 0 |
5 | 6 | UNITED STATES OF AMERICA | USAF | 1970-11-25 | 642782 | SEADAB | F-4 | UBON AB | 19.602222 | 103.597222 | AAA\37MM CR MORE | 6 | 650.0 | MK 82 GP BOMB (500) LD | NaN | 500 | F4 | F4 | NaN | NaN | JASPER | 57 | 1.0 | STRIKE | 1407 | 2 | NaN | D | 8TFW | NaN | NaN | LAOS | NaN | 193608N1033550E | DDMMSSN DDDMMSSE | NaN | UNIT: 8TFW - CALLSIGN: JASPER | UG | 27131181 | KINETIC | -1 | -1 | NaN | NaN | DAMAGED | 650.0 | 31860 |
6 | 7 | UNITED STATES OF AMERICA | USN | 1972-03-08 | 642783 | SEADAB | A-4 | TONKIN GULF | 14.573611 | 106.689722 | TRUCKS | 0 | 1005.0 | NaN | NaN | 0 | A4 | A4 | NaN | NaN | CD H | 16 | 1.0 | STRIKE | 9064 | 2 | NaN | D | 775CTG | NaN | NaN | LAOS | NaN | 143425N1064123E | DDMMSSN DDDMMSSE | NaN | UNIT: 775CTG - CALLSIGN: CD H | XB | 27131182 | KINETIC | -1 | -1 | NaN | NaN | RNO NONVISUAL | 1005.0 | 0 |
7 | 8 | UNITED STATES OF AMERICA | USAF | 1971-12-27 | 642784 | SEADAB | F-4 | UDORN AB | NaN | NaN | NaN | 0 | 0.0 | NaN | NaN | 0 | F4 | F4 | NaN | NaN | FALCON80 | 0 | NaN | NaN | 7661 | 2 | NaN | NaN | 432TRW | NaN | NaN | LAOS | NaN | NaN | NaN | NaN | UNIT: 432TRW - CALLSIGN: FALCON80 | NaN | 27131183 | NONKINETIC | -1 | -1 | NaN | NaN | NaN | 0.0 | 0 |
8 | 9 | UNITED STATES OF AMERICA | USN | 1972-05-24 | 642785 | SEADAB | A-7 | TONKIN GULF | NaN | NaN | NaN | 0 | 0.0 | NaN | NaN | 0 | A7 | A7 | NaN | NaN | CD CS | 0 | NaN | NaN | 9205 | 4 | NaN | NaN | 776CTG | NaN | NaN | NORTH VIETNAM | NaN | NaN | NaN | NaN | UNIT: 776CTG - CALLSIGN: CD CS | NaN | 27131184 | NONKINETIC | -1 | -1 | NaN | NaN | NaN | 0.0 | 0 |
9 | 10 | UNITED STATES OF AMERICA | USAF | 1972-09-12 | 642786 | SEADAB | EC-47 | TAN SON NHUT | NaN | NaN | NaN | 0 | 710.0 | NaN | NaN | 0 | EC47 | EC47 | NaN | NaN | LEGMAN59 | 70 | 34.0 | RADIO DIRECT FINDER | 2618 | 1 | NaN | D | 360TEW | NaN | NaN | SOUTH VIETNAM | NaN | NaN | NaN | NaN | UNIT: 360TEW - CALLSIGN: LEGMAN59 | NaN | 27131185 | NONKINETIC | -1 | -1 | NaN | NaN | NaN | 710.0 | 0 |
Bomb_Ops_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4670416 entries, 0 to 4670415
Data columns (total 47 columns):
#   Column                Dtype
--- ------                -----
0   thor_data_viet_id     int64
1   countryflyingmission  object
2   milservice            object
3   msndate               object
4   sourceid              int64
5   sourcerecord          object
6   valid_aircraft_root   object
7   takeofflocation       object
8   tgt_latitude          float64
9   tgt_longitude         float64
10  tgttype               object
11  numweaponsdelivered   int64
12  timeontarget          float64
13  weapontype            object
14  weapontypeclass       float64
15  weapontypeweight      int64
16  aircraft_original     object
17  aircraft_root         object
18  airforcegroup         object
19  airforcesqdn          object
20  callsign              object
21  flthours              int64
22  mfunc                 object
23  mfunc_desc            object
24  missionid             object
25  numofacft             int64
26  operationsupported    object
27  periodofday           object
28  unit                  object
29  tgtcloudcover         object
30  tgtcontrol            object
31  tgtcountry            object
32  tgtid                 object
33  tgtorigcoords         object
34  tgtorigcoordsformat   object
35  tgtweather            object
36  additionalinfo        object
37  geozone               object
38  id                    int64
39  mfunc_desc_class      object
40  numweaponsjettisoned  int64
41  numweaponsreturned    int64
42  releasealtitude       float64
43  releasefltspeed       float64
44  resultsbda            object
45  timeofftarget         float64
46  weaponsloadedweight   int64
dtypes: float64(7), int64(10), object(30)
memory usage: 1.6+ GB
Bomb_Ops_df.isnull().sum()
thor_data_viet_id       0
countryflyingmission    3615
milservice              3249
msndate                 0
sourceid                0
sourcerecord            0
valid_aircraft_root     0
takeofflocation         4971
tgt_latitude            1130131
tgt_longitude           1130131
tgttype                 1830425
numweaponsdelivered     0
timeontarget            26429
weapontype              2403497
weapontypeclass         4670416
weapontypeweight        0
aircraft_original       482
aircraft_root           482
airforcegroup           4667508
airforcesqdn            4667634
callsign                3300321
flthours                0
mfunc                   101111
mfunc_desc              104722
missionid               15670
numofacft               0
operationsupported      1920049
periodofday             199764
unit                    493
tgtcloudcover           2440248
tgtcontrol              2150583
tgtcountry              216774
tgtid                   4670381
tgtorigcoords           1068209
tgtorigcoordsformat     1092870
tgtweather              2592711
additionalinfo          0
geozone                 1168052
id                      0
mfunc_desc_class        0
numweaponsjettisoned    0
numweaponsreturned      0
releasealtitude         4667038
releasefltspeed         4668727
resultsbda              4385020
timeofftarget           26429
weaponsloadedweight     0
dtype: int64
# Calculate the percentage of null values in each column
(Bomb_Ops_df.isnull().mean() * 100).round(2)
thor_data_viet_id       0.00
countryflyingmission    0.08
milservice              0.07
msndate                 0.00
sourceid                0.00
sourcerecord            0.00
valid_aircraft_root     0.00
takeofflocation         0.11
tgt_latitude            24.20
tgt_longitude           24.20
tgttype                 39.19
numweaponsdelivered     0.00
timeontarget            0.57
weapontype              51.46
weapontypeclass         100.00
weapontypeweight        0.00
aircraft_original       0.01
aircraft_root           0.01
airforcegroup           99.94
airforcesqdn            99.94
callsign                70.66
flthours                0.00
mfunc                   2.16
mfunc_desc              2.24
missionid               0.34
numofacft               0.00
operationsupported      41.11
periodofday             4.28
unit                    0.01
tgtcloudcover           52.25
tgtcontrol              46.05
tgtcountry              4.64
tgtid                   100.00
tgtorigcoords           22.87
tgtorigcoordsformat     23.40
tgtweather              55.51
additionalinfo          0.00
geozone                 25.01
id                      0.00
mfunc_desc_class        0.00
numweaponsjettisoned    0.00
numweaponsreturned      0.00
releasealtitude         99.93
releasefltspeed         99.96
resultsbda              93.89
timeofftarget           0.57
weaponsloadedweight     0.00
dtype: float64
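Several columns above are almost entirely empty (for example `tgtid` and `weapontypeclass`). A minimal sketch of how such columns could be flagged programmatically follows. It uses a toy frame rather than `Bomb_Ops_df`, and the 99% cut-off is a hypothetical value chosen for illustration, not a threshold taken from this analysis.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for Bomb_Ops_df; column names are borrowed from the output above
toy_df = pd.DataFrame({
    "msndate": ["1970-01-01", "1970-01-02", "1970-01-03", "1970-01-04"],
    "tgt_latitude": [16.9, np.nan, 14.9, 19.6],
    "tgtid": [np.nan, np.nan, np.nan, np.nan],
})

NULL_THRESHOLD_PCT = 99.0  # hypothetical cut-off, not taken from the notebook

# Same percentage calculation as above, then filter on the threshold
null_pct = (toy_df.isnull().mean() * 100).round(2)
sparse_columns = null_pct[null_pct >= NULL_THRESHOLD_PCT].index.tolist()
print(sparse_columns)
```

The resulting list could feed a `drop(columns=...)` call, or simply document which fields are too sparse to support a visualization.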
# Drop glossary rows that repeat the same validated_root and aircraft_name pair
Air_Gloss_df = Air_Gloss_df.drop_duplicates(subset=['validated_root', 'aircraft_name'])
Air_Gloss_df.head(10)
| | gloss_id | validated_root | aircraft_name | website_link | aircraft_type | aircraft_shortname | aircraft_application | ac_mission_count |
---|---|---|---|---|---|---|---|---|
0 | 1 | A-1 | Douglas A-1 Skyraider | http://www.navalaviationmuseum.org/attractions... | Fighter Jet | Skyraider | FIGHTER | 373265 |
1 | 2 | A-26 | Douglas A-26 Invader | http://www.militaryfactory.com/aircraft/detail... | Light Bomber | Invader | BOMBER | 36672 |
2 | 4 | A-37 | Cessna A-37 Dragonfly | http://www.militaryfactory.com/aircraft/detail... | Light ground-attack aircraft | Dragonfly | ATTACK | 282699 |
3 | 5 | A-4 | McDonnell Douglas A-4 Skyhawk | http://www.fighter-planes.com/info/a4-skyhawk.htm | Fighter Jet | Skyhawk | FIGHTER | 390290 |
4 | 6 | A-5 | North American A-5 Vigilante | http://www.militaryfactory.com/aircraft/detail... | Bomber Jet | Vigilante | BOMBER | 10 |
5 | 7 | A-6 | Grumman A-6 Intruder | http://www.militaryfactory.com/aircraft/detail... | Attack Aircraft | Intruder | ATTACK | 148372 |
6 | 8 | A-7 | LTV A-7 Corsair II | http://www.militaryfactory.com/aircraft/detail... | Attack Aircraft | Corsair II | ATTACK | 171983 |
7 | 9 | AC-119 | Fairchild AC-119 Shadow or Stinger | https://en.wikipedia.org/wiki/Fairchild_AC-119 | Military Transport aircraft | Shadow or Stinger | TRANSPORT | 81757 |
8 | 10 | AC-123 | Fairchild C-123 Provider | http://www.warbirdalley.com/c123.htm | Military Transport aircraft | Provider | TRANSPORT | 3435 |
9 | 11 | AC-130 | Lockheed AC-130 Spectre | http://fas.org/man/dod-101/sys/ac/ac-130.htm | Fixed wing ground attack gunship | Spectre | ATTACK | 76620 |
Air_Gloss_df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 103 entries, 0 to 103
Data columns (total 8 columns):
#   Column                Non-Null Count  Dtype
--- ------                --------------  -----
0   gloss_id              103 non-null    int64
1   validated_root        103 non-null    object
2   aircraft_name         103 non-null    object
3   website_link          103 non-null    object
4   aircraft_type         103 non-null    object
5   aircraft_shortname    92 non-null     object
6   aircraft_application  102 non-null    object
7   ac_mission_count      103 non-null    int64
dtypes: int64(2), object(6)
memory usage: 7.2+ KB
Air_Gloss_df.isnull().sum()
gloss_id                0
validated_root          0
aircraft_name           0
website_link            0
aircraft_type           0
aircraft_shortname      11
aircraft_application    1
ac_mission_count        0
dtype: int64
# Calculate the percentage of null values in each column
(Air_Gloss_df.isnull().mean() * 100).round(2)
gloss_id                0.00
validated_root          0.00
aircraft_name           0.00
website_link            0.00
aircraft_type           0.00
aircraft_shortname      10.68
aircraft_application    0.97
ac_mission_count        0.00
dtype: float64
Wpn_Gloss_df.head(10)
| | weapon_id | weapontype | weapontype_common_name | weapon_class | weapontype_desc | weapon_count |
---|---|---|---|---|---|---|
0 | 1 | 100 GP | General Purpose Bomb | BOMB | 100 lb general purpose | 1 |
1 | 2 | 1000 G | Megaboller flash powder bomb | BOMB | 1000 g BKS | 2 |
2 | 3 | 1000LB GP M-65 | An-M65 | BOMB | 1000 lb general purpose | 12776 |
3 | 4 | 1000LB MK-83 | Mark 83 bomb | BOMB | 1000 lb none guidence general purpose bomb | 15522 |
4 | 5 | 1000LB SAP M59 | AN-M59 | BOMB | 1000 lb semi-armor piercing bomb | 454 |
5 | 6 | 100LB FR M-IA2 | NaN | BOMB | NaN | 11858 |
6 | 7 | 100LB GP M-30 | AN-M30 | BOMB | 100 lb general purpose | 4610 |
7 | 8 | 100LB M-28 | NaN | BOMB | NaN | 1639 |
8 | 9 | 100LB PWP M-47 | M47 | BOMB | 100 lb chemical bomb | 9970 |
9 | 10 | 105 HOWITZER AMMO | Howitzer ammo | GUN | 105mm Howitzer ammo | 51 |
Wpn_Gloss_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 294 entries, 0 to 293
Data columns (total 6 columns):
#   Column                  Non-Null Count  Dtype
--- ------                  --------------  -----
0   weapon_id               294 non-null    int64
1   weapontype              294 non-null    object
2   weapontype_common_name  175 non-null    object
3   weapon_class            294 non-null    object
4   weapontype_desc         176 non-null    object
5   weapon_count            294 non-null    int64
dtypes: int64(2), object(4)
memory usage: 13.9+ KB
Wpn_Gloss_df.isnull().sum()
weapon_id                 0
weapontype                0
weapontype_common_name    119
weapon_class              0
weapontype_desc           118
weapon_count              0
dtype: int64
# Calculate the percentage of null values in each column
(Wpn_Gloss_df.isnull().mean() * 100).round(2)
weapon_id                 0.00
weapontype                0.00
weapontype_common_name    40.48
weapon_class              0.00
weapontype_desc           40.14
weapon_count              0.00
dtype: float64
# Convert msndate to datetime format and store as a new field
Bomb_Ops_df['msndatetime'] = pd.to_datetime(Bomb_Ops_df['msndate'], errors='coerce')
# Rewrite msndate as a 'YYYY-MM-DD' string
Bomb_Ops_df['msndate'] = Bomb_Ops_df['msndatetime'].dt.strftime('%Y-%m-%d')
# Create additional time-related column - msnyear
Bomb_Ops_df['msnyear'] = Bomb_Ops_df['msndatetime'].dt.year
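# Create additional time-related column - msnyearmonth
# Note: adding MonthBegin(0) rolls each date forward to the next month start
# (e.g. 1971-06-05 -> 1971-07-01) unless the date already falls on the 1st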
Bomb_Ops_df['msnyearmonth'] = Bomb_Ops_df['msndatetime'] + pd.offsets.MonthBegin(0)
# Create additional time-related column - msnmonthname
Bomb_Ops_df['msnmonthname'] = Bomb_Ops_df['msndatetime'].dt.strftime('%b')
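Because `msnyearmonth` is built by adding `pd.offsets.MonthBegin(0)`, a mission flown on 1971-06-05 is labeled 1971-07-01 while its `msnmonthname` is `Jun`. If the goal is to bucket each mission into its own calendar month, one alternative (an assumption about intent, not the approach used in this notebook) is to floor the date through a monthly period. The sketch below compares both on toy dates.

```python
import pandas as pd

# Toy mission dates: one on the 1st of a month, two mid-month
dates = pd.Series(pd.to_datetime(["1971-06-01", "1971-06-05", "1972-12-26"]))

# MonthBegin(0) leaves the 1st unchanged and rolls other days to the next month start
rolled_forward = dates + pd.offsets.MonthBegin(0)

# Converting to a monthly period and back floors every date to the start of its own month
floored = dates.dt.to_period("M").dt.to_timestamp()

print(pd.DataFrame({"date": dates, "rolled_forward": rolled_forward, "floored": floored}))
```

With the floored variant, every mission in June 1971 shares the single label 1971-06-01, which keeps monthly aggregates aligned with the month-name column.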
# Harmonize the operationsupported to new field operation_grp
Bomb_Ops_df['operation_grp'] = Bomb_Ops_df['operationsupported'].str.split(" - |- ").str[0]
# Replace NaN with "UNNAMED"
Bomb_Ops_df['operation_grp'] = Bomb_Ops_df['operation_grp'].fillna("UNNAMED")
# Replace empty strings with "UNNAMED"
Bomb_Ops_df['operation_grp'] = Bomb_Ops_df['operation_grp'].replace("", "UNNAMED")
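To make the harmonization step concrete, here is a small sketch of the same split, fill, and replace sequence on a toy Series. The operation labels are hypothetical examples of the kind of suffixed names the split targets; the actual values in `operationsupported` may look different.

```python
import pandas as pd

# Hypothetical operation labels, including an empty string and a missing value
ops = pd.Series(["ROLLING THUNDER - 52", "STEEL TIGER- 3", "BARREL ROLL", "", None])

# Keep the text before the first " - " or "- " separator
# (a multi-character pattern is treated as a regular expression by str.split)
operation_grp = ops.str.split(" - |- ").str[0]

# Missing and empty names collapse into a single placeholder group
operation_grp = operation_grp.fillna("UNNAMED").replace("", "UNNAMED")
print(operation_grp.tolist())
```

Names without a separator pass through unchanged, so the step only merges variants that differ by their suffix.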
# Merge (Left Join) Bomb_Ops_df with Air_Gloss_df
# on valid_aircraft_root in Bomb_Ops_df and validated_root in Air_Gloss_df
temp_df = pd.merge(Bomb_Ops_df,
Air_Gloss_df,
left_on='valid_aircraft_root',
right_on='validated_root',
how='left')
# Merge (Left Join) the above output with Wpn_Gloss_df on weapontype in Wpn_Gloss_df
bombings_df = pd.merge(temp_df,
Wpn_Gloss_df,
left_on='weapontype',
right_on='weapontype',
how='left')
# Check the data dimensions - number of rows and columns
print(f'Final dataframe: bombings_df, consists of {bombings_df.shape[0]} rows, and {bombings_df.shape[1]} columns')
Final dataframe: bombings_df, consists of 4670416 rows, and 65 columns
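The row count is unchanged after both left joins, which indicates that no glossary key matched more than once. A minimal sketch of how that check, and a count of unmatched keys, could be made explicit with the `validate` and `indicator` options of `pd.merge` is shown below. The tables are toy data and the `XX-0` root is invented for illustration.

```python
import pandas as pd

# Toy fact and dimension tables; values are illustrative only
missions = pd.DataFrame({
    "valid_aircraft_root": ["A-1", "F-4", "F-4", "XX-0"],
    "numweaponsdelivered": [2, 6, 0, 1],
})
aircraft = pd.DataFrame({
    "validated_root": ["A-1", "F-4"],
    "aircraft_name": ["Douglas A-1 Skyraider", "McDonnell Douglas F-4 Phantom II"],
})

merged = pd.merge(
    missions,
    aircraft,
    left_on="valid_aircraft_root",
    right_on="validated_root",
    how="left",
    validate="many_to_one",  # raises MergeError if a glossary key is repeated
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
)

# Fact-table keys that found no glossary entry
unmatched_roots = (
    merged.loc[merged["_merge"] == "left_only", "valid_aircraft_root"].unique().tolist()
)
print(len(merged), unmatched_roots)
```

Applied to the real tables, the `left_only` rows would correspond to the null glossary fields reported in the null-count summary of the merged frame.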
bombings_df.head(10)
| | thor_data_viet_id | countryflyingmission | milservice | msndate | sourceid | sourcerecord | valid_aircraft_root | takeofflocation | tgt_latitude | tgt_longitude | tgttype | numweaponsdelivered | timeontarget | weapontype | weapontypeclass | weapontypeweight | aircraft_original | aircraft_root | airforcegroup | airforcesqdn | callsign | flthours | mfunc | mfunc_desc | missionid | numofacft | operationsupported | periodofday | unit | tgtcloudcover | tgtcontrol | tgtcountry | tgtid | tgtorigcoords | tgtorigcoordsformat | tgtweather | additionalinfo | geozone | id | mfunc_desc_class | numweaponsjettisoned | numweaponsreturned | releasealtitude | releasefltspeed | resultsbda | timeofftarget | weaponsloadedweight | msndatetime | msnyear | msnyearmonth | msnmonthname | operation_grp | gloss_id | validated_root | aircraft_name | website_link | aircraft_type | aircraft_shortname | aircraft_application | ac_mission_count | weapon_id | weapontype_common_name | weapon_class | weapontype_desc | weapon_count |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 351 | UNITED STATES OF AMERICA | USAF | 1971-06-05 | 647464 | SEADAB | EC-47 | TAN SON NHUT | NaN | NaN | NaN | 0 | 1005.0 | NaN | NaN | 0 | EC47 | EC47 | NaN | NaN | STEEL 5 | 70 | 34.0 | RADIO DIRECT FINDER | 2624 | 1 | NaN | D | 360TEW | NaN | NaN | CAMBODIA | NaN | NaN | NaN | NaN | UNIT: 360TEW - CALLSIGN: STEEL 5 | NaN | 27135863 | NONKINETIC | -1 | -1 | NaN | NaN | NaN | 1005.0 | 0 | 1971-06-05 | 1971 | 1971-07-01 | Jun | UNNAMED | 43.0 | EC-47 | Douglas C-47 Skytrain | https://en.wikipedia.org/wiki/Douglas_C-47_Sky... | Military Transport aircraft | Skytrain | TRANSPORT | 59034.0 | NaN | NaN | NaN | NaN | NaN |
1 | 2 | UNITED STATES OF AMERICA | USAF | 1972-12-26 | 642778 | SEADAB | EC-47 | NAKHON PHANOM | NaN | NaN | NaN | 0 | 530.0 | NaN | NaN | 0 | EC47 | EC47 | NaN | NaN | BARON 6 | 0 | 74.0 | EXTRACTION (GPES) | 2909 | 1 | NaN | D | 361TEW | NaN | NaN | SOUTH VIETNAM | NaN | NaN | NaN | NaN | UNIT: 361TEW - CALLSIGN: BARON 6 | NaN | 27131177 | NONKINETIC | -1 | -1 | NaN | NaN | NaN | 530.0 | 0 | 1972-12-26 | 1972 | 1973-01-01 | Dec | UNNAMED | 43.0 | EC-47 | Douglas C-47 Skytrain | https://en.wikipedia.org/wiki/Douglas_C-47_Sky... | Military Transport aircraft | Skytrain | TRANSPORT | 59034.0 | NaN | NaN | NaN | NaN | NaN |
2 | 3 | UNITED STATES OF AMERICA | USAF | 1973-07-28 | 642779 | SEADAB | RF-4 | UDORN AB | NaN | NaN | NaN | 0 | 730.0 | NaN | NaN | 0 | RF4 | RF4 | NaN | NaN | ATLANTA | 30 | 18.0 | VISUAL RECCE | 3059 | 1 | NaN | D | 432TRW | NaN | NaN | LAOS | NaN | NaN | NaN | NaN | UNIT: 432TRW - CALLSIGN: ATLANTA | NaN | 27131178 | NONKINETIC | -1 | -1 | NaN | NaN | NaN | 730.0 | 0 | 1973-07-28 | 1973 | 1973-08-01 | Jul | UNNAMED | 85.0 | RF-4 | McDonnell F-4 Phantom II | https://en.wikipedia.org/wiki/McDonnell_Dougla... | Fighter bomber jet | Phantom II | FIGHTER, BOMBER | 243259.0 | NaN | NaN | NaN | NaN | NaN |
3 | 4 | UNITED STATES OF AMERICA | USAF | 1970-02-02 | 642780 | SEADAB | A-1 | NAKHON PHANOM | 16.902500 | 106.014166 | TRUCKS | 2 | 1415.0 | BLU27 FIRE BOMB (750) | NaN | 750 | A1 | A1 | NaN | NaN | FF32 | 68 | 1.0 | STRIKE | 1047 | 2 | NaN | N | 56SOW | NaN | NaN | LAOS | NaN | 165409N1060051E | DDMMSSN DDDMMSSE | NaN | UNIT: 56SOW - CALLSIGN: FF32 | XE | 27131179 | KINETIC | -1 | -1 | NaN | NaN | SECONDARY FIRE | 1415.0 | 17400 | 1970-02-02 | 1970 | 1970-03-01 | Feb | UNNAMED | 1.0 | A-1 | Douglas A-1 Skyraider | http://www.navalaviationmuseum.org/attractions... | Fighter Jet | Skyraider | FIGHTER | 373265.0 | 76.0 | BLU-27/B | BOMB | "(750 lb) class fire bombs was very similar to... | 8633.0 |
4 | 5 | VIETNAM (SOUTH) | VNAF | 1970-10-08 | 642781 | SEADAB | A-37 | DANANG | 14.945555 | 108.257222 | BASE CAMP AREA | 0 | 1240.0 | NaN | NaN | 0 | A37 | A37 | NaN | NaN | TIGER 41 | 28 | 5.0 | CLOSE AIR SUPPORT | B542 | 2 | NaN | D | 516FS | NaN | NaN | SOUTH VIETNAM | NaN | 145644N1081526E | DDMMSSN DDDMMSSE | NaN | UNIT: 516FS - CALLSIGN: TIGER 41 | ZB | 27131180 | KINETIC | -1 | -1 | NaN | NaN | RNO WEATHER | 1240.0 | 0 | 1970-10-08 | 1970 | 1970-11-01 | Oct | UNNAMED | 4.0 | A-37 | Cessna A-37 Dragonfly | http://www.militaryfactory.com/aircraft/detail... | Light ground-attack aircraft | Dragonfly | ATTACK | 282699.0 | NaN | NaN | NaN | NaN | NaN |
5 | 6 | UNITED STATES OF AMERICA | USAF | 1970-11-25 | 642782 | SEADAB | F-4 | UBON AB | 19.602222 | 103.597222 | AAA\37MM CR MORE | 6 | 650.0 | MK 82 GP BOMB (500) LD | NaN | 500 | F4 | F4 | NaN | NaN | JASPER | 57 | 1.0 | STRIKE | 1407 | 2 | NaN | D | 8TFW | NaN | NaN | LAOS | NaN | 193608N1033550E | DDMMSSN DDDMMSSE | NaN | UNIT: 8TFW - CALLSIGN: JASPER | UG | 27131181 | KINETIC | -1 | -1 | NaN | NaN | DAMAGED | 650.0 | 31860 | 1970-11-25 | 1970 | 1970-12-01 | Nov | UNNAMED | 54.0 | F-4 | McDonnell Douglas F-4 Phantom II | https://en.wikipedia.org/wiki/McDonnell_Dougla... | Fighter Jet Bomber | Phantom II | FIGHTER, BOMBER | 957427.0 | 205.0 | MK 82 | BOMB | "free-fall, nonguided general purpose (GP) 500... | 62921.0 |
6 | 7 | UNITED STATES OF AMERICA | USN | 1972-03-08 | 642783 | SEADAB | A-4 | TONKIN GULF | 14.573611 | 106.689722 | TRUCKS | 0 | 1005.0 | NaN | NaN | 0 | A4 | A4 | NaN | NaN | CD H | 16 | 1.0 | STRIKE | 9064 | 2 | NaN | D | 775CTG | NaN | NaN | LAOS | NaN | 143425N1064123E | DDMMSSN DDDMMSSE | NaN | UNIT: 775CTG - CALLSIGN: CD H | XB | 27131182 | KINETIC | -1 | -1 | NaN | NaN | RNO NONVISUAL | 1005.0 | 0 | 1972-03-08 | 1972 | 1972-04-01 | Mar | UNNAMED | 5.0 | A-4 | McDonnell Douglas A-4 Skyhawk | http://www.fighter-planes.com/info/a4-skyhawk.htm | Fighter Jet | Skyhawk | FIGHTER | 390290.0 | NaN | NaN | NaN | NaN | NaN |
7 | 8 | UNITED STATES OF AMERICA | USAF | 1971-12-27 | 642784 | SEADAB | F-4 | UDORN AB | NaN | NaN | NaN | 0 | 0.0 | NaN | NaN | 0 | F4 | F4 | NaN | NaN | FALCON80 | 0 | NaN | NaN | 7661 | 2 | NaN | NaN | 432TRW | NaN | NaN | LAOS | NaN | NaN | NaN | NaN | UNIT: 432TRW - CALLSIGN: FALCON80 | NaN | 27131183 | NONKINETIC | -1 | -1 | NaN | NaN | NaN | 0.0 | 0 | 1971-12-27 | 1971 | 1972-01-01 | Dec | UNNAMED | 54.0 | F-4 | McDonnell Douglas F-4 Phantom II | https://en.wikipedia.org/wiki/McDonnell_Dougla... | Fighter Jet Bomber | Phantom II | FIGHTER, BOMBER | 957427.0 | NaN | NaN | NaN | NaN | NaN |
8 | 9 | UNITED STATES OF AMERICA | USN | 1972-05-24 | 642785 | SEADAB | A-7 | TONKIN GULF | NaN | NaN | NaN | 0 | 0.0 | NaN | NaN | 0 | A7 | A7 | NaN | NaN | CD CS | 0 | NaN | NaN | 9205 | 4 | NaN | NaN | 776CTG | NaN | NaN | NORTH VIETNAM | NaN | NaN | NaN | NaN | UNIT: 776CTG - CALLSIGN: CD CS | NaN | 27131184 | NONKINETIC | -1 | -1 | NaN | NaN | NaN | 0.0 | 0 | 1972-05-24 | 1972 | 1972-06-01 | May | UNNAMED | 8.0 | A-7 | LTV A-7 Corsair II | http://www.militaryfactory.com/aircraft/detail... | Attack Aircraft | Corsair II | ATTACK | 171983.0 | NaN | NaN | NaN | NaN | NaN |
9 | 10 | UNITED STATES OF AMERICA | USAF | 1972-09-12 | 642786 | SEADAB | EC-47 | TAN SON NHUT | NaN | NaN | NaN | 0 | 710.0 | NaN | NaN | 0 | EC47 | EC47 | NaN | NaN | LEGMAN59 | 70 | 34.0 | RADIO DIRECT FINDER | 2618 | 1 | NaN | D | 360TEW | NaN | NaN | SOUTH VIETNAM | NaN | NaN | NaN | NaN | UNIT: 360TEW - CALLSIGN: LEGMAN59 | NaN | 27131185 | NONKINETIC | -1 | -1 | NaN | NaN | NaN | 710.0 | 0 | 1972-09-12 | 1972 | 1972-10-01 | Sep | UNNAMED | 43.0 | EC-47 | Douglas C-47 Skytrain | https://en.wikipedia.org/wiki/Douglas_C-47_Sky... | Military Transport aircraft | Skytrain | TRANSPORT | 59034.0 | NaN | NaN | NaN | NaN | NaN |
bombings_df.shape
(4670416, 65)
bombings_df.isnull().sum()
thor_data_viet_id         0
countryflyingmission      3615
milservice                3249
msndate                   0
sourceid                  0
sourcerecord              0
valid_aircraft_root       0
takeofflocation           4971
tgt_latitude              1130131
tgt_longitude             1130131
tgttype                   1830425
numweaponsdelivered       0
timeontarget              26429
weapontype                2403497
weapontypeclass           4670416
weapontypeweight          0
aircraft_original         482
aircraft_root             482
airforcegroup             4667508
airforcesqdn              4667634
callsign                  3300321
flthours                  0
mfunc                     101111
mfunc_desc                104722
missionid                 15670
numofacft                 0
operationsupported        1920049
periodofday               199764
unit                      493
tgtcloudcover             2440248
tgtcontrol                2150583
tgtcountry                216774
tgtid                     4670381
tgtorigcoords             1068209
tgtorigcoordsformat       1092870
tgtweather                2592711
additionalinfo            0
geozone                   1168052
id                        0
mfunc_desc_class          0
numweaponsjettisoned      0
numweaponsreturned        0
releasealtitude           4667038
releasefltspeed           4668727
resultsbda                4385020
timeofftarget             26429
weaponsloadedweight       0
msndatetime               0
msnyear                   0
msnyearmonth              0
msnmonthname              0
operation_grp             0
gloss_id                  52734
validated_root            52734
aircraft_name             52734
website_link              52734
aircraft_type             52734
aircraft_shortname        105435
aircraft_application      52757
ac_mission_count          52734
weapon_id                 2403700
weapontype_common_name    3206777
weapon_class              2403700
weapontype_desc           3146824
weapon_count              2403700
dtype: int64
# Calculate the percentage of null values in each column
(bombings_df.isnull().mean() * 100).round(2)
thor_data_viet_id         0.00
countryflyingmission      0.08
milservice                0.07
msndate                   0.00
sourceid                  0.00
sourcerecord              0.00
valid_aircraft_root       0.00
takeofflocation           0.11
tgt_latitude              24.20
tgt_longitude             24.20
tgttype                   39.19
numweaponsdelivered       0.00
timeontarget              0.57
weapontype                51.46
weapontypeclass           100.00
weapontypeweight          0.00
aircraft_original         0.01
aircraft_root             0.01
airforcegroup             99.94
airforcesqdn              99.94
callsign                  70.66
flthours                  0.00
mfunc                     2.16
mfunc_desc                2.24
missionid                 0.34
numofacft                 0.00
operationsupported        41.11
periodofday               4.28
unit                      0.01
tgtcloudcover             52.25
tgtcontrol                46.05
tgtcountry                4.64
tgtid                     100.00
tgtorigcoords             22.87
tgtorigcoordsformat       23.40
tgtweather                55.51
additionalinfo            0.00
geozone                   25.01
id                        0.00
mfunc_desc_class          0.00
numweaponsjettisoned      0.00
numweaponsreturned        0.00
releasealtitude           99.93
releasefltspeed           99.96
resultsbda                93.89
timeofftarget             0.57
weaponsloadedweight       0.00
msndatetime               0.00
msnyear                   0.00
msnyearmonth              0.00
msnmonthname              0.00
operation_grp             0.00
gloss_id                  1.13
validated_root            1.13
aircraft_name             1.13
website_link              1.13
aircraft_type             1.13
aircraft_shortname        2.26
aircraft_application      1.13
ac_mission_count          1.13
weapon_id                 51.47
weapontype_common_name    68.66
weapon_class              51.47
weapontype_desc           67.38
weapon_count              51.47
dtype: float64
Delving into the complex trajectory of the Vietnam War, this section employs visualizations to unravel critical facets of the conflict.
Explore the spatial dynamics of bombing activities across Vietnam. An animated scatter map, drawn over Vietnam and its neighboring regions, shows how strike locations shifted over the war period and offers insight into the strategic choices made.
With the help of animation in this chart, we are covering the depth question: 6.1.1 Geographical Distribution: How does the geographical distribution change over the war period?
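# Create the subset of columns for the map visual
subset_columns = ['msndate', 'tgt_latitude', 'tgt_longitude']
# .copy() gives an independent frame, so adding 'size' below does not raise SettingWithCopyWarning
map_data = bombings_df[subset_columns].copy()
map_data = map_data.sort_values(["msndate"])
map_data['size'] = 1
# Keep only the earliest 50,000 records (by mission date) for the animation
map_data = map_data.iloc[:50000]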
# Map plot using Plotly
fig = px.scatter_mapbox(map_data,
                        lat="tgt_latitude",
                        lon="tgt_longitude",
                        animation_frame="msndate",
                        animation_group="tgt_longitude",
                        color_continuous_scale=px.colors.sequential.Hot,
                        color_discrete_sequence=["red"],
                        zoom = 4,
                        size = 'size',
                        size_max = 3,
                        labels={'msndate':'Mission Date'})
fig.update_layout(mapbox_style="carto-positron")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.update_layout(autosize=False, width=800, height=600)
print(color.BOLD + 'Geographical Distribution of Strikes:' + color.END)
print('Use Play button at bottom to visualize the strikes. Reload the page if map does not respond.')
fig.show()
Geographical Distribution of Strikes:
Use Play button at bottom to visualize the strikes. Reload the page if map does not respond.
As we begin our journey, we dive into the geography of Vietnam. The animated scatter map paints a vivid picture of the intense bombing campaigns, revealing how different regions bore the brunt of the conflict. Watch as the red dots unfold, capturing the ebb and flow of strategic decisions across the Vietnamese landscape.
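About a quarter of the records have no target coordinates (24.20% in the null summary), and one animation frame per mission date produces a long slider. The sketch below shows one way the map input could be prepared, assuming that `msndate` parses as a date and that monthly frames are acceptable. The helper `prepare_map_frames` and the `frame` column are hypothetical names, not part of the original notebook.

```python
import pandas as pd

def prepare_map_frames(df, max_rows=50_000):
    """Drop rows without coordinates and label each strike with a YYYY-MM frame."""
    out = df.dropna(subset=["tgt_latitude", "tgt_longitude"]).copy()
    out["msndate"] = pd.to_datetime(out["msndate"])
    # Same idea as the notebook: keep the earliest records by mission date
    out = out.sort_values("msndate").head(max_rows)
    out["frame"] = out["msndate"].dt.strftime("%Y-%m")
    return out

# Toy rows standing in for bombings_df (coordinates are illustrative only)
toy = pd.DataFrame({
    "msndate": ["1966-02-10", "1965-10-01", "1965-10-15", "1966-02-11"],
    "tgt_latitude": [16.9, 17.2, None, 16.5],
    "tgt_longitude": [106.9, 106.1, 105.0, 107.0],
})
frames = prepare_map_frames(toy)
```

The `frame` column could then be passed as `animation_frame` in place of `msndate`, which trades day-level detail for fewer frames.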
Geographical Distribution of Strikes
Uncover the bombing intensity patterns through a chronological lens. Identify spikes, lulls, and correlations with historical events, offering a temporal perspective on the war's progression.
strikes_per_year = Bomb_Ops_df.groupby(['msnyear']).agg({'thor_data_viet_id':'count',
'id':'count','sourceid':'count'}).reset_index()
strikes_per_year
| | msnyear | thor_data_viet_id | id | sourceid |
|---|---|---|---|---|
| 0 | 1965 | 70475 | 70475 | 70475 |
| 1 | 1966 | 412202 | 412202 | 412202 |
| 2 | 1967 | 593838 | 593838 | 593838 |
| 3 | 1968 | 778803 | 778803 | 778803 |
| 4 | 1969 | 539349 | 539349 | 539349 |
| 5 | 1970 | 806141 | 806141 | 806141 |
| 6 | 1971 | 485192 | 485192 | 485192 |
| 7 | 1972 | 603243 | 603243 | 603243 |
| 8 | 1973 | 248143 | 248143 | 248143 |
| 9 | 1974 | 104702 | 104702 | 104702 |
| 10 | 1975 | 28328 | 28328 | 28328 |
# Create a bar graph using Plotly Express
fig = px.bar(strikes_per_year,
x='msnyear',
y='thor_data_viet_id',
title='Bombing Strikes each Year',
labels={'Value': 'Count'})
fig.update_xaxes(tickmode='linear')
fig.update_layout(xaxis_title='Year', yaxis_title='Number of Strikes')
# Annotate each bar with its count
for i, row in strikes_per_year.iterrows():
    fig.add_annotation(
        x=row['msnyear'],
        y=row['thor_data_viet_id'],
        text=str(row['thor_data_viet_id']),
        showarrow=True,
        arrowhead=2,
        arrowsize=1,
        arrowwidth=2,
        arrowcolor="#636363",
        ax=0,
        ay=-40,
    )
# Show the plot
fig.show()
Moving through time, our visualizations provide a lens into the temporal patterns of bombing activities. The bar graph presents a bird's-eye view of the overall trend: recorded strikes climb from 70,475 in 1965 to a peak of 806,141 in 1970, with a dip in 1969, and fall to 28,328 by 1975. The line charts that follow delve deeper, unraveling the stories behind each spike and lull and drawing connections to key historical events that shaped the trajectory of the war.
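Spikes and lulls are easier to discuss when the year-over-year change is computed rather than read off the bars. The sketch below reuses the yearly counts from the table above; the variable names are chosen for illustration only.

```python
import pandas as pd

# Yearly strike counts copied from the strikes_per_year table
strikes = pd.Series(
    [70475, 412202, 593838, 778803, 539349, 806141,
     485192, 603243, 248143, 104702, 28328],
    index=range(1965, 1976),
    name="strikes",
)

# Percentage change relative to the previous year (the first year has no predecessor)
yoy_pct = (strikes.pct_change() * 100).round(1)

# Year with the largest number of recorded strikes
peak_year = int(strikes.idxmax())
```

Negative entries in `yoy_pct` mark lulls such as 1969 and 1971, and positive ones mark escalations; linking them to specific historical events still requires outside sources.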
Bar Graph for Yearly Bombing Strikes
Uncover patterns that highlight distinct periods of heightened or reduced activity, focusing on the specific countries involved. This exploration aims to provide a detailed understanding of how bombing intensity fluctuated over time.
# Group by year, date, and flying country, and count the unique record IDs (thor_data_viet_id) in each group
missions_by_date_country = bombings_df.groupby(['msnyear',
                                                'msndate',
                                                'countryflyingmission'])['thor_data_viet_id'].nunique().reset_index()
missions_by_date_country.columns = ['year', 'date', 'country', 'Number of Missions']
# Create a custom color mapping
color_scale = px.colors.qualitative.Set1 # You can choose a different color scale
country_colors = dict(zip(missions_by_date_country['country'].unique(), color_scale))
# Create a line chart using Plotly with multiple lines for each country
fig = px.line(missions_by_date_country, x='date', y='Number of Missions', color='country',
title='Number of Missions over Time by Country',
labels={'date': 'Date', 'Number of Missions': 'Number of Missions'},
color_discrete_map=country_colors)
# Update legend position
fig.update_layout(legend=dict(
yanchor="top",
y=1,
xanchor="right",
x=1
))
# Show the chart
fig.show()
Zooming in on specific countries, our visualizations reveal the heartbeat of the war. The interactive line charts allow us to dissect notable spikes and lulls, connecting each rhythm to the unique strategies employed by different nations during critical junctures of the Vietnam War.
Line Chart for Number of Missions Over Time (By Country)
Explore the collaborative dynamics of allied nations in the context of bombing activities throughout the Vietnam War. This analysis involves isolating the contributions of allied countries, excluding the USA, to reveal their individual patterns of involvement over the war period.
# Filter out USA
missions_by_date_country_allies = missions_by_date_country[missions_by_date_country['country'] != 'UNITED STATES OF AMERICA']
# Create a line chart using Plotly with multiple lines for each country
fig_allies = px.line(missions_by_date_country_allies, x='date', y='Number of Missions', color='country',
title='Number of Missions over Time by US Allies',
labels={'date': 'Date', 'Number of Missions': 'Number of Missions'},
color_discrete_map=country_colors) # Use the same color mapping
# Update legend position
fig_allies.update_layout(legend=dict(
yanchor="top",
y=1,
xanchor="right",
x=1
))
# Show the chart for US Allies
fig_allies.show()
Allies played a crucial role in shaping the course of the conflict. By isolating their contributions, our line chart offers a nuanced view of how different nations, working in tandem, influenced the ebb and flow of bombing campaigns. Witness the collaboration and individual strategies of these allied forces.
Line Chart for Allied Countries (Excluding USA)
Examine the intricate interplay between temporal patterns in bombing activities and major historical events during the Vietnam War. Through an interactive line chart encompassing all involved countries, this exploration allows users to discern correlations between spikes or lulls in bombing intensity and significant historical occurrences.
The addition of interactive elements, such as a dropdown for selecting specific years and the ability to click on country legends for individual focus, enhances the depth of analysis, enabling a more personalized exploration of the data's temporal narrative.
# Create one trace per country
traces = []
for country in missions_by_date_country['country'].unique():
    # Rows that belong to this country
    country_rows = missions_by_date_country[missions_by_date_country['country'] == country]
    trace = go.Scatter(
        x=country_rows['date'],
        y=country_rows['Number of Missions'],
        mode='lines',
        name=country,
        line=dict(color=country_colors[country])
    )
    traces.append(trace)
# Create the layout
layout = go.Layout(
title='Number of Missions over Time by Country',
xaxis=dict(title='Date'),
yaxis=dict(title='Number of Missions'),
legend=dict(title=dict(text='Country')), # Add a title to the legend
annotations=[
dict(
text='Select Year:', # Name for the dropdown
x=0.89, # Adjust the position of the dropdown name
xref='paper', # Set the x coordinate to be a fraction of the entire plot
y=1.08, # Adjust the position of the dropdown name
yref='paper', # Set the y coordinate to be a fraction of the entire plot
showarrow=False,
)
],
updatemenus=[
dict(
type='dropdown',
showactive=False,
buttons=[
dict(label='All',
method='relayout',
args=['xaxis.range', [missions_by_date_country['date'].min(), missions_by_date_country['date'].max()]]),
*[
dict(label=str(year),
method='relayout',
args=[{'xaxis.range': [missions_by_date_country[missions_by_date_country['year'] == year]['date'].min(),
missions_by_date_country[missions_by_date_country['year'] == year]['date'].max()]}])
for year in range(1965, 1976)
]
],
direction="down",
x=0.89, # Adjust the position of the dropdown menu
xanchor='left', # Set the anchor point for the x position
y=1.1, # Adjust the position of the dropdown menu
yanchor='top' # Set the anchor point for the y position
),
]
)
# Create the figure
fig = go.Figure(data=traces, layout=layout)
print('Interactive Line Chart: Use "Select Year" dropdown on right top corner.')
print('Interactive Line Chart: Click on Country legend to filter that country in plot.')
# Show the chart
fig.show()
Interactive Line Chart: Use "Select Year" dropdown on right top corner.
Interactive Line Chart: Click on Country legend to filter that country in plot.
The interweaving of temporal patterns with historical events is a captivating aspect of our exploration. The interactive line chart provides a dynamic canvas where users can select specific years and countries, unveiling the complex dance between historical shifts and the strategic choices reflected in the intensity of bombing campaigns.
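The year dropdown works by sending a `relayout` update that resets `xaxis.range` to the first and last date of the chosen year. A minimal sketch of that button list as a small helper follows. The function name `year_range_buttons` and the date bounds are hypothetical, and the helper only builds plain dictionaries, so it runs without Plotly installed.

```python
def year_range_buttons(year_bounds):
    """Build updatemenus button dicts that zoom the x-axis to one year.

    year_bounds maps year -> (first_date, last_date), with ISO date strings.
    """
    all_start = min(start for start, _ in year_bounds.values())
    all_end = max(end for _, end in year_bounds.values())
    # "All" restores the full date range
    buttons = [dict(label="All", method="relayout",
                    args=[{"xaxis.range": [all_start, all_end]}])]
    for year in sorted(year_bounds):
        start, end = year_bounds[year]
        buttons.append(dict(label=str(year), method="relayout",
                            args=[{"xaxis.range": [start, end]}]))
    return buttons

# Illustrative bounds only; in the notebook they would come from missions_by_date_country
bounds = {1965: ("1965-10-01", "1965-12-31"), 1966: ("1966-01-01", "1966-12-31")}
buttons = year_range_buttons(bounds)
```

The resulting list could be passed as the `buttons` entry of the `updatemenus` dictionary in the layout, which keeps the layout definition shorter.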
Interactive Line Chart for Number of Missions Over Time (By Country)
Visualize the sheer magnitude of bombs dropped, discerning periods of escalation or de-escalation. Connect tonnage data with overarching military strategies, illuminating the strategic shifts in the conflict.
# .copy() gives an independent frame, so adding a column does not raise SettingWithCopyWarning
total_tonnage_by_target = bombings_df[['weapontypeweight','numweaponsdelivered','tgtcountry']].copy()
total_tonnage_by_target['Total Tonnage'] = total_tonnage_by_target['weapontypeweight'] * total_tonnage_by_target['numweaponsdelivered']
tonnage_bombings_at_target = total_tonnage_by_target.groupby(['tgtcountry']).agg({'Total Tonnage':'sum'}).reset_index()
tonnage_bombings_at_target
| | tgtcountry | Total Tonnage |
|---|---|---|
| 0 | CAMBODIA | 1406510926 |
| 1 | LAOS | 5383460570 |
| 2 | NORTH VIETNAM | 1479669079 |
| 3 | PHILLIPINES | 0 |
| 4 | SOUTH VIETNAM | 7996763648 |
| 5 | THAILAND | 2195963 |
| 6 | UNKNOWN | 79343 |
| 7 | WESTPAC WATERS | 22651 |
# Create a bubble chart using Plotly Express
fig = px.scatter(tonnage_bombings_at_target,
                 x='tgtcountry',
                 y='Total Tonnage',
                 size=tonnage_bombings_at_target['Total Tonnage'],
                 title='Bombing Tonnage (Bubble Chart)',
                 labels={'Total Tonnage': 'Tonnage'},
                 size_max=145,  # Adjust the size of bubbles as needed
                 color_discrete_sequence=['red'])
fig.update_xaxes(tickmode='linear')
# The x-axis holds target countries, not years
fig.update_layout(xaxis_title='Target Country', yaxis_title='Tonnage')
# Annotate each bubble with its tonnage
for i, row in tonnage_bombings_at_target.iterrows():
    fig.add_annotation(
        x=row['tgtcountry'],
        y=row['Total Tonnage'],
        text=f"{row['Total Tonnage']/1000000000:.2f} B",  # Value in billions, two decimals
        showarrow=False
    )
# Show the plot
fig.show()
The weight of the war is brought to light as we delve into the tonnage of bombs dropped. The bubble chart conveys the scale of military operations: each bubble represents the total tonnage delivered on one target country, with South Vietnam (about 8.00 B) and Laos (about 5.38 B) far ahead of North Vietnam (about 1.48 B) and Cambodia (about 1.41 B).
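The annotations divide every total by one billion, so small totals such as Thailand's 2,195,963 are displayed as "0.00 B". The sketch below shows an adaptive label format, together with a unit conversion that rests on an assumption: that `weapontypeweight` is recorded in pounds, which should be checked against the THOR data dictionary before use. Both helper names are hypothetical.

```python
LB_PER_SHORT_TON = 2000  # US short ton; only relevant if the weights are in pounds (assumption)

def pounds_to_short_tons(pounds):
    """Convert pounds to US short tons."""
    return pounds / LB_PER_SHORT_TON

def human_readable(value):
    """Format a number with a K, M, or B suffix and two decimals."""
    for threshold, suffix in ((1e9, "B"), (1e6, "M"), (1e3, "K")):
        if abs(value) >= threshold:
            return f"{value / threshold:.2f} {suffix}"
    return f"{value:.0f}"

south_vietnam_total = 7996763648  # copied from the table above
thailand_total = 2195963          # copied from the table above

label_raw = human_readable(south_vietnam_total)
label_tons = human_readable(pounds_to_short_tons(south_vietnam_total))
label_small = human_readable(thailand_total)
```

With this format the smaller bubbles keep a meaningful label instead of collapsing to zero.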
Bubble Chart for Tonnage at Each Target Country
Dive into the temporal dynamics of tonnage, examining how the overall tonnage of bombs dropped evolved throughout the Vietnam War. Uncover patterns, identifying periods of notable escalation or de-escalation, and consider the correlation with key historical events or shifts in military strategies.
# .copy() gives an independent frame, so adding a column does not raise SettingWithCopyWarning
total_tonnage_by_year = bombings_df[['weapontypeweight','numweaponsdelivered','msnyear']].copy()
total_tonnage_by_year['total_tonnage'] = total_tonnage_by_year['weapontypeweight'] * total_tonnage_by_year['numweaponsdelivered']
tonnage_bombings = total_tonnage_by_year.groupby(['msnyear']).agg({'total_tonnage':'sum'}).reset_index()
tonnage_bombings
| | msnyear | total_tonnage |
|---|---|---|
| 0 | 1965 | 194602627 |
| 1 | 1966 | 791268177 |
| 2 | 1967 | 1416718000 |
| 3 | 1968 | 2393304314 |
| 4 | 1969 | 2194342707 |
| 5 | 1970 | 2924754828 |
| 6 | 1971 | 2240960399 |
| 7 | 1972 | 3012978078 |
| 8 | 1973 | 1176865061 |
| 9 | 1974 | 63390694 |
| 10 | 1975 | 25206037 |
# Create a bar graph using Plotly Express
fig = px.bar(tonnage_bombings,
x='msnyear',
y='total_tonnage',
title='Bombing Tonnage each Year',
labels={'Value': 'Count'})
fig.update_xaxes(tickmode='linear')
fig.update_layout(xaxis_title='Year', yaxis_title='Tonnage')
# Annotate each bar with its tonnage
for i, row in tonnage_bombings.iterrows():
    fig.add_annotation(
        x=row['msnyear'],
        y=row['total_tonnage'],
        text=f"{row['total_tonnage']}",
        showarrow=True,
        arrowhead=2,
        arrowsize=1,
        arrowwidth=2,
        arrowcolor="#636363",
        ax=0,
        ay=-40,
    )
# Show the plot
fig.show()
Our journey through tonnage continues with a detailed examination of its temporal evolution. The bar graph serves as a timeline, showcasing how the weight of bombings shifted across the war years. Explore key moments of escalation and de-escalation, connecting these fluctuations with historical events that shaped the strategic landscape.
Bar Graph for Yearly Tonnage of Bombs Dropped
Shift the focus to the geographical impact by exploring tonnage on a country level. Visualize the distribution of tonnage across target countries, revealing variations and intensity changes. Analyze how tonnage correlates with the strategic importance of each target country throughout the war.
The play and pause functionality allows for controlled observation, unraveling the shifting dynamics of tonnage over the course of the war.
# .copy() gives an independent frame, so adding a column does not raise SettingWithCopyWarning
total_tonnage_on_country_year = bombings_df[['tgtcountry',
                                             'weapontypeweight',
                                             'numweaponsdelivered',
                                             'msnyear']].copy()
total_tonnage_on_country_year['total_tonnage'] = total_tonnage_on_country_year['weapontypeweight'] * total_tonnage_on_country_year['numweaponsdelivered']
total_tonnage_on_country_year = total_tonnage_on_country_year.groupby(['msnyear', 'tgtcountry']).agg({'total_tonnage':'sum'}).reset_index()
# Create a pivot table to fill in missing values with 0
pivot_table = total_tonnage_on_country_year.pivot_table(index='msnyear', columns='tgtcountry', values='total_tonnage', fill_value=0).reset_index()
# If needed, flatten the DataFrame
pivot_table = pivot_table.melt(id_vars='msnyear', var_name='tgtcountry', value_name='total_tonnage')
# Sort the pivot table by 'msnyear' and 'total_tonnage'
pivot_table = pivot_table.sort_values(by=['msnyear', 'total_tonnage'], ascending=[True, False])
my_raceplot = barplot(pivot_table,
                      item_column='tgtcountry',
                      value_column='total_tonnage',
                      time_column='msnyear',
                      top_entries = 10)
print('Animated Chart: Use Play button at bottom to visualize with animation.')
my_raceplot.plot(title = 'Total Tonnage Bombing on each Target per Year',
                 item_label='Target Country',
                 value_label='Total tonnage',
                 time_label = 'Year: ',
                 frame_duration=2000)
Animated Chart: Use Play button at bottom to visualize with animation.
Our exploration of tonnage takes us to the heart of targeted regions. The interactive racing bar chart allows us to witness the ebb and flow of tonnage over each country, uncovering the specific impact on different regions. With play and pause functionality, observe how strategic priorities shaped the distribution of tonnage across the geopolitical map.
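The racing bar chart needs a row for every (year, country) pair, including pairs with no recorded tonnage. The notebook gets there with a pivot followed by a melt; the sketch below shows an alternative that reindexes against the full grid of combinations. The helper name `complete_grid` and the toy values are assumptions for illustration.

```python
import pandas as pd

def complete_grid(df, time_col, item_col, value_col):
    """Sum value_col per (time, item) and add zero rows for missing pairs."""
    totals = df.groupby([time_col, item_col])[value_col].sum()
    full_index = pd.MultiIndex.from_product(
        [totals.index.get_level_values(0).unique(),
         totals.index.get_level_values(1).unique()],
        names=[time_col, item_col],
    )
    return totals.reindex(full_index, fill_value=0).sort_index().reset_index()

# Toy rows standing in for the tonnage frame (values are illustrative only)
toy = pd.DataFrame({
    "msnyear": [1965, 1965, 1966],
    "tgtcountry": ["LAOS", "LAOS", "CAMBODIA"],
    "total_tonnage": [10, 5, 7],
})
grid = complete_grid(toy, "msnyear", "tgtcountry", "total_tonnage")
```

Either route yields the same long-format table; the reindex version avoids the round trip through a wide table.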
Interactive Racing Bar Chart for Tonnage Over Time (By Target Country)
Delve into the specific types of targets by exploring tonnage variations over different target types. Uncover patterns and trends in tonnage concerning various target categories, shedding light on the priorities and strategies in bombing campaigns over the war's duration.
The play and pause feature offers a granular exploration of how tonnage levels evolved over time, shedding light on the strategic preferences in bomb targeting.
# .copy() gives an independent frame, so adding a column does not raise SettingWithCopyWarning
total_tonnage_by_target_type = bombings_df[['weapontypeweight','numweaponsdelivered','tgttype','msnyear']].copy()
total_tonnage_by_target_type['total_tonnage'] = total_tonnage_by_target_type['weapontypeweight'] * total_tonnage_by_target_type['numweaponsdelivered']
total_tonnage_by_target_type = total_tonnage_by_target_type.groupby(['msnyear', 'tgttype']).agg({'total_tonnage':'sum'}).reset_index()
# Create a pivot table to fill in missing values with 0
pivot_table0 = total_tonnage_by_target_type.pivot_table(index='msnyear',
columns='tgttype',
values='total_tonnage',
fill_value=0).reset_index()
# If needed, flatten the DataFrame
pivot_table0 = pivot_table0.melt(id_vars='msnyear', var_name='tgttype', value_name='total_tonnage')
my_raceplot = barplot(pivot_table0,
                      item_column='tgttype',
                      value_column='total_tonnage',
                      time_column='msnyear',
                      top_entries = 10)
print('Animated Chart: Use Play button at bottom to visualize with animation.')
my_raceplot.plot(title = 'Total Tonnage Bombing on Top-10 Target Type per Year',
                 item_label='Target Type',
                 value_label='Total tonnage',
                 time_label = 'Year: ',
                 frame_duration=2000)
Animated Chart: Use Play button at bottom to visualize with animation.
Zooming in further, we scrutinize the specific targets of the bombings. The interactive racing bar chart unveils tonnage variations across different target types, providing insights into the priorities and strategies that shaped military operations. Engage with the play and pause feature for a granular exploration of how tonnage levels evolved over time.
Interactive Racing Bar Chart for Tonnage Over Time (By Target Type)
Navigate the landscape of military involvement, identifying trends in operations conducted by different services. Trace the evolution of participation over time, providing a comprehensive view of military contributions.
# Name the index and the count column explicitly so the result has the same columns across pandas versions
bomb_ops_by_service = bombings_df['milservice'].value_counts(dropna=False).rename_axis('milservice').reset_index(name='count')
bomb_ops_by_service = bomb_ops_by_service.loc[bomb_ops_by_service.milservice.isin(['USAF','USN','VNAF','USMC','RLAF','KAF','RAAF'])]
bomb_ops_by_service
| | milservice | count |
|---|---|---|
| 0 | USAF | 2813692 |
| 1 | USN | 694186 |
| 2 | VNAF | 634717 |
| 3 | USMC | 453996 |
| 4 | RLAF | 32779 |
| 5 | KAF | 24470 |
| 6 | RAAF | 12714 |
# Create a bar graph using Plotly Express
fig = px.bar(bomb_ops_by_service,
x='milservice',
y='count',
title='Number of Attacks by Military Forces',
labels={'Value': 'Count'})
fig.update_xaxes(tickmode='linear')
fig.update_layout(xaxis_title='Military Force', yaxis_title='Number of attacks')
# Add annotations with arrows
for i, row in bomb_ops_by_service.iterrows():
    fig.add_annotation(
        x=row['milservice'],
        y=row['count'],
        text=f"{row['count']}",
        showarrow=True,
        arrowhead=2,
        arrowsize=1,
        arrowwidth=2,
        arrowcolor="#636363",
        ax=0,
        ay=-40,
    )
# Show the plot
fig.show()
The spotlight now turns to the actors behind the operations. The bar chart offers a straightforward view of the military services involved, allowing us to identify the major contributors. As we move forward, we'll explore how the involvement of these military services evolved over the course of the war.
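Raw counts make it hard to judge how dominant each service was. The sketch below converts the counts from the table above into percentage shares. The shares are relative to the seven listed services only, since other services and missing values were filtered out earlier, and the helper name `percentage_shares` is hypothetical.

```python
# Counts copied from the bomb_ops_by_service table
service_counts = {
    "USAF": 2813692, "USN": 694186, "VNAF": 634717, "USMC": 453996,
    "RLAF": 32779, "KAF": 24470, "RAAF": 12714,
}

def percentage_shares(counts):
    """Return each key's share of the total, in percent."""
    total = sum(counts.values())
    return {name: 100 * value / total for name, value in counts.items()}

shares = percentage_shares(service_counts)
largest = max(shares, key=shares.get)  # service with the biggest share
```

The shares could be shown as bar labels alongside the counts if a relative comparison is wanted.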
Bar Chart for Number of Attacks by Military Forces
Embark on a dynamic exploration of military service involvement with a racing bar chart. This animated visualization unveils the changing landscape of military contributions, allowing for a temporal analysis of how different services actively participated in the strikes throughout the Vietnam War.
# Count strikes per (year, service); naming the count column in reset_index works across pandas versions
bomb_ops_by_service_by_year = bombings_df[['msnyear', 'milservice']].value_counts(dropna=False).reset_index(name='mil_count')
# Create a pivot table to fill in missing values with 0
pivot_table1 = bomb_ops_by_service_by_year.pivot_table(index='msnyear', columns='milservice', values='mil_count', fill_value=0).reset_index()
# If needed, flatten the DataFrame
pivot_table1 = pivot_table1.melt(id_vars='msnyear', var_name='milservice', value_name='milcount')
# Sort the pivot table by 'msnyear' and 'milcount'
pivot_table1 = pivot_table1.sort_values(by=['msnyear', 'milcount'], ascending=[True, False])
my_raceplot = barplot(pivot_table1,
                      item_column='milservice',
                      value_column='milcount',
                      time_column='msnyear',
                      top_entries = 10)
print('Animated Chart: Use Play button at bottom to visualize with animation.')
my_raceplot.plot(title = 'Total Military Strikes by each Military Service per Year',
                 item_label='Military Service',
                 value_label='Strikes',
                 time_label = 'Year: ',
                 frame_duration=2000)
Animated Chart: Use Play button at bottom to visualize with animation.
The story of military services comes alive through the racing bar chart. Watch as different services come to the forefront, reflecting the evolving strategies and priorities during different phases of the war. Engage with the play-pause button to dissect the temporal evolution of military service involvement.
Racing Bar Chart for Evolution of Military Service Involvement Over Time
Delve into the intricate web of military strategies using an interactive sunburst plot. This visualization enables users to dissect how the military services of each country targeted specific nations during the Vietnam War. The interactive elements provide a granular exploration, allowing for a comprehensive understanding of cross-border military operations.
sunburst_df = bombings_df.groupby([ 'countryflyingmission', 'milservice','tgtcountry']).size().reset_index(name='mission_count')
fig = px.sunburst(
    sunburst_df,
    path=[ 'countryflyingmission', 'milservice','tgtcountry'],
    values='mission_count',
    title="Distribution of Missions by Military Service on each Target",
    color='mission_count',  # Color based on mission count
    color_continuous_scale='Sunset',  # Adjust the color scale
)
fig.update_layout(
margin=dict(t=60, l=15, r=15, b=15),
width=1000, # Set the width of the figure
height=800 # Set the height of the figure
)
print('Interactive Chart: Click on each element to expand and contract the segments of sunburst chart')
fig.show()
Interactive Chart: Click on each element to expand and contract the segments of sunburst chart
The dynamics of cross-border military operations are laid bare through the interactive sunburst plot. The inner ring shows the country flying the mission, the middle ring its military services, and the outer ring the target countries; as users click and explore, the details of how these services targeted different countries unfold. This immersive experience provides insights into the interconnected strategies of various nations.
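One caveat about the aggregation behind the sunburst: `groupby` drops rows whose key is null by default, so missions with a missing country, service, or target country (4.64% of records have no `tgtcountry`) never reach the chart. The sketch below shows how that coverage could be measured. The helper `path_coverage` is hypothetical and the toy values are illustrative only.

```python
import pandas as pd

def path_coverage(df, path):
    """Return (rows_kept, rows_dropped) when rows with a null in any path column are excluded."""
    kept = len(df.dropna(subset=path))
    return kept, len(df) - kept

# Toy rows standing in for bombings_df
toy = pd.DataFrame({
    "countryflyingmission": ["UNITED STATES OF AMERICA", "UNITED STATES OF AMERICA", None, "AUSTRALIA"],
    "milservice": ["USAF", "USN", "USAF", "RAAF"],
    "tgtcountry": ["LAOS", None, "LAOS", "SOUTH VIETNAM"],
})
path = ["countryflyingmission", "milservice", "tgtcountry"]

kept, dropped = path_coverage(toy, path)
# Same aggregation as the notebook: one row per leaf of the sunburst
leaves = toy.groupby(path).size().reset_index(name="mission_count")
```

Reporting the dropped count next to the chart would make clear how much of the dataset the sunburst actually represents.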
Interactive Sunburst Plot for Military Services Attacking Different Target Countries
Visualize operations and their primary objectives. Examine correlations between specific operations and pivotal historical events or policy changes, tracing the threads of strategic decision-making.
# Calculate the number of missions for each operation group
missions_count = bombings_df.groupby('operation_grp')['thor_data_viet_id'].count().reset_index()
missions_count = missions_count[missions_count['operation_grp'] != 'UNNAMED']
# Sort the DataFrame by mission count in descending order
missions_count = missions_count.sort_values(by='thor_data_viet_id', ascending=False)
# Select only the top 10 rows
top_10_missions = missions_count.head(10)
# Create a bar chart using Plotly Express
fig = px.bar(top_10_missions,
x='operation_grp',
y='thor_data_viet_id',
title='Number of Missions in Top-10 Operations',
labels={'operation_grp': 'Operation Name', 'thor_data_viet_id': 'Number of Missions'}
)
# Sort the x-axis categories by the number of missions
fig.update_layout(xaxis_categoryorder='total descending')
# Show the plot
fig.show()
Operations form the backbone of military strategies. The bar chart introduces us to the top-10 operations, offering a glimpse into their frequency and significance. As we delve deeper, we'll explore how these operations evolved over time, shedding light on the strategic decisions that shaped the course of the war.
Bar Chart for Number of Missions in Top-10 Operations
Navigate the landscape of various operations, capturing their evolution throughout the Vietnam War. Utilizing a racing bar chart, observe the dynamic shifts in the frequency of different operations, identifying key periods of strategic change. This visualization provides insights into the flow of military strategies, allowing for a nuanced understanding of how operational priorities shifted over time.
# Count missions per (year, operation); naming the count column in reset_index works across pandas versions
operations_by_year = bombings_df[['msnyear', 'operation_grp']].value_counts(dropna=False).reset_index(name='ops_count')
operations_by_year = operations_by_year[operations_by_year['operation_grp'] != 'UNNAMED']
# Create a pivot table to fill in missing values with 0
pivot_table2 = operations_by_year.pivot_table(index='msnyear',
                                              columns='operation_grp',
                                              values='ops_count',
                                              fill_value=0).reset_index()
# If needed, flatten the DataFrame
pivot_table2 = pivot_table2.melt(id_vars='msnyear', var_name='operation_grp', value_name='ops_count')
# Sort the pivot table
pivot_table2 = pivot_table2.sort_values(by=['msnyear', 'ops_count'], ascending=[True, False])
my_raceplot = barplot(pivot_table2,
                      item_column='operation_grp',
                      value_column='ops_count',
                      time_column='msnyear',
                      top_entries = 10)
print('Animated Chart: Use Play button at bottom to visualize with animation.')
my_raceplot.plot(title = 'Top Operations per Year',
                 item_label='Operation Name',
                 value_label='Number of Missions',
                 time_label = 'Year: ',
                 frame_duration=2000)
Animated Chart: Use Play button at bottom to visualize with animation.
The racing bar chart becomes a time machine, guiding us through the evolving landscape of operations. Watch as different missions take center stage, reflecting the ebb and flow of strategic priorities. Engage with the play-pause button for an interactive journey through the dynamic evolution of operations.
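The racing chart trims each frame to its top entries, but the ranking itself is hard to read off an animation. The sketch below produces the same per-year ranking as a table that could be inspected or exported. The helper `top_n_per_year` is hypothetical, and the operation names in the toy frame are made-up placeholders, not real `operation_grp` values.

```python
import pandas as pd

def top_n_per_year(df, item_col, n=10, year_col="msnyear"):
    """Count rows per (year, item) and keep the n most frequent items of each year."""
    counts = df.groupby([year_col, item_col]).size().reset_index(name="mission_count")
    counts = counts.sort_values([year_col, "mission_count"], ascending=[True, False])
    # head(n) within each year keeps the rows already sorted to the top
    return counts.groupby(year_col).head(n).reset_index(drop=True)

# Toy rows standing in for bombings_df (placeholder operation names)
toy = pd.DataFrame({
    "msnyear": [1965] * 6 + [1966] * 3,
    "operation_grp": ["OP_A", "OP_A", "OP_A", "OP_B", "OP_B", "OP_C",
                      "OP_B", "OP_B", "OP_A"],
})
top2 = top_n_per_year(toy, "operation_grp", n=2)
```

The same helper would apply to `weapontype` or `aircraft_name` by changing `item_col`.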
Racing Bar Chart for Evolution of Operations Over Time
Witness the evolution of weaponry and aircraft deployment. Unearth patterns in weapon choices and aircraft types, painting a vivid picture of technological advancements and strategic adaptations throughout the Vietnam War.
# Calculate the number of missions for each weapon type
missions_count = bombings_df.groupby('weapontype')['thor_data_viet_id'].count().reset_index()
# Sort the DataFrame by mission count in descending order
missions_count = missions_count.sort_values(by='thor_data_viet_id', ascending=False)
# Select only the top 10 rows
top_10_missions = missions_count.head(10)
# Create a bar chart using Plotly Express
fig = px.bar(top_10_missions,
x='weapontype',
y='thor_data_viet_id',
title='Number of Missions per Top-10 Weapons',
labels={'weapontype': 'Weapon Type', 'thor_data_viet_id': 'Number of Missions'}
)
# Sort the x-axis categories by the number of missions
fig.update_layout(xaxis_categoryorder='total descending')
# Show the plot
fig.show()
The arsenal of war takes center stage as we explore the top-10 weapons. The bar chart reveals the predominant choices, reflecting technological advancements and strategic adaptations.
Bar Chart for Number of Missions for Top-10 Weapons
# Calculate the number of missions for each aircraft
missions_count = bombings_df.groupby('aircraft_name')['thor_data_viet_id'].count().reset_index()
# Sort the DataFrame by mission count in descending order
missions_count = missions_count.sort_values(by='thor_data_viet_id', ascending=False)
# Select only the top 10 rows
top_10_missions = missions_count.head(10)
# Create a bar chart using Plotly Express
fig = px.bar(top_10_missions,
x='aircraft_name',
y='thor_data_viet_id',
title='Number of Missions per Top-10 Aircraft',
labels={'aircraft_name': 'Aircraft', 'thor_data_viet_id': 'Number of Missions'}
)
# Sort the x-axis categories by the number of missions
fig.update_layout(xaxis_categoryorder='total descending')
# Show the plot
fig.show()
The arsenal of war takes center stage as we explore the top-10 aircraft. The bar chart reveals the predominant choices, reflecting technological advancements and strategic adaptations.
Bar Chart for Number of Missions for Top-10 Aircraft
Witness the racing bar chart depicting the evolution of weapon choices over time. Track the rise and fall of different weapon types, providing a chronological perspective on the shifting preferences in the Vietnam War.
# Count missions per (year, weapon type); naming the count column in reset_index works across pandas versions
weapons_by_year = bombings_df[['msnyear', 'weapontype']].value_counts(dropna=False).reset_index(name='mission_count')
# Create a pivot table to fill in missing values with 0
pivot_table2 = weapons_by_year.pivot_table(index='msnyear',
                                           columns='weapontype',
                                           values='mission_count',
                                           fill_value=0).reset_index()
# If needed, flatten the DataFrame
pivot_table2 = pivot_table2.melt(id_vars='msnyear', var_name='weapontype', value_name='mission_count')
# Sort the pivot table by 'msnyear' and 'mission_count'
pivot_table2 = pivot_table2.sort_values(by=['msnyear', 'mission_count'], ascending=[True, False])
my_raceplot = barplot(pivot_table2,
                      item_column='weapontype',
                      value_column='mission_count',
                      time_column='msnyear',
                      top_entries = 10)
print('Animated Chart: Use Play button at bottom to visualize with animation.')
my_raceplot.plot(title = 'Number of Missions by Top-10 Weapons per Year',
                 item_label='Weapon Name',
                 value_label='Number of Missions',
                 time_label = 'Year: ',
                 frame_duration=2000)
Animated Chart: Use Play button at bottom to visualize with animation.
The racing bar chart transforms into a visual timeline, illustrating the dynamic evolution of weapon choices. Each race represents a weapon's journey through time, offering insights into how strategic preferences shifted over the course of the war. Engage with the play-pause button to navigate through this chronological narrative.
Racing Bar Chart for Evolution of Weapons Over Time
Dive into the racing bar chart showcasing the evolution of aircraft choices throughout the war. Uncover the changing landscape of aircraft deployment, illustrating how technological advancements and strategic considerations influenced the selection of aircraft types over time.
# Count missions per (year, aircraft); naming the count column in reset_index works across pandas versions
aircraft_by_year = bombings_df[['msnyear', 'aircraft_name']].value_counts(dropna=False).reset_index(name='mission_count')
# Create a pivot table to fill in missing values with 0
pivot_table3 = aircraft_by_year.pivot_table(index='msnyear',
                                            columns='aircraft_name',
                                            values='mission_count',
                                            fill_value=0).reset_index()
# If needed, flatten the DataFrame
pivot_table3 = pivot_table3.melt(id_vars='msnyear', var_name='aircraft_name', value_name='mission_count')
# Sort the pivot table by 'msnyear' and 'mission_count'
pivot_table3 = pivot_table3.sort_values(by=['msnyear', 'mission_count'], ascending=[True, False])
my_raceplot = barplot(pivot_table3,
                      item_column='aircraft_name',
                      value_column='mission_count',
                      time_column='msnyear',
                      top_entries = 10)
print('Animated Chart: Use Play button at bottom to visualize with animation.')
my_raceplot.plot(title = 'Number of Missions by Top-10 Aircraft per Year',
                 item_label='Aircraft Name',
                 value_label='Number of Missions',
                 time_label = 'Year: ',
                 frame_duration=2000)
Animated Chart: Use Play button at bottom to visualize with animation.
The skies above Vietnam tell a story of technological prowess and strategic evolution. The racing bar chart for aircraft unfolds the narrative of changing preferences and technological adaptations. Engage with the play-pause button to witness the chronological journey of aircraft choices, providing a unique perspective on the war's aerial dimension.
Racing Bar Chart for Evolution of Aircraft Over Time
Our approach to visualizing the intricate history of the Vietnam War bombings is rooted in a strategic blend of clarity, historical context, and interactive engagement, ensuring a comprehensive exploration of this complex narrative.
Geographical Distribution:
The journey begins with "Geographical Distribution," employing an animated scatter map to dynamically capture the evolving patterns of bombings across Vietnam and its neighboring regions. An additional static map, shown alongside the animated view, provides a consolidated visualization of total bombings in each target country. This dual-mapping approach enhances the audience's understanding of the spatial dynamics while facilitating a comparative analysis.
Temporal Patterns:
"Temporal Patterns" unfolds with a bar graph illustrating yearly bombing strikes, offering a high-level temporal overview. The subsequent line charts dive deeper into specific temporal patterns, such as notable spikes or lulls in bombing intensity by country, and the dynamic involvement of allied nations, with the USA deliberately excluded from that analysis. An interactive line chart correlates temporal patterns with major historical events, providing users with a personalized exploration of the data's temporal narrative. These design choices empower users to unravel the chronological nuances of the Vietnam War.
Tonnage:
In the "Tonnage" section, a bubble chart vividly emphasizes the magnitude of bombs dropped, offering an immediate grasp of the scale of the conflict. Follow-up questions are addressed through a bar graph showcasing yearly tonnage, allowing users to discern periods of escalation or de-escalation. Interactive racing bar charts are introduced to explore tonnage by target country and tonnage by target type, providing granular insights into tonnage variations over time and across different contexts. These visualizations not only capture the sheer volume of bombings but also reveal nuanced patterns in targeting strategies.
Military Services:
The exploration of "Military Services" begins with bar charts revealing the number of attacks by different military forces. This sets the stage for a dynamic exploration of military service involvement through a racing bar chart. The subsequent introduction of an interactive sunburst plot dissects how military services of each country targeted different nations. This multi-layered approach offers users a detailed understanding of the evolving dynamics among military services throughout the war, fostering a nuanced comprehension of their contributions.
Operations:
"Operations" unfolds through bar charts revealing the number of missions in the top-10 operations. A racing bar chart follows, portraying the evolution of different operations over time. These visualizations allow users to identify trends or patterns in the types of operations conducted, unraveling the threads of strategic decision-making. The design choices aim to provide a comprehensive view of operational priorities and shifts in strategic focus.
Weapons and Aircraft:
The exploration of "Weapons and Aircraft" is facilitated by bar charts and racing bar charts, depicting the number of missions for the top-10 weapons and aircraft. These visualizations offer insights into the evolution of weaponry and aircraft deployment, painting a vivid picture of technological advancements and strategic adaptations over the course of the war. The design choices ensure that users can witness the changing landscape of weapon and aircraft choices with a chronological perspective.
This diverse set of visualizations is meticulously crafted to cater to both seasoned analysts and those approaching the Vietnam War history for the first time. The interplay of static and interactive elements fosters a nuanced and engaging exploration, allowing users to uncover the multifaceted dimensions of this pivotal historical event.
Our project's development was meticulously structured, adhering to a well-defined timeline and efficient task distribution.
This project was developed by Satyam Shrivastava and Pritish Arora, both of whom played a pivotal role in its successful execution. We are both Data Science graduate students at Khoury College of Computer Sciences, Northeastern University.
Week 11: Data Acquisition
During this phase, we collaboratively searched for a suitable dataset, ensuring it aligned with our project goals. After thorough exploration, we agreed upon a dataset that provided comprehensive insights into the Vietnam War bombings.
Week 12: Data Analysis
With the dataset at hand, we independently conducted Exploratory Data Analysis (EDA) to gain a nuanced understanding of the information available. This step allowed us to identify potential visualization questions and lay the groundwork for our project's narrative.
Week 13: Peer Review
After completing the initial analysis, we shared our work in progress for the peer review process.
Week 14: Feedback Implementation
Building upon the received feedback, we implemented necessary improvements to enhance the clarity, coherence, and effectiveness of our visualizations. This phase involved refining visual encodings, addressing design considerations, and ensuring the overall flow of the project.
Week 15: Final Project Completion and Validation
In the concluding week, we collaborated to combine our individual analyses into a cohesive visualization story. This collaborative effort involved synthesizing insights, refining the narrative, and ensuring a seamless flow between different sections of the project. The final validation phase involved confirming the accuracy of visualizations, checking for any inconsistencies, and preparing the project for publication.
Data Acquisition and Initial Analysis:
We jointly searched for and decided on the dataset, identified project goals, and conducted independent Exploratory Data Analysis (EDA) to formulate initial visualization questions.
Visualization Question Formulation:
The six visualization questions were divided equally between Satyam and Pritish, with each team member taking ownership of three questions. This approach ensured a balanced distribution of tasks and allowed for focused exploration.
Peer Review and Feedback Implementation:
Both team members actively participated in the peer review process, providing valuable insights and suggestions for improvement. The feedback received was then collaboratively implemented to refine the visualizations.
Visualization Story and Publication:
The final stage involved combining our work to create a cohesive visualization story. We collaborated on writing the narrative, ensuring a smooth transition between different sections, and published the project for sharing with our target audience.
Python served as the primary programming language for our analysis, with Jupyter Notebooks providing an interactive and integrated workspace. Pandas was instrumental for data manipulation and preprocessing, facilitating tasks such as cleaning, transformation, and loading. For visualization, we leveraged the capabilities of Matplotlib and Plotly to create a diverse range of static and interactive plots, charts, graphs, and maps. These advanced visualization tools, combined with the skills acquired during our course, enabled us to craft a compelling and informative narrative, elevating the overall quality of our project.
This systematic and collaborative approach, coupled with the effective use of tools and a well-structured timeline, contributed to the successful development of our Vietnam War bombings visualization project.
The feedback received from peers has been instrumental in refining and enhancing the Vietnam bombing project's visualizations, providing valuable insights that significantly contributed to the project's overall impact.
Visual Encodings:
The sunburst plot received praise for effectively showcasing the breakdown of military services by country. The animated choropleth was acknowledged for dynamically illustrating the changing geographical distribution of bombing incidents over time.
Addressing improvement suggestions, a time series element was incorporated into the bar plot representing military services, enabling users to visualize the evolution of different military services' involvement over the course of the war.
The suggestion to delve deeper into the data by analyzing the breakdown of aircraft and weapons used by the top-10 contributors was implemented, offering additional insights into the bombing campaign's tactics.
While data on the types of targets bombed wasn't initially included, the feedback sparked consideration for its addition, recognizing its potential to add another layer of understanding to the visualization.
The recommendation to create a narrative alongside the visualizations was embraced, aligning historical events and key statistics to enhance the storytelling potential and provide context for viewers.
Design Quality:
The clear titles, labels, and legends were acknowledged for enhancing visualization clarity. Validating credibility with reliable sources was suggested and implemented.
The idea of adding interactive features, such as tooltips, was recognized as a potential enhancement for user engagement and information accessibility.
The recommendation to integrate historical events into the visualizations through text annotations, timeline representations, or interactive elements was welcomed and adopted to create a more cohesive and insightful story about the Vietnam War bombings.
Visual Encodings:
The animated map's effectiveness in dynamically unfolding the geographical distribution of bombing incidents over time was highlighted. The utilization of bar graphs and line charts for communicating temporal patterns and tonnage was commended.
The idea of incorporating temporal patterns linked to significant operations through a racing bar chart animation was acknowledged, contributing to a richer visualization that contextualizes historical events.
Considering alternative visual encodings, such as heatmaps or density maps, and implementing interactive elements like filters for time periods or target countries were recognized as potential avenues for further exploration. However, heatmaps and density maps consumed significant memory on the platform because of the data size and made it slow, so they were discarded.
Design Quality:
The overall design's impressive guidance through various aspects of the bombing campaign was noted. The structured organization and the inclusion of appropriate legends, titles, and axis names were acknowledged.
The suggestion for improved consistency and cohesion in color mapping across all charts was addressed, creating a visually unified and aesthetically pleasing display.
Integrating visual elements like annotations or highlights to denote significant events was seen as a valuable addition to enrich the visualization.
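The color-consistency point above can be sketched in a few lines. This is a minimal illustration, not the project's actual implementation: the palette, the helper name `build_color_map`, and the example country labels are all assumptions made for the sketch. The idea is to build one category-to-color dictionary up front and reuse it in every chart.

```python
from itertools import cycle

# Hypothetical palette; any list of hex colors would serve.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd"]


def build_color_map(categories, palette=PALETTE):
    """Assign each distinct category one color.

    Categories are sorted first so the assignment is deterministic,
    and the palette is cycled if there are more categories than colors.
    """
    return dict(zip(sorted(set(categories)), cycle(palette)))


# Illustrative labels only; the real dataset's country values may differ.
country_colors = build_color_map(
    ["LAOS", "NORTH VIETNAM", "SOUTH VIETNAM", "CAMBODIA"]
)
print(country_colors)
```

A dictionary like this can be passed to Plotly Express charts through their `color_discrete_map` argument. It should be built once from the full set of categories and shared, since rebuilding it from a subset could assign different colors.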
In conclusion, the integration of peer feedback has significantly elevated the Vietnam bombing project's visualizations, ensuring a more comprehensive, engaging, and insightful exploration of the historical context and repercussions of the bombing campaign. The collaborative input has not only improved the visual appeal but also deepened the project's value as a resource for understanding the complexities of the Vietnam War.
Embarking on the Vietnam bombing project has been an enlightening journey, providing invaluable insights into the intricacies of data visualization and storytelling. As I reflect on this learning experience, several key takeaways and considerations come to the forefront.
The project offered a hands-on opportunity to apply theoretical concepts learned throughout the course, transforming abstract ideas into tangible, impactful visualizations. Navigating the complexities of the Vietnam War dataset underscored the importance of thoughtful design choices, effective data analysis, and the power of storytelling through visuals.
One of the most significant learnings was the iterative nature of the data visualization process. The incorporation of peer feedback played a pivotal role in refining the visualizations, highlighting the collaborative and dynamic nature of this field. Recognizing the impact of feedback on the project's depth and clarity emphasized the importance of seeking diverse perspectives for a more comprehensive outcome.
The utilization of various visualization techniques, such as animated maps, racing bar charts, and interactive elements, deepened my understanding of the diverse tools available for conveying complex information. This project underscored the significance of not only choosing the right visual encodings but also incorporating interactive features to enhance user engagement and comprehension.
In future projects, a more structured approach to incorporating historical context from the outset could be considered. While the narrative was eventually enriched with historical events, weaving them seamlessly into the initial stages of the project could provide a more cohesive storyline from the start.
Moreover, exploring the integration of additional data sources could enhance the depth and context of future projects. Merging data from other wars or contrasting it with other datasets could provide a comparative perspective, unraveling patterns and insights that might not be apparent in isolation.
Additionally, investigating alternative visual encodings and interactive features at an early stage might offer a broader spectrum of insights. The awareness that different visual representations could unveil nuances not immediately apparent in the chosen visualizations could be integrated into the initial design considerations.
Furthermore, expanding the scope to include a multi-dimensional analysis, such as the social, economic, or political impacts of bombing campaigns, could contribute to a more holistic understanding of historical events. Integrating diverse dimensions could add layers of complexity to the visual narratives, providing a richer and more nuanced exploration.
Overall, this project has been a rewarding exploration into the fusion of data, technology, and storytelling. The iterative process, coupled with the collaborative nature of feedback incorporation, has not only refined my technical skills but also deepened my appreciation for the art and science of data visualization. As I move forward, these insights will undoubtedly shape my approach to future projects, fostering a commitment to continual learning, adaptability, and a holistic understanding of the narratives behind the data.
This project would not have been possible without the generous contribution of various data sources, tools, and inspirations that fueled our exploration into the complexities of the Vietnam War bombings. We extend our sincere gratitude to those who played a pivotal role in shaping this project.
First and foremost, we express our appreciation to Theater History of Operations (THOR): the creators and maintainers of the Vietnam War Database, the primary data source for this project. The meticulous curation and documentation of this dataset laid the foundation for our analysis, providing a comprehensive and reliable resource for understanding the historical context of the Vietnam War bombings.
Our approach to data visualization was greatly influenced by the teachings and insights gained from the data visualization course, and we acknowledge the valuable guidance provided by our instructor: Dr. Lace Padilla and TA: Harshini Chandrika Dasri. The knowledge imparted throughout the course empowered us to navigate the nuances of visual storytelling and effectively communicate complex information.
We would also like to thank our peers who actively participated in the peer review process. Their constructive feedback and thoughtful critiques played a crucial role in refining our visualizations and elevating the overall quality of the project. The collaborative spirit within the learning community significantly contributed to the iterative development of our visual narrative.
The tools and libraries employed in this project deserve acknowledgment for their role in translating our ideas into compelling visualizations. We express our gratitude to Python, Jupyter Notebook, Matplotlib, Plotly, and Pandas for providing a robust and flexible environment for data analysis and visualization.
Lastly, we draw inspiration from the broader community of data enthusiasts, researchers, and storytellers who continually push the boundaries of data visualization. The collective efforts of this community inspire us to explore innovative approaches and strive for excellence in our own work.
References:
In conclusion, we extend our thanks to all individuals and resources that have played a part in the realization of this project. Your contributions have enriched our learning experience and have been instrumental in the development of this visual narrative.