Bombs Over Vietnam: A Visual Exploration of Historical Vietnam War

Final Project

CS 7250: Information Visualization: Theory and Applications
Dr. Lace Padilla

Group Members:
Satyam Shrivastava
Pritish Arora

(Due) Dec 12, 2023

1. Introduction

Visualization is not merely a tool for aesthetic representation; it is a medium through which we can distill intricate historical narratives into compelling stories that resonate with a broad audience.

For our data visualization project, we chose to examine the Vietnam War. Our commitment to this project stems not only from the war's historical significance but also from the power of visualizations to shed light on complex issues and stimulate meaningful conversations.

In the following sections, we describe the dataset we chose and why we chose it, our goals for this project, the questions that guide our exploration, the tools we use, and how we plan to present answers through visualizations.

2. Topic & Dataset

In our exploration of the Vietnam War, we've chosen to focus specifically on the bombings that occurred during this intense period in history. The intensity and scale of the bombings in Vietnam have left an indelible mark on the landscape and the collective memory of the people. By delving into this aspect, we aim to unravel the layers of historical data, providing a comprehensive view of the strategic and operational dynamics that shaped this conflict.

Dataset:

Vietnam War THOR Data

This dataset of Vietnam War bombing records is curated by the Theater History of Operations (THOR) project, a compilation of historic aerial bombings spanning World War I through Vietnam. With about 4.7 million rows, each detailing a bombing run, THOR is a valuable resource that has not only aided in locating unexploded ordnance in Southeast Asia but has also contributed to refining Air Force combat tactics. Despite the inherent challenges in the data, such as duplicated sorties and non-standardized mission/operation naming, THOR presents an unparalleled opportunity to analyze and visualize the patterns and impact of Vietnam War bombings.

Data Dictionary can be found here

Importance of the Topic:

The Vietnam War bombings dataset holds immense historical significance, offering a window into a pivotal period marked by conflict and complex military operations. The importance of this dataset extends beyond its historical value and its understanding is crucial for several reasons:

  • Historical Significance: Bombings played a pivotal role in the course of the Vietnam War, impacting both military strategy and civilian life.
  • Humanitarian Impact: The data reflects the human cost of war, prompting us to reflect on the consequences and lessons learned.
  • Military Tactics and Innovation: The THOR dataset has proven instrumental in finding unexploded ordnance and improving Air Force combat tactics, showcasing its practical importance.

Goals for this Project:

  • Educational Impact:
    • Objective: To provide a nuanced understanding of the Vietnam War using the visualization best practices learned in the course.
    • Approach: Through visualizations, we aim to present key data points and historical events, offering an educational experience that goes beyond textbooks while deepening our own understanding.
  • Prompting Reflection:
    • Objective: To stimulate reflection and discussion about the broader implications of war.
    • Approach: Our visualizations aspire to serve as a catalyst for conversations, prompting contemplation on the human cost of war and the enduring lessons from history.
  • Storytelling through Visualization:
    • Objective: To harness the power of storytelling in driving change.
    • Approach: Through thoughtful design choices and visualization techniques, we intend to craft a narrative that transcends mere data points, conveying the emotions and narratives that define the Vietnam War.

Through our exploration of the Vietnam War bombings, we aim to bridge the gap between historical data and contemporary understanding, showcasing the best visualization practices.

3. Visualization Questions: Breadth and Depth of Analysis

This section outlines the key questions we ask of the Vietnam War bombings dataset, aiming for both a broad overview and a deep understanding of the historical events it captures. This dual approach, encompassing breadth and depth, serves as an exploratory data analysis (EDA) and lays the foundation for framing a compelling visualization story.

Breadth of Analysis - Initial Questions:

Our initial set of questions aims to cast a wide net, capturing the overarching patterns and characteristics of the Vietnam War bombings:

  1. Geographical Distribution: How is the bombing activity distributed geographically across Vietnam?
  2. Temporal Patterns: What is the timeline of bombing activities throughout the Vietnam War?
  3. Tonnage: What is the tonnage of bombs dropped throughout the war?
  4. Military Services: Which military services were actively involved in the bombings?
  5. Operations: Can we identify trends or patterns in the types of operations conducted?
  6. Weapons & Aircraft: What types of weapons and aircraft were predominantly used during the bombings?

Depth of Analysis - Follow-Up Questions:

Building upon the insights gained from the initial breadth of analysis, our follow-up questions aim to delve deeper into specific aspects, unraveling the intricacies of the Vietnam War bombings:

  1. Geographical Distribution: How is the bombing activity distributed geographically across Vietnam?
    • How does the geographical distribution change over the war period?

  2. Temporal Patterns: What is the timeline of bombing activities throughout the Vietnam War?
    • Are there notable spikes or lulls in bombing intensity during specific periods for the countries involved?
    • How have allied countries participated in the bombing activities over the period?
    • How do temporal patterns correlate with major historical events?

  3. Tonnage: What is the tonnage of bombs dropped throughout the war?
    • Explore the Tonnage of Bombs dropped over the years.
    • Explore the Tonnage of Bombs dropped on each Target Country over the years.
    • Explore the Tonnage of Bombs dropped on each Target Type over the years.

  4. Military Services: Which military services were actively involved in the bombings?
    • How does the involvement of different military services evolve over time?
    • Explore how the military services of each country attacked different target countries during the war.

  5. Operations: Can we identify trends or patterns in the types of operations conducted?
    • How do different operations evolve over the war period?

  6. Weapons & Aircraft: What types of weapons and aircraft were predominantly used during the bombings?
    • How does the choice of weapons evolve over the course of the war?
    • How does the choice of aircraft evolve over the course of the war?

This dual approach, encompassing both breadth and depth of analysis, lays the groundwork for our visualization story. By systematically addressing these questions, we aim to not only uncover historical trends but also to shape a narrative that resonates with the nuances and complexities of the Vietnam War bombings. The visualization story that emerges from this process will serve as a powerful tool for education, reflection, and, ultimately, meaningful conversation.

4. Tools for Analysis

In this analysis, we employ Python as the primary programming language, along with its associated libraries for data processing, visualization, and analysis. The Jupyter Notebook environment serves as our interactive workspace, seamlessly integrating code, visualizations, and explanatory text.

For data manipulation and preprocessing, we rely on Pandas, which facilitates tasks such as data loading, cleaning, and transformation. In terms of visualization, we utilize Matplotlib and Plotly to create a diverse range of static and interactive plots, charts, graphs, and interactive maps. These libraries empower us to present our insights effectively, ensuring clarity and impact.

These advanced visualization tools, coupled with the concepts learned throughout this course, will elevate our storytelling capabilities, offering a richer and more immersive experience for our audience.

In [1]:
# Import the required libraries
import pandas as pd
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', 100)
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
pio.renderers.default = "notebook_connected"
from raceplotly.plots import barplot
import warnings
warnings.filterwarnings("ignore")
class color:
   PURPLE = '\033[95m'
   CYAN = '\033[96m'
   DARKCYAN = '\033[36m'
   BLUE = '\033[94m'
   GREEN = '\033[92m'
   YELLOW = '\033[93m'
   RED = '\033[91m'
   BOLD = '\033[1m'
   UNDERLINE = '\033[4m'
   END = '\033[0m'

5. Data Quality & Transformations

Overview:

The dataset comprises three files: Bombing Operations (fact table), Aircraft Glossary (dimension table), and Weapons Glossary (dimension table). The Bombing Operations file contains about 4.67 million rows, each detailing a bombing run with information such as operation details, aircraft used, weapons deployed, and target coordinates.

Data Issues:

  • Duplicated Sorties: Duplicates exist due to data updates from multiple sources. Managing these duplicates is crucial to ensure accurate analysis.
  • Non-Standardized Operation/Mission Naming: Efforts have been made to standardize mission names, but there's uncertainty about the correctness of the standardization due to limited historical and military knowledge.
  • Incomplete Vietnam Bombing Data: Data is incomplete before September 30th, 1965, posing a challenge for comprehensive analysis. Some columns have unclear meanings due to ongoing data structure improvements.
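
The notebook does not show how duplicated sorties might be detected in the Bombing Operations table. A minimal sketch on toy data follows; the choice of key columns (`candidate_key`) is an assumption for illustration, not a documented THOR sortie identifier.

```python
import pandas as pd

# Toy stand-in for Bomb_Ops_df. The rows are illustrative, not real records.
toy_ops = pd.DataFrame({
    "msndate": ["1970-02-02", "1970-02-02", "1970-11-25"],
    "missionid": ["1047", "1047", "1407"],
    "valid_aircraft_root": ["A-1", "A-1", "F-4"],
    "sourceid": [642780, 700001, 642782],  # differs when two sources report one sortie
})

# Hypothetical key: columns assumed to identify the same sortie
candidate_key = ["msndate", "missionid", "valid_aircraft_root"]

# keep=False flags every member of a duplicate group, not only the repeats
dup_mask = toy_ops.duplicated(subset=candidate_key, keep=False)
n_suspect_rows = int(dup_mask.sum())
print(f"{n_suspect_rows} rows share a candidate key with another row")
```

Flagged rows are only suspects: distinct sorties can legitimately share a date, mission ID, and aircraft type, so a manual review would be needed before dropping anything.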

Data Quality & Transformations:

  • Load Datasets: Bombing Operations, Aircraft Glossary, and Weapons Glossary datasets have been loaded.
  • Standardize Column Names: Column names across datasets have been standardized to lowercase for consistency.
  • Check Data Dimensions: The Bombing Operations dataset has 4,670,416 rows and 47 columns, Aircraft Glossary has 104 rows and 8 columns, and Weapons Glossary has 294 rows and 6 columns.
  • Handle Invalid Dates: Replaced the invalid value 19700229 in the msndate column with 19700228, since 1970 was not a leap year.
  • Handle Missing Values: Analyzed and reported missing values in the Bombing Operations dataset.
  • Drop Duplicate Rows: Removed duplicate rows from the Aircraft Glossary dataset based on 'validated_root' and 'aircraft_name', reducing it from 104 to 103 rows.
  • Merge Datasets: Merged Bombing Operations with Aircraft Glossary and Weapons Glossary using appropriate keys.
  • Create Time-Related Columns: Converted 'msndate' to datetime format (stored as 'msndatetime') and created additional time-related columns ('msnyear', 'msnyearmonth', 'msnmonthname').
  • Harmonize Operation Names: Created 'operation_grp' field to harmonize operation names.

Data Quality Summary:

  • Percentage of missing values varies across columns, ranging from 0.00% to 100.00%.
  • Cleaning efforts include handling duplicates, standardizing dates, and merging datasets for a comprehensive analysis.
  • Further exploration and domain expertise are needed to address non-standardized mission names and incomplete data issues.
In [2]:
# Load the datasets from CSV files - make sure data folder containing CSV files is in the same folder as Notebook

# Bombing Operations
Bomb_Ops_df = pd.read_csv('./thor-vietnam-war-data/thor_data_vietnam.csv', 
                          encoding='ISO-8859-1', low_memory=False)

# Aircraft Glossary
Air_Gloss_df = pd.read_csv('./thor-vietnam-war-data/THOR_VIET_AIRCRAFT_GLOSS.csv', 
                           encoding='ISO-8859-1')

# Weapons Glossary
Wpn_Gloss_df = pd.read_csv('./thor-vietnam-war-data/THOR_VIET_WEAPON_GLOSS.csv', 
                           encoding='ISO-8859-1')
In [3]:
# Standardize column names to lowercase for Bomb_Ops_df
Bomb_Ops_df.columns = Bomb_Ops_df.columns.str.lower()
In [4]:
# Standardize column names to lowercase for Air_Gloss_df
Air_Gloss_df.columns = Air_Gloss_df.columns.str.lower()
In [5]:
# Standardize column names to lowercase for Wpn_Gloss_df
Wpn_Gloss_df.columns = Wpn_Gloss_df.columns.str.lower()
In [6]:
# Check the data dimensions - number of rows and columns
print(f'Bomb_Ops_df: {Bomb_Ops_df.shape[0]} rows, and {Bomb_Ops_df.shape[1]} columns')
print(f'Air_Gloss_df: {Air_Gloss_df.shape[0]} rows, and {Air_Gloss_df.shape[1]} columns')
print(f'Wpn_Gloss_df: {Wpn_Gloss_df.shape[0]} rows, and {Wpn_Gloss_df.shape[1]} columns')
Bomb_Ops_df: 4670416 rows, and 47 columns
Air_Gloss_df: 104 rows, and 8 columns
Wpn_Gloss_df: 294 rows, and 6 columns
In [7]:
# Rename specific columns
Bomb_Ops_df.rename(columns={'tgtlatdd_ddd_wgs84': 'tgt_latitude', 
                            'tgtlonddd_ddd_wgs84': 'tgt_longitude'}, 
                   inplace=True)
In [8]:
# Replace values in msndate column - 19700229 is not a valid date - 1970 was not a leap year
Bomb_Ops_df.loc[Bomb_Ops_df['msndate'] == "19700229", 'msndate'] = "19700228"
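
One way such an invalid value could be located is to parse the column with `errors="coerce"` and inspect the entries that come back as `NaT`. The sketch below uses a toy series and is not necessarily how the invalid date was originally found.

```python
import pandas as pd

# Toy stand-in for the msndate column; only the middle value is invalid
toy_dates = pd.Series(["1970-02-28", "19700229", "1971-06-05"])

# errors="coerce" turns unparseable entries into NaT instead of raising
parsed = pd.to_datetime(toy_dates, errors="coerce")

# Original strings that failed to parse
invalid_values = toy_dates[parsed.isna()].tolist()
print(invalid_values)
```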
In [9]:
Bomb_Ops_df.head(10)
Out[9]:
thor_data_viet_id countryflyingmission milservice msndate sourceid sourcerecord valid_aircraft_root takeofflocation tgt_latitude tgt_longitude tgttype numweaponsdelivered timeontarget weapontype weapontypeclass weapontypeweight aircraft_original aircraft_root airforcegroup airforcesqdn callsign flthours mfunc mfunc_desc missionid numofacft operationsupported periodofday unit tgtcloudcover tgtcontrol tgtcountry tgtid tgtorigcoords tgtorigcoordsformat tgtweather additionalinfo geozone id mfunc_desc_class numweaponsjettisoned numweaponsreturned releasealtitude releasefltspeed resultsbda timeofftarget weaponsloadedweight
0 351 UNITED STATES OF AMERICA USAF 1971-06-05 647464 SEADAB EC-47 TAN SON NHUT NaN NaN NaN 0 1005.0 NaN NaN 0 EC47 EC47 NaN NaN STEEL 5 70 34.0 RADIO DIRECT FINDER 2624 1 NaN D 360TEW NaN NaN CAMBODIA NaN NaN NaN NaN UNIT: 360TEW - CALLSIGN: STEEL 5 NaN 27135863 NONKINETIC -1 -1 NaN NaN NaN 1005.0 0
1 2 UNITED STATES OF AMERICA USAF 1972-12-26 642778 SEADAB EC-47 NAKHON PHANOM NaN NaN NaN 0 530.0 NaN NaN 0 EC47 EC47 NaN NaN BARON 6 0 74.0 EXTRACTION (GPES) 2909 1 NaN D 361TEW NaN NaN SOUTH VIETNAM NaN NaN NaN NaN UNIT: 361TEW - CALLSIGN: BARON 6 NaN 27131177 NONKINETIC -1 -1 NaN NaN NaN 530.0 0
2 3 UNITED STATES OF AMERICA USAF 1973-07-28 642779 SEADAB RF-4 UDORN AB NaN NaN NaN 0 730.0 NaN NaN 0 RF4 RF4 NaN NaN ATLANTA 30 18.0 VISUAL RECCE 3059 1 NaN D 432TRW NaN NaN LAOS NaN NaN NaN NaN UNIT: 432TRW - CALLSIGN: ATLANTA NaN 27131178 NONKINETIC -1 -1 NaN NaN NaN 730.0 0
3 4 UNITED STATES OF AMERICA USAF 1970-02-02 642780 SEADAB A-1 NAKHON PHANOM 16.902500 106.014166 TRUCKS 2 1415.0 BLU27 FIRE BOMB (750) NaN 750 A1 A1 NaN NaN FF32 68 1.0 STRIKE 1047 2 NaN N 56SOW NaN NaN LAOS NaN 165409N1060051E DDMMSSN DDDMMSSE NaN UNIT: 56SOW - CALLSIGN: FF32 XE 27131179 KINETIC -1 -1 NaN NaN SECONDARY FIRE 1415.0 17400
4 5 VIETNAM (SOUTH) VNAF 1970-10-08 642781 SEADAB A-37 DANANG 14.945555 108.257222 BASE CAMP AREA 0 1240.0 NaN NaN 0 A37 A37 NaN NaN TIGER 41 28 5.0 CLOSE AIR SUPPORT B542 2 NaN D 516FS NaN NaN SOUTH VIETNAM NaN 145644N1081526E DDMMSSN DDDMMSSE NaN UNIT: 516FS - CALLSIGN: TIGER 41 ZB 27131180 KINETIC -1 -1 NaN NaN RNO WEATHER 1240.0 0
5 6 UNITED STATES OF AMERICA USAF 1970-11-25 642782 SEADAB F-4 UBON AB 19.602222 103.597222 AAA\37MM CR MORE 6 650.0 MK 82 GP BOMB (500) LD NaN 500 F4 F4 NaN NaN JASPER 57 1.0 STRIKE 1407 2 NaN D 8TFW NaN NaN LAOS NaN 193608N1033550E DDMMSSN DDDMMSSE NaN UNIT: 8TFW - CALLSIGN: JASPER UG 27131181 KINETIC -1 -1 NaN NaN DAMAGED 650.0 31860
6 7 UNITED STATES OF AMERICA USN 1972-03-08 642783 SEADAB A-4 TONKIN GULF 14.573611 106.689722 TRUCKS 0 1005.0 NaN NaN 0 A4 A4 NaN NaN CD H 16 1.0 STRIKE 9064 2 NaN D 775CTG NaN NaN LAOS NaN 143425N1064123E DDMMSSN DDDMMSSE NaN UNIT: 775CTG - CALLSIGN: CD H XB 27131182 KINETIC -1 -1 NaN NaN RNO NONVISUAL 1005.0 0
7 8 UNITED STATES OF AMERICA USAF 1971-12-27 642784 SEADAB F-4 UDORN AB NaN NaN NaN 0 0.0 NaN NaN 0 F4 F4 NaN NaN FALCON80 0 NaN NaN 7661 2 NaN NaN 432TRW NaN NaN LAOS NaN NaN NaN NaN UNIT: 432TRW - CALLSIGN: FALCON80 NaN 27131183 NONKINETIC -1 -1 NaN NaN NaN 0.0 0
8 9 UNITED STATES OF AMERICA USN 1972-05-24 642785 SEADAB A-7 TONKIN GULF NaN NaN NaN 0 0.0 NaN NaN 0 A7 A7 NaN NaN CD CS 0 NaN NaN 9205 4 NaN NaN 776CTG NaN NaN NORTH VIETNAM NaN NaN NaN NaN UNIT: 776CTG - CALLSIGN: CD CS NaN 27131184 NONKINETIC -1 -1 NaN NaN NaN 0.0 0
9 10 UNITED STATES OF AMERICA USAF 1972-09-12 642786 SEADAB EC-47 TAN SON NHUT NaN NaN NaN 0 710.0 NaN NaN 0 EC47 EC47 NaN NaN LEGMAN59 70 34.0 RADIO DIRECT FINDER 2618 1 NaN D 360TEW NaN NaN SOUTH VIETNAM NaN NaN NaN NaN UNIT: 360TEW - CALLSIGN: LEGMAN59 NaN 27131185 NONKINETIC -1 -1 NaN NaN NaN 710.0 0
In [10]:
Bomb_Ops_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4670416 entries, 0 to 4670415
Data columns (total 47 columns):
 #   Column                Dtype  
---  ------                -----  
 0   thor_data_viet_id     int64  
 1   countryflyingmission  object 
 2   milservice            object 
 3   msndate               object 
 4   sourceid              int64  
 5   sourcerecord          object 
 6   valid_aircraft_root   object 
 7   takeofflocation       object 
 8   tgt_latitude          float64
 9   tgt_longitude         float64
 10  tgttype               object 
 11  numweaponsdelivered   int64  
 12  timeontarget          float64
 13  weapontype            object 
 14  weapontypeclass       float64
 15  weapontypeweight      int64  
 16  aircraft_original     object 
 17  aircraft_root         object 
 18  airforcegroup         object 
 19  airforcesqdn          object 
 20  callsign              object 
 21  flthours              int64  
 22  mfunc                 object 
 23  mfunc_desc            object 
 24  missionid             object 
 25  numofacft             int64  
 26  operationsupported    object 
 27  periodofday           object 
 28  unit                  object 
 29  tgtcloudcover         object 
 30  tgtcontrol            object 
 31  tgtcountry            object 
 32  tgtid                 object 
 33  tgtorigcoords         object 
 34  tgtorigcoordsformat   object 
 35  tgtweather            object 
 36  additionalinfo        object 
 37  geozone               object 
 38  id                    int64  
 39  mfunc_desc_class      object 
 40  numweaponsjettisoned  int64  
 41  numweaponsreturned    int64  
 42  releasealtitude       float64
 43  releasefltspeed       float64
 44  resultsbda            object 
 45  timeofftarget         float64
 46  weaponsloadedweight   int64  
dtypes: float64(7), int64(10), object(30)
memory usage: 1.6+ GB
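
The `info()` output reports more than 1.6 GB for this frame, and many of its 30 object columns appear to hold a small set of repeated labels. Converting such columns to the `category` dtype is one common way to reduce memory; the sketch below measures the effect on a toy column and is a suggestion, not a step the notebook performs.

```python
import pandas as pd

# Toy low-cardinality column built from service codes seen in the sample rows
toy = pd.DataFrame({"milservice": ["USAF", "USN", "VNAF"] * 3000})

# deep=True counts the string payload, not just the array of references
object_bytes = toy["milservice"].memory_usage(deep=True)
category_bytes = toy["milservice"].astype("category").memory_usage(deep=True)

print(f"as loaded: {object_bytes} bytes, category: {category_bytes} bytes")
```

The saving depends on cardinality, so high-cardinality columns such as free-text fields would benefit little.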
In [11]:
Bomb_Ops_df.isnull().sum()
Out[11]:
thor_data_viet_id             0
countryflyingmission       3615
milservice                 3249
msndate                       0
sourceid                      0
sourcerecord                  0
valid_aircraft_root           0
takeofflocation            4971
tgt_latitude            1130131
tgt_longitude           1130131
tgttype                 1830425
numweaponsdelivered           0
timeontarget              26429
weapontype              2403497
weapontypeclass         4670416
weapontypeweight              0
aircraft_original           482
aircraft_root               482
airforcegroup           4667508
airforcesqdn            4667634
callsign                3300321
flthours                      0
mfunc                    101111
mfunc_desc               104722
missionid                 15670
numofacft                     0
operationsupported      1920049
periodofday              199764
unit                        493
tgtcloudcover           2440248
tgtcontrol              2150583
tgtcountry               216774
tgtid                   4670381
tgtorigcoords           1068209
tgtorigcoordsformat     1092870
tgtweather              2592711
additionalinfo                0
geozone                 1168052
id                            0
mfunc_desc_class              0
numweaponsjettisoned          0
numweaponsreturned            0
releasealtitude         4667038
releasefltspeed         4668727
resultsbda              4385020
timeofftarget             26429
weaponsloadedweight           0
dtype: int64
In [12]:
# Calculate the percentage of null values in each column
(Bomb_Ops_df.isnull().mean() * 100).round(2)
Out[12]:
thor_data_viet_id         0.00
countryflyingmission      0.08
milservice                0.07
msndate                   0.00
sourceid                  0.00
sourcerecord              0.00
valid_aircraft_root       0.00
takeofflocation           0.11
tgt_latitude             24.20
tgt_longitude            24.20
tgttype                  39.19
numweaponsdelivered       0.00
timeontarget              0.57
weapontype               51.46
weapontypeclass         100.00
weapontypeweight          0.00
aircraft_original         0.01
aircraft_root             0.01
airforcegroup            99.94
airforcesqdn             99.94
callsign                 70.66
flthours                  0.00
mfunc                     2.16
mfunc_desc                2.24
missionid                 0.34
numofacft                 0.00
operationsupported       41.11
periodofday               4.28
unit                      0.01
tgtcloudcover            52.25
tgtcontrol               46.05
tgtcountry                4.64
tgtid                   100.00
tgtorigcoords            22.87
tgtorigcoordsformat      23.40
tgtweather               55.51
additionalinfo            0.00
geozone                  25.01
id                        0.00
mfunc_desc_class          0.00
numweaponsjettisoned      0.00
numweaponsreturned        0.00
releasealtitude          99.93
releasefltspeed          99.96
resultsbda               93.89
timeofftarget             0.57
weaponsloadedweight       0.00
dtype: float64
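
The two outputs above can be combined into one table, which makes it easier to spot columns that are almost entirely empty (for example `weapontypeclass` and `tgtid` at 100%). This is a minimal sketch on toy data; the helper name `missing_summary` and the 99% cutoff are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def missing_summary(df):
    """Return per-column null counts and percentages, most-missing first."""
    summary = pd.DataFrame({
        "n_missing": df.isnull().sum(),
        "pct_missing": (df.isnull().mean() * 100).round(2),
    })
    return summary.sort_values("pct_missing", ascending=False)

# Toy stand-in for Bomb_Ops_df
toy = pd.DataFrame({
    "tgttype": ["TRUCKS", None, None, "BASE CAMP AREA"],
    "weapontypeclass": [np.nan, np.nan, np.nan, np.nan],
    "msndate": ["1970-02-02", "1970-10-08", "1970-11-25", "1972-03-08"],
})

summary = missing_summary(toy)

# Hypothetical cutoff: columns at or above 99% missing are candidates to drop
mostly_empty = summary.index[summary["pct_missing"] >= 99].tolist()
print(summary)
print(mostly_empty)
```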
In [13]:
# Drop duplicate glossary rows that share the same validated_root and aircraft_name
Air_Gloss_df = Air_Gloss_df.drop_duplicates(subset=['validated_root', 'aircraft_name'])
In [14]:
Air_Gloss_df.head(10)
Out[14]:
  | gloss_id | validated_root | aircraft_name | website_link | aircraft_type | aircraft_shortname | aircraft_application | ac_mission_count
0 | 1 | A-1 | Douglas A-1 Skyraider | http://www.navalaviationmuseum.org/attractions... | Fighter Jet | Skyraider | FIGHTER | 373265
1 | 2 | A-26 | Douglas A-26 Invader | http://www.militaryfactory.com/aircraft/detail... | Light Bomber | Invader | BOMBER | 36672
2 | 4 | A-37 | Cessna A-37 Dragonfly | http://www.militaryfactory.com/aircraft/detail... | Light ground-attack aircraft | Dragonfly | ATTACK | 282699
3 | 5 | A-4 | McDonnell Douglas A-4 Skyhawk | http://www.fighter-planes.com/info/a4-skyhawk.htm | Fighter Jet | Skyhawk | FIGHTER | 390290
4 | 6 | A-5 | North American A-5 Vigilante | http://www.militaryfactory.com/aircraft/detail... | Bomber Jet | Vigilante | BOMBER | 10
5 | 7 | A-6 | Grumman A-6 Intruder | http://www.militaryfactory.com/aircraft/detail... | Attack Aircraft | Intruder | ATTACK | 148372
6 | 8 | A-7 | LTV A-7 Corsair II | http://www.militaryfactory.com/aircraft/detail... | Attack Aircraft | Corsair II | ATTACK | 171983
7 | 9 | AC-119 | Fairchild AC-119 Shadow or Stinger | https://en.wikipedia.org/wiki/Fairchild_AC-119 | Military Transport aircraft | Shadow or Stinger | TRANSPORT | 81757
8 | 10 | AC-123 | Fairchild C-123 Provider | http://www.warbirdalley.com/c123.htm | Military Transport aircraft | Provider | TRANSPORT | 3435
9 | 11 | AC-130 | Lockheed AC-130 Spectre | http://fas.org/man/dod-101/sys/ac/ac-130.htm | Fixed wing ground attack gunship | Spectre | ATTACK | 76620
In [15]:
Air_Gloss_df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 103 entries, 0 to 103
Data columns (total 8 columns):
 #   Column                Non-Null Count  Dtype 
---  ------                --------------  ----- 
 0   gloss_id              103 non-null    int64 
 1   validated_root        103 non-null    object
 2   aircraft_name         103 non-null    object
 3   website_link          103 non-null    object
 4   aircraft_type         103 non-null    object
 5   aircraft_shortname    92 non-null     object
 6   aircraft_application  102 non-null    object
 7   ac_mission_count      103 non-null    int64 
dtypes: int64(2), object(6)
memory usage: 7.2+ KB
In [16]:
Air_Gloss_df.isnull().sum()
Out[16]:
gloss_id                 0
validated_root           0
aircraft_name            0
website_link             0
aircraft_type            0
aircraft_shortname      11
aircraft_application     1
ac_mission_count         0
dtype: int64
In [17]:
# Calculate the percentage of null values in each column
(Air_Gloss_df.isnull().mean() * 100).round(2)
Out[17]:
gloss_id                 0.00
validated_root           0.00
aircraft_name            0.00
website_link             0.00
aircraft_type            0.00
aircraft_shortname      10.68
aircraft_application     0.97
ac_mission_count         0.00
dtype: float64
In [18]:
Wpn_Gloss_df.head(10)
Out[18]:
  | weapon_id | weapontype | weapontype_common_name | weapon_class | weapontype_desc | weapon_count
0 | 1 | 100 GP | General Purpose Bomb | BOMB | 100 lb general purpose | 1
1 | 2 | 1000 G | Megaboller flash powder bomb | BOMB | 1000 g BKS | 2
2 | 3 | 1000LB GP M-65 | An-M65 | BOMB | 1000 lb general purpose | 12776
3 | 4 | 1000LB MK-83 | Mark 83 bomb | BOMB | 1000 lb none guidence general purpose bomb | 15522
4 | 5 | 1000LB SAP M59 | AN-M59 | BOMB | 1000 lb semi-armor piercing bomb | 454
5 | 6 | 100LB FR M-IA2 | NaN | BOMB | NaN | 11858
6 | 7 | 100LB GP M-30 | AN-M30 | BOMB | 100 lb general purpose | 4610
7 | 8 | 100LB M-28 | NaN | BOMB | NaN | 1639
8 | 9 | 100LB PWP M-47 | M47 | BOMB | 100 lb chemical bomb | 9970
9 | 10 | 105 HOWITZER AMMO | Howitzer ammo | GUN | 105mm Howitzer ammo | 51
In [19]:
Wpn_Gloss_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 294 entries, 0 to 293
Data columns (total 6 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   weapon_id               294 non-null    int64 
 1   weapontype              294 non-null    object
 2   weapontype_common_name  175 non-null    object
 3   weapon_class            294 non-null    object
 4   weapontype_desc         176 non-null    object
 5   weapon_count            294 non-null    int64 
dtypes: int64(2), object(4)
memory usage: 13.9+ KB
In [20]:
Wpn_Gloss_df.isnull().sum()
Out[20]:
weapon_id                   0
weapontype                  0
weapontype_common_name    119
weapon_class                0
weapontype_desc           118
weapon_count                0
dtype: int64
In [21]:
# Calculate the percentage of null values in each column
(Wpn_Gloss_df.isnull().mean() * 100).round(2)
Out[21]:
weapon_id                  0.00
weapontype                 0.00
weapontype_common_name    40.48
weapon_class               0.00
weapontype_desc           40.14
weapon_count               0.00
dtype: float64
In [22]:
# Convert msndate to datetime format and store as a new field
Bomb_Ops_df['msndatetime'] = pd.to_datetime(Bomb_Ops_df['msndate'], errors='coerce')
In [23]:
# Convert msndate to date format
Bomb_Ops_df['msndate'] = Bomb_Ops_df['msndatetime'].dt.strftime('%Y-%m-%d')
In [24]:
# Create additional time-related column - msnyear
Bomb_Ops_df['msnyear'] = Bomb_Ops_df['msndatetime'].dt.year
In [25]:
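# Create additional time-related column - msnyearmonth
# Note: MonthBegin(0) rolls each date forward to the next month start unless it
# already falls on the 1st, so 1971-06-05 becomes 1971-07-01 (see Out[31])
Bomb_Ops_df['msnyearmonth'] = Bomb_Ops_df['msndatetime'] + pd.offsets.MonthBegin(0)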
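
Adding `pd.offsets.MonthBegin(0)` moves any date that is not already the first of a month forward to the start of the following month, which is why the merged output pairs a mission date of 1971-06-05 with a `msnyearmonth` of 1971-07-01. If the intent is to label each mission with the first day of its own month, converting to a monthly period and back is one alternative. The sketch below compares the two on toy dates; it is a suggestion, not the notebook's method.

```python
import pandas as pd

# Toy mission dates; the last one already falls on the 1st of a month
toy = pd.Series(pd.to_datetime(["1971-06-05", "1970-02-02", "1972-06-01"]))

# Behaviour of the notebook's expression: roll forward to the next month start
rolled_forward = toy + pd.offsets.MonthBegin(0)

# Alternative: truncate each date to the first day of its own month
own_month = toy.dt.to_period("M").dt.to_timestamp()

comparison = pd.DataFrame({
    "msndatetime": toy,
    "rolled_forward": rolled_forward,
    "own_month": own_month,
})
print(comparison)
```

Any monthly aggregation built on the rolled-forward column would be shifted by one month for most rows, so the choice matters when aligning spikes with historical events.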
In [26]:
# Create additional time-related column - msnmonthname
Bomb_Ops_df['msnmonthname'] = Bomb_Ops_df['msndatetime'].dt.strftime('%b')
In [27]:
# Harmonize the operationsupported to new field operation_grp

Bomb_Ops_df['operation_grp'] = Bomb_Ops_df['operationsupported'].str.split(" - |- ").str[0]

# Replace NaN with "UNNAMED"
Bomb_Ops_df['operation_grp'] = Bomb_Ops_df['operation_grp'].fillna("UNNAMED")

# Replace empty strings with "UNNAMED"
Bomb_Ops_df['operation_grp'] = Bomb_Ops_df['operation_grp'].replace("", "UNNAMED")
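
To make the harmonization rule concrete, the sketch below applies the same three steps to a toy series. The operation labels are hypothetical examples of the suffix patterns the split targets; the real column contains many more variants.

```python
import pandas as pd

# Hypothetical labels: a " - " suffix, a "- " suffix, an empty string,
# a missing value, and a name with no suffix
toy_ops = pd.Series(["ROLLING THUNDER - 52", "STEEL TIGER- 3", "", None, "BARREL ROLL"])

# A multi-character pattern is treated as a regular expression, so this
# splits on either " - " or "- " and keeps the part before the first match
grp = toy_ops.str.split(" - |- ").str[0]

# Missing and empty names both fall into a single "UNNAMED" group
grp = grp.fillna("UNNAMED").replace("", "UNNAMED")
print(grp.tolist())
```

The rule only strips suffixes; spelling variants of the same operation name would still land in separate groups.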
In [28]:
# Merge (Left Join) Bomb_Ops_df with Air_Gloss_df 
# on valid_aircraft_root in Bomb_Ops_df and validated_root in Air_Gloss_df

temp_df = pd.merge(Bomb_Ops_df, 
                   Air_Gloss_df, 
                   left_on='valid_aircraft_root', 
                   right_on='validated_root', 
                   how='left')
In [29]:
# Merge (Left Join) the above output with Wpn_Gloss_df on weapontype in Wpn_Gloss_df

bombings_df = pd.merge(temp_df, 
                       Wpn_Gloss_df, 
                       left_on='weapontype', 
                       right_on='weapontype', 
                       how='left')
In [30]:
# Check the data dimensions - number of rows and columns
print(f'Final dataframe: bombings_df, consists of {bombings_df.shape[0]} rows and {bombings_df.shape[1]} columns')
Final dataframe: bombings_df, consists of 4670416 rows and 65 columns
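
The row count staying at 4,670,416 after both left joins suggests that neither glossary duplicated any bombing rows. pandas can enforce that expectation at merge time and also report keys that found no glossary match. The sketch below shows this on toy frames; `validate` and `indicator` are existing `pd.merge` parameters, while the data and the unmatched code `XX-99` are made up.

```python
import pandas as pd

# Toy fact table; XX-99 is a made-up aircraft code with no glossary entry
toy_ops = pd.DataFrame({
    "valid_aircraft_root": ["A-1", "F-4", "F-4", "XX-99"],
    "numweaponsdelivered": [2, 6, 0, 1],
})

# Toy glossary with one row per aircraft root
toy_gloss = pd.DataFrame({
    "validated_root": ["A-1", "F-4"],
    "aircraft_shortname": ["Skyraider", "Phantom II"],
})

merged = pd.merge(
    toy_ops,
    toy_gloss,
    left_on="valid_aircraft_root",
    right_on="validated_root",
    how="left",
    validate="many_to_one",  # raises MergeError if a glossary key repeats
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
)

# A left join against a unique key must preserve the fact-table row count
assert len(merged) == len(toy_ops)

# Aircraft codes that found no glossary entry
unmatched = (
    merged.loc[merged["_merge"] == "left_only", "valid_aircraft_root"]
    .unique()
    .tolist()
)
print(unmatched)
```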
In [31]:
bombings_df.head(10)
Out[31]:
thor_data_viet_id countryflyingmission milservice msndate sourceid sourcerecord valid_aircraft_root takeofflocation tgt_latitude tgt_longitude tgttype numweaponsdelivered timeontarget weapontype weapontypeclass weapontypeweight aircraft_original aircraft_root airforcegroup airforcesqdn callsign flthours mfunc mfunc_desc missionid numofacft operationsupported periodofday unit tgtcloudcover tgtcontrol tgtcountry tgtid tgtorigcoords tgtorigcoordsformat tgtweather additionalinfo geozone id mfunc_desc_class numweaponsjettisoned numweaponsreturned releasealtitude releasefltspeed resultsbda timeofftarget weaponsloadedweight msndatetime msnyear msnyearmonth msnmonthname operation_grp gloss_id validated_root aircraft_name website_link aircraft_type aircraft_shortname aircraft_application ac_mission_count weapon_id weapontype_common_name weapon_class weapontype_desc weapon_count
0 351 UNITED STATES OF AMERICA USAF 1971-06-05 647464 SEADAB EC-47 TAN SON NHUT NaN NaN NaN 0 1005.0 NaN NaN 0 EC47 EC47 NaN NaN STEEL 5 70 34.0 RADIO DIRECT FINDER 2624 1 NaN D 360TEW NaN NaN CAMBODIA NaN NaN NaN NaN UNIT: 360TEW - CALLSIGN: STEEL 5 NaN 27135863 NONKINETIC -1 -1 NaN NaN NaN 1005.0 0 1971-06-05 1971 1971-07-01 Jun UNNAMED 43.0 EC-47 Douglas C-47 Skytrain https://en.wikipedia.org/wiki/Douglas_C-47_Sky... Military Transport aircraft Skytrain TRANSPORT 59034.0 NaN NaN NaN NaN NaN
1 2 UNITED STATES OF AMERICA USAF 1972-12-26 642778 SEADAB EC-47 NAKHON PHANOM NaN NaN NaN 0 530.0 NaN NaN 0 EC47 EC47 NaN NaN BARON 6 0 74.0 EXTRACTION (GPES) 2909 1 NaN D 361TEW NaN NaN SOUTH VIETNAM NaN NaN NaN NaN UNIT: 361TEW - CALLSIGN: BARON 6 NaN 27131177 NONKINETIC -1 -1 NaN NaN NaN 530.0 0 1972-12-26 1972 1973-01-01 Dec UNNAMED 43.0 EC-47 Douglas C-47 Skytrain https://en.wikipedia.org/wiki/Douglas_C-47_Sky... Military Transport aircraft Skytrain TRANSPORT 59034.0 NaN NaN NaN NaN NaN
2 3 UNITED STATES OF AMERICA USAF 1973-07-28 642779 SEADAB RF-4 UDORN AB NaN NaN NaN 0 730.0 NaN NaN 0 RF4 RF4 NaN NaN ATLANTA 30 18.0 VISUAL RECCE 3059 1 NaN D 432TRW NaN NaN LAOS NaN NaN NaN NaN UNIT: 432TRW - CALLSIGN: ATLANTA NaN 27131178 NONKINETIC -1 -1 NaN NaN NaN 730.0 0 1973-07-28 1973 1973-08-01 Jul UNNAMED 85.0 RF-4 McDonnell F-4 Phantom II https://en.wikipedia.org/wiki/McDonnell_Dougla... Fighter bomber jet Phantom II FIGHTER, BOMBER 243259.0 NaN NaN NaN NaN NaN
3 4 UNITED STATES OF AMERICA USAF 1970-02-02 642780 SEADAB A-1 NAKHON PHANOM 16.902500 106.014166 TRUCKS 2 1415.0 BLU27 FIRE BOMB (750) NaN 750 A1 A1 NaN NaN FF32 68 1.0 STRIKE 1047 2 NaN N 56SOW NaN NaN LAOS NaN 165409N1060051E DDMMSSN DDDMMSSE NaN UNIT: 56SOW - CALLSIGN: FF32 XE 27131179 KINETIC -1 -1 NaN NaN SECONDARY FIRE 1415.0 17400 1970-02-02 1970 1970-03-01 Feb UNNAMED 1.0 A-1 Douglas A-1 Skyraider http://www.navalaviationmuseum.org/attractions... Fighter Jet Skyraider FIGHTER 373265.0 76.0 BLU-27/B BOMB "(750 lb) class fire bombs was very similar to... 8633.0
4 5 VIETNAM (SOUTH) VNAF 1970-10-08 642781 SEADAB A-37 DANANG 14.945555 108.257222 BASE CAMP AREA 0 1240.0 NaN NaN 0 A37 A37 NaN NaN TIGER 41 28 5.0 CLOSE AIR SUPPORT B542 2 NaN D 516FS NaN NaN SOUTH VIETNAM NaN 145644N1081526E DDMMSSN DDDMMSSE NaN UNIT: 516FS - CALLSIGN: TIGER 41 ZB 27131180 KINETIC -1 -1 NaN NaN RNO WEATHER 1240.0 0 1970-10-08 1970 1970-11-01 Oct UNNAMED 4.0 A-37 Cessna A-37 Dragonfly http://www.militaryfactory.com/aircraft/detail... Light ground-attack aircraft Dragonfly ATTACK 282699.0 NaN NaN NaN NaN NaN
5 6 UNITED STATES OF AMERICA USAF 1970-11-25 642782 SEADAB F-4 UBON AB 19.602222 103.597222 AAA\37MM CR MORE 6 650.0 MK 82 GP BOMB (500) LD NaN 500 F4 F4 NaN NaN JASPER 57 1.0 STRIKE 1407 2 NaN D 8TFW NaN NaN LAOS NaN 193608N1033550E DDMMSSN DDDMMSSE NaN UNIT: 8TFW - CALLSIGN: JASPER UG 27131181 KINETIC -1 -1 NaN NaN DAMAGED 650.0 31860 1970-11-25 1970 1970-12-01 Nov UNNAMED 54.0 F-4 McDonnell Douglas F-4 Phantom II https://en.wikipedia.org/wiki/McDonnell_Dougla... Fighter Jet Bomber Phantom II FIGHTER, BOMBER 957427.0 205.0 MK 82 BOMB "free-fall, nonguided general purpose (GP) 500... 62921.0
6 7 UNITED STATES OF AMERICA USN 1972-03-08 642783 SEADAB A-4 TONKIN GULF 14.573611 106.689722 TRUCKS 0 1005.0 NaN NaN 0 A4 A4 NaN NaN CD H 16 1.0 STRIKE 9064 2 NaN D 775CTG NaN NaN LAOS NaN 143425N1064123E DDMMSSN DDDMMSSE NaN UNIT: 775CTG - CALLSIGN: CD H XB 27131182 KINETIC -1 -1 NaN NaN RNO NONVISUAL 1005.0 0 1972-03-08 1972 1972-04-01 Mar UNNAMED 5.0 A-4 McDonnell Douglas A-4 Skyhawk http://www.fighter-planes.com/info/a4-skyhawk.htm Fighter Jet Skyhawk FIGHTER 390290.0 NaN NaN NaN NaN NaN
7 8 UNITED STATES OF AMERICA USAF 1971-12-27 642784 SEADAB F-4 UDORN AB NaN NaN NaN 0 0.0 NaN NaN 0 F4 F4 NaN NaN FALCON80 0 NaN NaN 7661 2 NaN NaN 432TRW NaN NaN LAOS NaN NaN NaN NaN UNIT: 432TRW - CALLSIGN: FALCON80 NaN 27131183 NONKINETIC -1 -1 NaN NaN NaN 0.0 0 1971-12-27 1971 1972-01-01 Dec UNNAMED 54.0 F-4 McDonnell Douglas F-4 Phantom II https://en.wikipedia.org/wiki/McDonnell_Dougla... Fighter Jet Bomber Phantom II FIGHTER, BOMBER 957427.0 NaN NaN NaN NaN NaN
8 9 UNITED STATES OF AMERICA USN 1972-05-24 642785 SEADAB A-7 TONKIN GULF NaN NaN NaN 0 0.0 NaN NaN 0 A7 A7 NaN NaN CD CS 0 NaN NaN 9205 4 NaN NaN 776CTG NaN NaN NORTH VIETNAM NaN NaN NaN NaN UNIT: 776CTG - CALLSIGN: CD CS NaN 27131184 NONKINETIC -1 -1 NaN NaN NaN 0.0 0 1972-05-24 1972 1972-06-01 May UNNAMED 8.0 A-7 LTV A-7 Corsair II http://www.militaryfactory.com/aircraft/detail... Attack Aircraft Corsair II ATTACK 171983.0 NaN NaN NaN NaN NaN
9 10 UNITED STATES OF AMERICA USAF 1972-09-12 642786 SEADAB EC-47 TAN SON NHUT NaN NaN NaN 0 710.0 NaN NaN 0 EC47 EC47 NaN NaN LEGMAN59 70 34.0 RADIO DIRECT FINDER 2618 1 NaN D 360TEW NaN NaN SOUTH VIETNAM NaN NaN NaN NaN UNIT: 360TEW - CALLSIGN: LEGMAN59 NaN 27131185 NONKINETIC -1 -1 NaN NaN NaN 710.0 0 1972-09-12 1972 1972-10-01 Sep UNNAMED 43.0 EC-47 Douglas C-47 Skytrain https://en.wikipedia.org/wiki/Douglas_C-47_Sky... Military Transport aircraft Skytrain TRANSPORT 59034.0 NaN NaN NaN NaN NaN
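
The sample rows above carry both the original coordinate string (`tgtorigcoords`, for example `165409N1060051E` in the `DDMMSSN DDDMMSSE` format) and the decimal `tgt_latitude` / `tgt_longitude` values used later for mapping. As a minimal sketch of how the two relate, the hypothetical helper `dms_to_decimal` below (not part of the notebook or of THOR) converts a degrees-minutes-seconds string to decimal degrees. It assumes every string follows that single layout.

```python
import re
from typing import Tuple

# Layout assumed from the tgtorigcoordsformat column: DDMMSS + N/S, then DDDMMSS + E/W.
_DMS_PATTERN = re.compile(r"^(\d{2})(\d{2})(\d{2})([NS])(\d{3})(\d{2})(\d{2})([EW])$")


def dms_to_decimal(coords: str) -> Tuple[float, float]:
    """Convert a string such as '165409N1060051E' to (latitude, longitude)."""
    match = _DMS_PATTERN.match(coords.strip())
    if match is None:
        raise ValueError(f"Unrecognized coordinate string: {coords!r}")
    lat_d, lat_m, lat_s, lat_h, lon_d, lon_m, lon_s, lon_h = match.groups()
    # degrees + minutes/60 + seconds/3600
    lat = int(lat_d) + int(lat_m) / 60 + int(lat_s) / 3600
    lon = int(lon_d) + int(lon_m) / 60 + int(lon_s) / 3600
    # Southern and western hemispheres are negative by convention.
    if lat_h == "S":
        lat = -lat
    if lon_h == "W":
        lon = -lon
    return lat, lon


lat, lon = dms_to_decimal("165409N1060051E")
print(round(lat, 4), round(lon, 4))
```

For the A-1 strike row shown above, the result agrees with the stored `tgt_latitude` of 16.902500 and `tgt_longitude` of 106.014166 up to rounding.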
In [32]:
bombings_df.shape
Out[32]:
(4670416, 65)
In [33]:
bombings_df.isnull().sum()
Out[33]:
thor_data_viet_id               0
countryflyingmission         3615
milservice                   3249
msndate                         0
sourceid                        0
sourcerecord                    0
valid_aircraft_root             0
takeofflocation              4971
tgt_latitude              1130131
tgt_longitude             1130131
tgttype                   1830425
numweaponsdelivered             0
timeontarget                26429
weapontype                2403497
weapontypeclass           4670416
weapontypeweight                0
aircraft_original             482
aircraft_root                 482
airforcegroup             4667508
airforcesqdn              4667634
callsign                  3300321
flthours                        0
mfunc                      101111
mfunc_desc                 104722
missionid                   15670
numofacft                       0
operationsupported        1920049
periodofday                199764
unit                          493
tgtcloudcover             2440248
tgtcontrol                2150583
tgtcountry                 216774
tgtid                     4670381
tgtorigcoords             1068209
tgtorigcoordsformat       1092870
tgtweather                2592711
additionalinfo                  0
geozone                   1168052
id                              0
mfunc_desc_class                0
numweaponsjettisoned            0
numweaponsreturned              0
releasealtitude           4667038
releasefltspeed           4668727
resultsbda                4385020
timeofftarget               26429
weaponsloadedweight             0
msndatetime                     0
msnyear                         0
msnyearmonth                    0
msnmonthname                    0
operation_grp                   0
gloss_id                    52734
validated_root              52734
aircraft_name               52734
website_link                52734
aircraft_type               52734
aircraft_shortname         105435
aircraft_application        52757
ac_mission_count            52734
weapon_id                 2403700
weapontype_common_name    3206777
weapon_class              2403700
weapontype_desc           3146824
weapon_count              2403700
dtype: int64
In [34]:
# Calculate the percentage of null values in each column
(bombings_df.isnull().mean() * 100).round(2)
Out[34]:
thor_data_viet_id           0.00
countryflyingmission        0.08
milservice                  0.07
msndate                     0.00
sourceid                    0.00
sourcerecord                0.00
valid_aircraft_root         0.00
takeofflocation             0.11
tgt_latitude               24.20
tgt_longitude              24.20
tgttype                    39.19
numweaponsdelivered         0.00
timeontarget                0.57
weapontype                 51.46
weapontypeclass           100.00
weapontypeweight            0.00
aircraft_original           0.01
aircraft_root               0.01
airforcegroup              99.94
airforcesqdn               99.94
callsign                   70.66
flthours                    0.00
mfunc                       2.16
mfunc_desc                  2.24
missionid                   0.34
numofacft                   0.00
operationsupported         41.11
periodofday                 4.28
unit                        0.01
tgtcloudcover              52.25
tgtcontrol                 46.05
tgtcountry                  4.64
tgtid                     100.00
tgtorigcoords              22.87
tgtorigcoordsformat        23.40
tgtweather                 55.51
additionalinfo              0.00
geozone                    25.01
id                          0.00
mfunc_desc_class            0.00
numweaponsjettisoned        0.00
numweaponsreturned          0.00
releasealtitude            99.93
releasefltspeed            99.96
resultsbda                 93.89
timeofftarget               0.57
weaponsloadedweight         0.00
msndatetime                 0.00
msnyear                     0.00
msnyearmonth                0.00
msnmonthname                0.00
operation_grp               0.00
gloss_id                    1.13
validated_root              1.13
aircraft_name               1.13
website_link                1.13
aircraft_type               1.13
aircraft_shortname          2.26
aircraft_application        1.13
ac_mission_count            1.13
weapon_id                  51.47
weapontype_common_name     68.66
weapon_class               51.47
weapontype_desc            67.38
weapon_count               51.47
dtype: float64
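
The percentages above show that several columns (`weapontypeclass`, `tgtid`, `airforcegroup`, `airforcesqdn`, `releasealtitude`, `releasefltspeed`) are empty or nearly empty. A minimal sketch of how such columns could be listed programmatically follows. The helper name `sparse_columns`, the threshold, and the toy frame are illustrative assumptions, not part of the original analysis.

```python
import pandas as pd


def sparse_columns(df: pd.DataFrame, threshold: float = 0.9) -> pd.Series:
    """Return the null share of every column whose share is >= threshold."""
    null_share = df.isnull().mean()
    return null_share[null_share >= threshold].sort_values(ascending=False)


# Toy frame standing in for bombings_df (values are illustrative only).
demo = pd.DataFrame({
    "msndate": ["1970-02-02", "1970-10-08", "1970-11-25", "1971-12-27"],
    "tgtid": [None, None, None, None],
    "releasealtitude": [None, None, None, 12000.0],
    "tgt_latitude": [16.9025, 14.945555, 19.602222, None],
})

sparse = sparse_columns(demo, threshold=0.7)
print(sparse)
```

On the real data, `sparse_columns(bombings_df)` would be one way to decide which columns to leave out of later visualizations.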

6. Answers via Visualizations¶

Delving into the complex trajectory of the Vietnam War, this section employs visualizations to unravel critical facets of the conflict.

6.1 Geographical Distribution: How is the bombing activity distributed geographically across Vietnam?¶

Explore the spatial dynamics of bombing activities across Vietnam. An animated scatter map plotted over Vietnam and neighboring regions shows how the patterns shift over the war period, revealing nuanced insights into the strategic choices made.

The animation in this chart also addresses the depth question 6.1.1 Geographical Distribution: How does the geographical distribution change over the war period?

In [35]:
# Create the subset of columns for the map visual
subset_columns = ['msndate', 'tgt_latitude', 'tgt_longitude']
map_data = bombings_df[subset_columns]
map_data = map_data.sort_values(["msndate"])
map_data['size'] = 1  # constant marker size for every point
map_data = map_data.iloc[:50000]  # keep the 50,000 earliest records

# Map plot using Plotly
fig = px.scatter_mapbox(map_data, 
                        lat="tgt_latitude", 
                        lon="tgt_longitude", 
                        animation_frame="msndate",
                        animation_group="tgt_longitude", 
                        color_continuous_scale=px.colors.sequential.Hot, 
                        color_discrete_sequence=["red"],
                        zoom = 4,
                        size = 'size',
                        size_max = 3,
                        labels={'msndate':'Mission Date'})
fig.update_layout(mapbox_style="carto-positron")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.update_layout(autosize=False, width=800, height=600)
print(color.BOLD + 'Geographical Distribution of Strikes:' + color.END)
print('Use Play button at bottom to visualize the strikes. Reload the page if map does not respond.')
fig.show()
Geographical Distribution of Strikes:
Use Play button at bottom to visualize the strikes. Reload the page if map does not respond.

Narrative:¶

As we begin our journey, we dive into the geography of Vietnam. The animated scatter map paints a vivid picture of the intense bombing campaigns, revealing how different regions bore the brunt of the conflict. Watch as the red dots unfold, capturing the ebb and flow of strategic decisions across the Vietnamese landscape.

Design Choices:¶

Geographical Distribution of Strikes

  • Visualization Type: Animated scatter map.
  • Color Encoding: Red dots represent the area where bombs are dropped.
  • Animation Frame: Mission dates for temporal progression.
  • Mark Size: Small constant size for all scatter points (size_max of 3).
  • Label Choices: Minimal labels, focusing on geographical elements.
  • Color Scale: Hot color scale is passed to the plot; with no color column mapped, the points render in the single red color.
  • Map Style: Carto-positron for clarity.
  • Size: 800x600 pixels for an optimal balance between clarity and detail.
  • Play/Pause and Stop Button
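
The map cell keeps the first 50,000 rows after sorting by date and uses every distinct mission date as an animation frame. As a hedged alternative sketch, the hypothetical helper `prepare_map_frames` below drops records without coordinates, groups frames by month, and samples across the whole period. The helper name, the monthly granularity, and the sampling seed are assumptions for illustration, not the authors' method.

```python
import pandas as pd


def prepare_map_frames(df: pd.DataFrame, max_points: int = 50000, seed: int = 0) -> pd.DataFrame:
    """Keep located strikes, add a 'YYYY-MM' frame label, and cap the row count."""
    located = df.dropna(subset=["tgt_latitude", "tgt_longitude"]).copy()
    located["frame"] = pd.to_datetime(located["msndate"]).dt.strftime("%Y-%m")
    if len(located) > max_points:
        # Random sample across the whole period instead of the earliest rows.
        located = located.sample(n=max_points, random_state=seed)
    # 'YYYY-MM' strings sort chronologically, so frames play in order.
    return located.sort_values("frame")


# Toy rows standing in for bombings_df (coordinates taken from the sample output).
demo = pd.DataFrame({
    "msndate": ["1970-02-02", "1970-10-08", "1970-11-25", "1971-12-27"],
    "tgt_latitude": [16.9025, 14.945555, 19.602222, None],
    "tgt_longitude": [106.014166, 108.257222, 103.597222, None],
})

frames = prepare_map_frames(demo)
print(frames["frame"].tolist())
```

The resulting frame could then be passed to `px.scatter_mapbox` with `animation_frame="frame"` in place of `"msndate"`, which reduces the number of animation steps to one per month.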

6.2 Temporal Patterns: What is the timeline of bombing activities throughout the Vietnam War?¶

Uncover the bombing intensity patterns through a chronological lens. Identify spikes, lulls, and correlations with historical events, offering a temporal perspective on the war's progression.

In [36]:
# Count records per mission year (the counts sum to the 4,670,416 rows of bombings_df)
strikes_per_year = bombings_df.groupby(['msnyear']).agg({'thor_data_viet_id':'count',
                                                         'id':'count','sourceid':'count'}).reset_index()
In [37]:
strikes_per_year
Out[37]:
msnyear thor_data_viet_id id sourceid
0 1965 70475 70475 70475
1 1966 412202 412202 412202
2 1967 593838 593838 593838
3 1968 778803 778803 778803
4 1969 539349 539349 539349
5 1970 806141 806141 806141
6 1971 485192 485192 485192
7 1972 603243 603243 603243
8 1973 248143 248143 248143
9 1974 104702 104702 104702
10 1975 28328 28328 28328
In [38]:
# Create a bar graph using Plotly Express
fig = px.bar(strikes_per_year, 
             x='msnyear', 
             y='thor_data_viet_id', 
             title='Bombing Strikes each Year', 
             labels={'Value': 'Count'})

fig.update_xaxes(tickmode='linear')
fig.update_layout(xaxis_title='Year', yaxis_title='Number of Strikes')

# Annotate each bar with its count
for i, row in strikes_per_year.iterrows():
    fig.add_annotation(
        x=row['msnyear'],
        y=row['thor_data_viet_id'],
        text=str(row['thor_data_viet_id']),
        showarrow=True,
        arrowhead=2,
        arrowsize=1,
        arrowwidth=2,
        arrowcolor="#636363",
        ax=0,
        ay=-40,
    )

# Show the plot
fig.show()

Narrative:¶

Moving through time, our visualizations provide a unique lens into the temporal patterns of bombing activities. The bar graphs present a bird's-eye view of the overall bombing trends. The line charts delve deeper, unraveling the stories behind each spike and lull, drawing connections to key historical events that shaped the trajectory of the war.

Design Choices:¶

Bar Graph for Yearly Bombing Strikes

  • Visualization Type: Bar graph.
  • Color Encoding: No color encoding, simple representation.
  • Annotations: Label on each bar for precise counts.
  • Axes: Linear tick mode for clarity.
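
To make the spikes and lulls in the yearly counts explicit, the year-over-year change can be computed directly from the table above. This is a minimal sketch in plain Python that reuses the counts from `strikes_per_year`. The variable names are illustrative.

```python
# Yearly record counts copied from the strikes_per_year output above.
strikes = {
    1965: 70475, 1966: 412202, 1967: 593838, 1968: 778803,
    1969: 539349, 1970: 806141, 1971: 485192, 1972: 603243,
    1973: 248143, 1974: 104702, 1975: 28328,
}

years = sorted(strikes)

# Percentage change relative to the previous year.
yoy_change = {
    year: (strikes[year] - strikes[year - 1]) / strikes[year - 1] * 100
    for year in years[1:]
}

peak_year = max(strikes, key=strikes.get)
steepest_drop_year = min(yoy_change, key=yoy_change.get)

for year in years[1:]:
    print(f"{year}: {yoy_change[year]:+.1f}%")
print("Peak year:", peak_year)
print("Steepest relative decline:", steepest_drop_year)
```

The same percentages could be attached to the bar annotations if the relative change matters more than the raw count.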

6.2.1 Temporal Patterns: Are there notable spikes or lulls in bombing intensity during specific periods by countries involved?¶

Uncover patterns that highlight distinct periods of heightened or reduced activity, focusing on the specific countries involved. This exploration aims to provide a detailed understanding of how bombing intensity fluctuated over time.

In [39]:
# Group the data by year, date, and country, and count the unique records (thor_data_viet_id) in each group

missions_by_date_country = bombings_df.groupby(['msnyear', 
                                                'msndate', 
                                                'countryflyingmission'])['thor_data_viet_id'].nunique().reset_index()
missions_by_date_country.columns = ['year', 'date', 'country', 'Number of Missions']
In [40]:
# Create a custom color mapping
color_scale = px.colors.qualitative.Set1  # You can choose a different color scale
country_colors = dict(zip(missions_by_date_country['country'].unique(), color_scale))
In [41]:
# Create a line chart using Plotly with multiple lines for each country
fig = px.line(missions_by_date_country, x='date', y='Number of Missions', color='country',
              title='Number of Missions over Time by Country',
              labels={'date': 'Date', 'Number of Missions': 'Number of Missions'},
              color_discrete_map=country_colors)

# Update legend position
fig.update_layout(legend=dict(
    yanchor="top",
    y=1,
    xanchor="right",
    x=1
))

# Show the chart
fig.show()

Narrative:¶

Zooming in on specific countries, our visualizations reveal the heartbeat of the war. The interactive line charts allow us to dissect notable spikes and lulls, connecting each rhythm to the unique strategies employed by different nations during critical junctures of the Vietnam War.

Design Choices:¶

Line Chart for Number of Missions Over Time (By Country)

  • Visualization Type: Line chart.
  • Color Encoding: Different colors for each country.
  • Legend: Positioned at the top right for clear association.
  • Labels: Clear labels for the axes and title.
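
A daily series with one point per date and country can be dense enough to hide broader spikes and lulls. As a hedged sketch, the hypothetical helper `monthly_missions` below aggregates a frame shaped like `missions_by_date_country` to monthly totals. The helper name and the toy values are assumptions for illustration.

```python
import pandas as pd


def monthly_missions(daily: pd.DataFrame) -> pd.DataFrame:
    """Sum 'Number of Missions' per country and calendar month."""
    out = daily.copy()
    out["date"] = pd.to_datetime(out["date"])
    return (
        out.groupby(["country", pd.Grouper(key="date", freq="MS")])["Number of Missions"]
        .sum()
        .reset_index()
    )


# Toy rows with the same columns as missions_by_date_country (values are made up).
demo = pd.DataFrame({
    "country": ["UNITED STATES OF AMERICA"] * 3 + ["VIETNAM (SOUTH)"],
    "date": ["1970-01-05", "1970-01-20", "1970-02-01", "1970-01-10"],
    "Number of Missions": [10, 5, 7, 3],
})

monthly = monthly_missions(demo)
print(monthly)
```

The monthly frame keeps the `date`, `country`, and `Number of Missions` columns, so it can be passed to the same `px.line` call.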

6.2.2 Temporal Patterns: How have allied countries participated in the bombing activities over the period?¶

Explore the collaborative dynamics of allied nations in the context of bombing activities throughout the Vietnam War. This analysis involves isolating the contributions of allied countries, excluding the USA, to reveal their individual patterns of involvement over the war period.

In [42]:
# Filter out USA
missions_by_date_country_allies = missions_by_date_country[missions_by_date_country['country'] != 'UNITED STATES OF AMERICA']

# Create a line chart using Plotly with multiple lines for each country
fig_allies = px.line(missions_by_date_country_allies, x='date', y='Number of Missions', color='country',
                     title='Number of Missions over Time by US Allies',
                     labels={'date': 'Date', 'Number of Missions': 'Number of Missions'},
                     color_discrete_map=country_colors)  # Use the same color mapping

# Update legend position
fig_allies.update_layout(legend=dict(
    yanchor="top",
    y=1,
    xanchor="right",
    x=1
))

# Show the chart for US Allies
fig_allies.show()

Narrative:¶

Allies played a crucial role in shaping the course of the conflict. By isolating their contributions, our line chart offers a nuanced view of how different nations, working in tandem, influenced the ebb and flow of bombing campaigns. Witness the collaboration and individual strategies of these allied forces.

Design Choices:¶

Line Chart for Allied Countries (Excluding USA)

  • Visualization Type: Line chart.
  • Color Encoding: Different colors for each ally.
  • Legend: Positioned at the top right for clear association.
  • Labels: Clear labels for the axes and title.

6.2.3 Temporal Patterns: How do temporal patterns correlate with major historical events?¶

Examine the intricate interplay between temporal patterns in bombing activities and major historical events during the Vietnam War. Through an interactive line chart encompassing all involved countries, this exploration allows users to discern correlations between spikes or lulls in bombing intensity and significant historical occurrences.

The addition of interactive elements, such as a dropdown for selecting specific years and the ability to click on country legends for individual focus, enhances the depth of analysis, enabling a more personalized exploration of the data's temporal narrative.

In [43]:
# Create traces for each country
traces = []
for country in missions_by_date_country['country'].unique():
    trace = go.Scatter(
        x=missions_by_date_country[missions_by_date_country['country'] == country]['date'],
        y=missions_by_date_country[missions_by_date_country['country'] == country]['Number of Missions'],
        mode='lines',
        name=country,
        line=dict(color=country_colors[country])
    )
    traces.append(trace)

# Create the layout
layout = go.Layout(
    title='Number of Missions over Time by Country',
    xaxis=dict(title='Date'),
    yaxis=dict(title='Number of Missions'),
    legend=dict(title=dict(text='Country')),  # Add a title to the legend
    annotations=[
        dict(
            text='Select Year:',  # Name for the dropdown
            x=0.89,  # Adjust the position of the dropdown name
            xref='paper',  # Set the x coordinate to be a fraction of the entire plot
            y=1.08,  # Adjust the position of the dropdown name
            yref='paper',  # Set the y coordinate to be a fraction of the entire plot
            showarrow=False,
        )
    ],
    updatemenus=[
        dict(
            type='dropdown',
            showactive=False,
            buttons=[
                dict(label='All',
                     method='relayout',
                     args=['xaxis.range', [missions_by_date_country['date'].min(), missions_by_date_country['date'].max()]]),
                *[
                    dict(label=str(year),
                         method='relayout',
                         args=[{'xaxis.range': [missions_by_date_country[missions_by_date_country['year'] == year]['date'].min(),
                                                missions_by_date_country[missions_by_date_country['year'] == year]['date'].max()]}])
                    for year in range(1965, 1976)
                ]
            ],
            direction="down",
            x=0.89,  # Adjust the position of the dropdown menu
            xanchor='left',  # Set the anchor point for the x position
            y=1.1,  # Adjust the position of the dropdown menu
            yanchor='top'  # Set the anchor point for the y position
        ),
    ]
)

# Create the figure
fig = go.Figure(data=traces, layout=layout)

print('Interactive Line Chart: Use "Select Year" dropdown on right top corner.')
print('Interactive Line Chart: Click on Country legend to filter that country in plot.')

# Show the chart
fig.show()
Interactive Line Chart: Use "Select Year" dropdown on right top corner.
Interactive Line Chart: Click on Country legend to filter that country in plot.

Narrative:¶

The interweaving of temporal patterns with historical events is a captivating aspect of our exploration. The interactive line chart provides a dynamic canvas where users can select specific years and countries, unveiling the complex dance between historical shifts and the strategic choices reflected in the intensity of bombing campaigns.

Design Choices:¶

Interactive Line Chart for Number of Missions Over Time (By Country)

  • Visualization Type: Interactive Line chart.
  • Color Encoding: Different colors for each country.
  • Interaction: Dropdown menu for selecting specific years, enabling a focused exploration of temporal patterns.
  • Interaction Enhancement: Users can click on country legends to filter and emphasize individual countries for detailed analysis.
  • Legend: Positioned at the top right for clear association, facilitating user understanding.
  • User Guidance: Include tooltips or a brief guide to inform users about the interactive features, ensuring a user-friendly experience.
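
To help readers relate the curves to historical events, event dates can be drawn on the chart as vertical markers. The sketch below only builds the Plotly layout dictionaries for such markers in plain Python. The helper `event_markers` and the event list are illustrative assumptions: the three dates are well-known milestones chosen for this example, not events taken from the notebook.

```python
# Illustrative event list (an assumption for this sketch, not part of the dataset).
EVENTS = [
    ("1968-01-30", "Tet Offensive begins"),
    ("1972-12-18", "Operation Linebacker II begins"),
    ("1973-01-27", "Paris Peace Accords signed"),
]


def event_markers(events):
    """Build Plotly layout shapes (vertical lines) and annotations (labels)."""
    shapes, annotations = [], []
    for date, label in events:
        # Vertical line spanning the full plot height at the event date.
        shapes.append(dict(
            type="line", xref="x", yref="paper",
            x0=date, x1=date, y0=0, y1=1,
            line=dict(color="#636363", width=1, dash="dot"),
        ))
        # Rotated text label placed at the top of the line.
        annotations.append(dict(
            x=date, y=1, xref="x", yref="paper",
            text=label, showarrow=False, textangle=-90,
            xanchor="right", yanchor="top",
        ))
    return shapes, annotations


shapes, annotations = event_markers(EVENTS)
print(len(shapes), "lines,", len(annotations), "labels")
```

Each dictionary can be applied to an existing figure with `fig.add_shape(**shape)` and `fig.add_annotation(**annotation)`, which leaves the "Select Year:" annotation already in the layout untouched.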

6.3 Tonnage: What is the tonnage of bombs dropped throughout the war?¶

Visualize the sheer magnitude of bombs dropped, discerning periods of escalation or de-escalation. Connect tonnage data with overarching military strategies, illuminating the strategic shifts in the conflict.

In [44]:
# Copy the subset so a new column can be added without a SettingWithCopyWarning
total_tonnage_by_target = bombings_df[['weapontypeweight','numweaponsdelivered','tgtcountry']].copy()
In [45]:
total_tonnage_by_target['Total Tonnage'] = total_tonnage_by_target['weapontypeweight'] * total_tonnage_by_target['numweaponsdelivered']
In [46]:
tonnage_bombings_at_target = total_tonnage_by_target.groupby(['tgtcountry']).agg({'Total Tonnage':'sum'}).reset_index()
In [47]:
tonnage_bombings_at_target
Out[47]:
tgtcountry Total Tonnage
0 CAMBODIA 1406510926
1 LAOS 5383460570
2 NORTH VIETNAM 1479669079
3 PHILLIPINES 0
4 SOUTH VIETNAM 7996763648
5 THAILAND 2195963
6 UNKNOWN 79343
7 WESTPAC WATERS 22651
In [48]:
# Create a bubble chart using Plotly Express
fig = px.scatter(tonnage_bombings_at_target, 
                 x='tgtcountry', 
                 y='Total Tonnage', 
                 size=tonnage_bombings_at_target['Total Tonnage'],
                 title='Bombing Tonnage (Bubble Chart)',
                 labels={'Total Tonnage': 'Tonnage'},
                 size_max=145,  # Adjust the size of bubbles as needed
                 color_discrete_sequence=['red'])

fig.update_xaxes(tickmode='linear')
fig.update_layout(xaxis_title='Target Country', yaxis_title='Tonnage')

# Add annotations
for i, row in tonnage_bombings_at_target.iterrows():
    fig.add_annotation(
        x=row['tgtcountry'],
        y=row['Total Tonnage'],
        text=f"{row['Total Tonnage']/1000000000:.2f} B",  # Show tonnage in billions with two decimals
        showarrow=False
    )

# Show the plot
fig.show()

Narrative:¶

The weight of the war is brought to light as we delve into the tonnage of bombs dropped. The bubble chart provides a compelling visualization, allowing us to grasp the scale of military operations. Each bubble represents the total tonnage delivered on one target country, showing where the bombing was concentrated: South Vietnam and Laos received the largest totals, followed by North Vietnam and Cambodia.

Design Choices:¶

Bubble Chart for Tonnage at Each Target Country

  • Visualization Type: Bubble chart.
  • Color Encoding: Red bubbles for clarity.
  • Size Encoding: Tonnage represented by bubble size.
  • Annotations: Display tonnage values on each bubble.
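
The "Total Tonnage" values are the product of `weapontypeweight` and `numweaponsdelivered`. The sample rows suggest that `weapontypeweight` is recorded in pounds (750 for the "(750 lb) class" BLU-27 fire bomb, 500 for the MK 82). Under that assumption, which the notebook does not state explicitly, the totals are in pounds and can be converted to short tons as sketched below.

```python
# Assumption: the weight columns are in pounds; 1 short ton = 2,000 lb.
POUNDS_PER_SHORT_TON = 2000

# Totals copied from the tonnage_bombings_at_target output above.
totals_lb = {
    "CAMBODIA": 1406510926,
    "LAOS": 5383460570,
    "NORTH VIETNAM": 1479669079,
    "PHILLIPINES": 0,
    "SOUTH VIETNAM": 7996763648,
    "THAILAND": 2195963,
    "UNKNOWN": 79343,
    "WESTPAC WATERS": 22651,
}

totals_short_tons = {
    country: pounds / POUNDS_PER_SHORT_TON for country, pounds in totals_lb.items()
}

# Print the countries from largest to smallest total.
for country, tons in sorted(totals_short_tons.items(), key=lambda kv: -kv[1]):
    print(f"{country:15s} {tons / 1e6:6.2f} million short tons")
```

If the pound assumption holds, labeling the axis in short tons (or relabeling it as pounds) would keep the chart's units unambiguous.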

6.3.1 Tonnage: Explore the Tonnage of Bombs dropped over the years.¶

Dive into the temporal dynamics of tonnage, examining how the overall tonnage of bombs dropped evolved throughout the Vietnam War. Uncover patterns, identifying periods of notable escalation or de-escalation, and consider the correlation with key historical events or shifts in military strategies.

In [49]:
# Copy the subset so a new column can be added without a SettingWithCopyWarning
total_tonnage_by_year = bombings_df[['weapontypeweight','numweaponsdelivered','msnyear']].copy()
In [50]:
total_tonnage_by_year['total_tonnage'] = total_tonnage_by_year['weapontypeweight'] * total_tonnage_by_year['numweaponsdelivered']
In [51]:
tonnage_bombings = total_tonnage_by_year.groupby(['msnyear']).agg({'total_tonnage':'sum'}).reset_index()
In [52]:
tonnage_bombings
Out[52]:
msnyear total_tonnage
0 1965 194602627
1 1966 791268177
2 1967 1416718000
3 1968 2393304314
4 1969 2194342707
5 1970 2924754828
6 1971 2240960399
7 1972 3012978078
8 1973 1176865061
9 1974 63390694
10 1975 25206037
In [53]:
# Create a bar graph using Plotly Express
fig = px.bar(tonnage_bombings, 
             x='msnyear', 
             y='total_tonnage', 
             title='Bombing Tonnage each Year', 
             labels={'Value': 'Count'})

fig.update_xaxes(tickmode='linear')
fig.update_layout(xaxis_title='Year', yaxis_title='Tonnage')

# Annotate each bar with its tonnage
for i, row in tonnage_bombings.iterrows():
    fig.add_annotation(
        x=row['msnyear'],
        y=row['total_tonnage'],
        text=f"{row['total_tonnage']}",
        showarrow=True,
        arrowhead=2,
        arrowsize=1,
        arrowwidth=2,
        arrowcolor="#636363",
        ax=0,
        ay=-40,
    )

# Show the plot
fig.show()

Narrative:¶

Our journey through tonnage continues with a detailed examination of its temporal evolution. The bar graph serves as a timeline, showcasing how the weight of bombings shifted across the war years. Explore key moments of escalation and de-escalation, connecting these fluctuations with historical events that shaped the strategic landscape.

Design Choices:¶

Bar Graph for Yearly Tonnage of Bombs Dropped

  • Visualization Type: Bar graph.
  • Annotations: Label on each bar for precise tonnage counts.
  • Axes: Linear tick mode for clarity.
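
The yearly totals above sum to 16,434,390,922, while the per-country totals in section 6.3 sum to 16,268,702,180, a gap of about 166 million. One plausible explanation is that `groupby` drops rows whose key is null by default, and `tgtcountry` is null for 4.64% of records while `msnyear` is never null. The sketch below reproduces the effect on toy data. The `dropna=False` option assumes pandas 1.1 or later.

```python
import pandas as pd

# Toy data: one record has no target country (values are made up).
demo = pd.DataFrame({
    "tgtcountry": ["LAOS", None, "LAOS", "SOUTH VIETNAM"],
    "total_tonnage": [1500, 3000, 500, 2000],
})

# Default behaviour: the row with a null key silently disappears from the result.
default_total = demo.groupby("tgtcountry")["total_tonnage"].sum().sum()

# Keeping null keys as their own group preserves the grand total.
by_country_full = demo.groupby("tgtcountry", dropna=False)["total_tonnage"].sum()
full_total = by_country_full.sum()

print("default groupby total:", default_total)
print("dropna=False total:   ", full_total)
```

Filling the key first, for example `demo["tgtcountry"].fillna("NOT RECORDED")`, is an equivalent option that also gives the missing group a readable label in charts.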

6.3.2 Tonnage: Explore the Tonnage of Bombs dropped on each Target Country over the years.¶

Shift the focus to the geographical impact by exploring tonnage on a country level. Visualize the distribution of tonnage across target countries, revealing variations and intensity changes. Analyze how tonnage correlates with the strategic importance of each target country throughout the war.

The play and pause functionality allows for controlled observation, unraveling the shifting dynamics of tonnage over the course of the war.

In [54]:
total_tonnage_on_country_year = bombings_df[['tgtcountry', 
                                             'weapontypeweight',
                                             'numweaponsdelivered',
                                             'msnyear']].copy()  # copy so a new column can be added safely
In [55]:
total_tonnage_on_country_year['total_tonnage'] = total_tonnage_on_country_year['weapontypeweight'] * total_tonnage_on_country_year['numweaponsdelivered']
In [56]:
total_tonnage_on_country_year = total_tonnage_on_country_year.groupby(['msnyear', 'tgtcountry']).agg({'total_tonnage':'sum'}).reset_index()
In [57]:
# Create a pivot table to fill in missing values with 0
pivot_table = total_tonnage_on_country_year.pivot_table(index='msnyear', columns='tgtcountry', values='total_tonnage', fill_value=0).reset_index()

# If needed, flatten the DataFrame
pivot_table = pivot_table.melt(id_vars='msnyear', var_name='tgtcountry', value_name='total_tonnage')
In [58]:
# Sort the pivot table by 'msnyear' and 'total_tonnage'
pivot_table = pivot_table.sort_values(by=['msnyear', 'total_tonnage'], ascending=[True, False])
In [59]:
my_raceplot = barplot(pivot_table, 
                      item_column='tgtcountry', 
                      value_column='total_tonnage', 
                      time_column='msnyear',
                      top_entries = 10)

print('Animated Bar Chart: Use Play button at bottom to visualize with animation.')

my_raceplot.plot(title = 'Total Tonnage Bombing on each Target per Year',
                 item_label='Target Country', 
                 value_label='Total tonnage', 
                 time_label = 'Year: ',
                 frame_duration=2000)
Animated Bar Chart: Use Play button at bottom to visualize with animation.

Narrative:¶

Our exploration of tonnage takes us to the heart of targeted regions. The interactive racing bar chart allows us to witness the ebb and flow of tonnage over each country, uncovering the specific impact on different regions. With play and pause functionality, observe how strategic priorities shaped the distribution of tonnage across the geopolitical map.

Design Choices:¶

Interactive Racing Bar Chart for Tonnage Over Time (By Target Country)

  • Visualization Type: Animated racing bar chart.
  • Animation: Play and pause button for user control.
  • Color Encoding: Different colors for each target country.
  • Value axis: Represents tonnage.
  • Item axis: Represents target country; time (years) advances through the animation frames.
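
The pivot-then-melt step exists so that every year has a row for every target country, with 0 where nothing was recorded, which keeps the racing bars stable between frames. An alternative sketch of the same idea is shown below using an explicit year-by-item grid. The helper name `complete_grid` is hypothetical, and the approach assumes the input already has one row per (year, item) pair, as it does after the `groupby`.

```python
import pandas as pd


def complete_grid(totals: pd.DataFrame, time_col: str, item_col: str, value_col: str) -> pd.DataFrame:
    """Return one row per (time, item) combination, filling absent pairs with 0."""
    full_index = pd.MultiIndex.from_product(
        [sorted(totals[time_col].unique()), sorted(totals[item_col].unique())],
        names=[time_col, item_col],
    )
    return (
        totals.set_index([time_col, item_col])[value_col]
        .reindex(full_index, fill_value=0)
        .reset_index()
    )


# Toy input: LAOS has no row for 1966 (values are made up).
demo = pd.DataFrame({
    "msnyear": [1965, 1965, 1966],
    "tgtcountry": ["LAOS", "SOUTH VIETNAM", "SOUTH VIETNAM"],
    "total_tonnage": [10, 20, 30],
})

grid = complete_grid(demo, "msnyear", "tgtcountry", "total_tonnage")
print(grid)
```

Unlike `pivot_table`, whose default aggregation is the mean, this version raises an error on duplicate (year, item) rows instead of averaging them silently.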

6.3.3 Tonnage: Explore the Tonnage of Bombs dropped on each Target Type over the years.¶

Delve into the specific types of targets by exploring tonnage variations over different target types. Uncover patterns and trends in tonnage concerning various target categories, shedding light on the priorities and strategies in bombing campaigns over the war's duration.

The play and pause feature offers a granular exploration of how tonnage levels evolved over time, shedding light on the strategic preferences in bomb targeting.

In [60]:
# Copy the subset so a new column can be added without a SettingWithCopyWarning
total_tonnage_by_target_type = bombings_df[['weapontypeweight','numweaponsdelivered','tgttype','msnyear']].copy()
In [61]:
total_tonnage_by_target_type['total_tonnage'] = total_tonnage_by_target_type['weapontypeweight'] * total_tonnage_by_target_type['numweaponsdelivered']
In [62]:
total_tonnage_by_target_type = total_tonnage_by_target_type.groupby(['msnyear', 'tgttype']).agg({'total_tonnage':'sum'}).reset_index()
In [63]:
# Create a pivot table to fill in missing values with 0
pivot_table0 = total_tonnage_by_target_type.pivot_table(index='msnyear', 
                                                        columns='tgttype', 
                                                        values='total_tonnage', 
                                                        fill_value=0).reset_index()

# If needed, flatten the DataFrame
pivot_table0 = pivot_table0.melt(id_vars='msnyear', var_name='tgttype', value_name='total_tonnage')
In [64]:
my_raceplot = barplot(pivot_table0, 
                      item_column='tgttype', 
                      value_column='total_tonnage', 
                      time_column='msnyear',
                      top_entries = 10)

print('Animated Bar Chart: Use Play button at bottom to visualize with animation.')

my_raceplot.plot(title = 'Total Tonnage Bombing on Top-10 Target Type per Year',
                 item_label='Target Type', 
                 value_label='Total tonnage', 
                 time_label = 'Year: ',
                 frame_duration=2000)
Animated Bar Chart: Use Play button at bottom to visualize with animation.

Narrative:¶

Zooming in further, we scrutinize the specific targets of the bombings. The interactive racing bar chart unveils tonnage variations across different target types, providing insights into the priorities and strategies that shaped military operations. Engage with the play and pause feature for a granular exploration of how tonnage levels evolved over time.

Design Choices:¶

Interactive Racing Bar Chart for Tonnage Over Time (By Target Type)

  • Visualization Type: Animated racing bar chart.
  • Animation: Play and pause button for user control.
  • Color Encoding: Different colors for each target type.
  • Value axis: Represents tonnage.
  • Item axis: Represents target type; time (years) advances through the animation frames.

6.4 Military Services: Which military services were actively involved in the bombings?¶

Navigate the landscape of military involvement, identifying trends in operations conducted by different services. Trace the evolution of participation over time, providing a comprehensive view of military contributions.

In [65]:
bomb_ops_by_service = bombings_df['milservice'].value_counts(dropna=False).reset_index()
In [66]:
bomb_ops_by_service = bomb_ops_by_service.rename(columns={'index': 'milservice', 'milservice': 'count'})
In [67]:
bomb_ops_by_service = bomb_ops_by_service.loc[bomb_ops_by_service.milservice.isin(['USAF','USN','VNAF','USMC','RLAF','KAF','RAAF'])]
In [68]:
bomb_ops_by_service
Out[68]:
milservice count
0 USAF 2813692
1 USN 694186
2 VNAF 634717
3 USMC 453996
4 RLAF 32779
5 KAF 24470
6 RAAF 12714
In [69]:
# Create a bar graph using Plotly Express
fig = px.bar(bomb_ops_by_service, 
             x='milservice', 
             y='count', 
             title='Number of Attacks by Military Forces', 
             labels={'Value': 'Count'})

fig.update_xaxes(tickmode='linear')
fig.update_layout(xaxis_title='Military Force', yaxis_title='Number of attacks')

# Add annotations with arrows
for i, row in bomb_ops_by_service.iterrows():
    fig.add_annotation(
        x=row['milservice'],
        y=row['count'],
        text=f"{row['count']}",
        showarrow=True,
        arrowhead=2,
        arrowsize=1,
        arrowwidth=2,
        arrowcolor="#636363",
        ax=0,
        ay=-40,
    )

# Show the plot
fig.show()

Narrative:¶

The spotlight now turns to the actors behind the operations. The bar chart offers a straightforward view of the military services involved, allowing us to identify the major contributors. As we move forward, we'll explore how the involvement of these military services evolved over the course of the war.

Design Choices:¶

Bar Chart for Number of Attacks by Military Forces

  • Visualization Type: Bar graph.
  • Color Encoding: No color encoding, simple representation.
  • Annotations: Label on each bar for precise counts.
  • Axes: Linear tick mode for clarity.
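
To put the counts in the table above in proportion, the following minimal sketch computes each service's share of the listed records. It only reuses the numbers printed in Out[68]; the names `attack_counts`, `total_listed`, and `shares` are illustrative and not part of the notebook, and the total covers only the seven services shown.

```python
# Counts copied from the Out[68] table above.
attack_counts = {
    "USAF": 2813692,
    "USN": 694186,
    "VNAF": 634717,
    "USMC": 453996,
    "RLAF": 32779,
    "KAF": 24470,
    "RAAF": 12714,
}

# Total over the seven listed services only (other or missing labels are excluded).
total_listed = sum(attack_counts.values())

# Fraction of the listed records attributed to each service.
shares = {service: count / total_listed for service, count in attack_counts.items()}

for service, share in shares.items():
    print(f"{service}: {share:.1%}")
```

Shares like these make it easier to compare the smaller contributors, whose bars are hard to read next to the USAF bar on a linear axis.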

6.4.1 Military Services: How does the involvement of different military services evolve over time?¶

Embark on a dynamic exploration of military service involvement with a racing bar chart. This animated visualization unveils the changing landscape of military contributions, allowing for a temporal analysis of how different services actively participated in the strikes throughout the Vietnam War.

In [70]:
bomb_ops_by_service_by_year = bombings_df[['msnyear', 'milservice']].value_counts(dropna=False).reset_index()
In [71]:
# Name the columns explicitly; the default label of the count column differs between pandas versions
bomb_ops_by_service_by_year.columns = ['msnyear', 'milservice', 'mil_count']
In [72]:
# Create a pivot table to fill in missing values with 0
pivot_table1 = bomb_ops_by_service_by_year.pivot_table(index='msnyear', columns='milservice', values='mil_count', fill_value=0).reset_index()
In [73]:
# If needed, flatten the DataFrame
pivot_table1 = pivot_table1.melt(id_vars='msnyear', var_name='milservice', value_name='milcount')
In [74]:
# Sort by year, then by strike count in descending order
pivot_table1 = pivot_table1.sort_values(by=['msnyear', 'milcount'], ascending=[True, False])
In [75]:
my_raceplot = barplot(pivot_table1, 
                      item_column='milservice', 
                      value_column='milcount', 
                      time_column='msnyear',
                      top_entries = 10)

print('Animated Map: Use Play button at bottom to visualize with animation.')

my_raceplot.plot(title = 'Total Military Strikes by each Military Services per Year',
                 item_label='Military Service', 
                 value_label='Strikes', 
                 time_label = 'Year: ',
                 frame_duration=2000)
Animated Map: Use Play button at bottom to visualize with animation.

Narrative:¶

The story of military services comes alive through the racing bar chart. Watch as different services come to the forefront, reflecting the evolving strategies and priorities during different phases of the war. Engage with the play-pause button to dissect the temporal evolution of military service involvement.

Design Choices:¶

Racing Bar Chart for Evolution of Military Service Involvement Over Time

  • Visualization Type: Racing bar chart.
  • Color Encoding: Different colors for each military service.
  • Animation: Playful and informative racing animation for temporal evolution.
  • Labels: Clear labels for the axes and title.
  • Play-Pause Button: A play-pause button gives users control over the animation.
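
The pivot-then-melt step used before the racing bar chart can be illustrated on a toy table. This is a minimal sketch with made-up numbers: pivoting to a wide table with `fill_value=0` and melting back yields one row for every (year, service) pair, so a service with no recorded strikes in a year appears with a zero instead of being absent. The `aggfunc="sum"` argument is an assumption added here to keep the counts as integers; the notebook relies on the default aggregation.

```python
import pandas as pd

# Toy long-format counts; the numbers are made up for illustration.
counts = pd.DataFrame({
    "msnyear": [1965, 1965, 1966],
    "milservice": ["USAF", "USN", "USAF"],
    "milcount": [5, 2, 7],
})

# Wide table: one row per year, one column per service, gaps filled with 0.
wide = counts.pivot_table(index="msnyear", columns="milservice",
                          values="milcount", aggfunc="sum",
                          fill_value=0).reset_index()

# Back to long format: every (year, service) pair now has a row.
complete = wide.melt(id_vars="msnyear", var_name="milservice", value_name="milcount")
complete = complete.sort_values(["msnyear", "milcount"], ascending=[True, False])
print(complete)
```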

6.4.2 Military Services: Explore how military services of each country attacked different target countries in war.¶

Delve into the intricate web of military strategies using an interactive sunburst plot. This visualization enables users to dissect how the military services of each country targeted specific nations during the Vietnam War. The interactive elements provide a granular exploration, allowing for a comprehensive understanding of cross-border military operations.

In [76]:
sunburst_df = bombings_df.groupby([ 'countryflyingmission', 'milservice','tgtcountry']).size().reset_index(name='mission_count')
In [77]:
fig = px.sunburst(
    sunburst_df,
    path=[ 'countryflyingmission', 'milservice','tgtcountry'],
    values='mission_count',
    title="Distribution of Missions by Military Service on each Target",
    color='mission_count',  # Color based on mission count
    color_continuous_scale='Sunset',  # Adjust the color scale
)

fig.update_layout(
    margin=dict(t=60, l=15, r=15, b=15),
    width=1000,  # Set the width of the figure
    height=800  # Set the height of the figure
)

print('Interactive Map: Click on each element to expand and contract the segments of sunburst chart')

fig.show()
Interactive Map: Click on each element to expand and contract the segments of sunburst chart

Narrative:¶

The dynamics of cross-border military operations are laid bare through the interactive sunburst plot. The inner ring shows the country flying the mission, the middle ring its military services, and the outer ring the countries they targeted. As users click and explore, the details of how these services targeted different countries unfold, providing insights into the interconnected strategies of the nations involved.

Design Choices:¶

Interactive Sunburst Plot for Military Services Attacking Different Target Countries

  • Visualization Type: Sunburst plot.
  • Color Encoding: Continuous 'Sunset' color scale mapped to mission count.
  • Interaction: Users can click on different segments of the sunburst to drill down into specific details.
  • Legend: Color bar for the mission-count scale.
  • Labels: Clear labels for the title and segments of the sunburst.
  • Tooltip: Hover tooltips show the details of each segment.
  • User Guidance: A brief printed instruction explains how to interact with the sunburst plot.
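
The hierarchy behind the sunburst can be checked without plotting. The sketch below uses toy records with placeholder labels: it builds the leaf counts with the same `groupby(...).size()` pattern as the cell above, then sums them up the path to get the totals that the middle and inner rings represent.

```python
import pandas as pd

# Toy records; country, service, and target labels are placeholders.
records = pd.DataFrame({
    "countryflyingmission": ["A", "A", "A", "B"],
    "milservice": ["AirForce", "AirForce", "Navy", "AirForce"],
    "tgtcountry": ["X", "Y", "X", "X"],
})

path = ["countryflyingmission", "milservice", "tgtcountry"]

# Leaf level: one row per (country, service, target) combination.
leaves = records.groupby(path).size().reset_index(name="mission_count")

# Middle and inner rings: leaf counts summed up the hierarchy.
by_service = leaves.groupby(path[:2])["mission_count"].sum()
by_country = leaves.groupby(path[:1])["mission_count"].sum()

print(by_service)
print(by_country)
```

Comparing these totals against the hover values of the plotted segments is a quick sanity check on the figure.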

6.5 Operations: Can we identify trends or patterns in the types of operations conducted?¶

Visualize the named operations and how often each was flown. Examine how specific operations relate to pivotal historical events or policy changes to trace the strategic decision-making behind them.

In [78]:
# Calculate the number of missions for each operation group
missions_count = bombings_df.groupby('operation_grp')['thor_data_viet_id'].count().reset_index()
missions_count = missions_count[missions_count['operation_grp'] != 'UNNAMED']

# Sort the DataFrame by mission count in descending order
missions_count = missions_count.sort_values(by='thor_data_viet_id', ascending=False)

# Select only the top 10 rows
top_10_missions = missions_count.head(10)

# Create a bar chart using Plotly Express
fig = px.bar(top_10_missions, 
             x='operation_grp', 
             y='thor_data_viet_id', 
             title='Number of Missions in Top-10 Operations',
             labels={'operation_grp': 'Operation Name', 'thor_data_viet_id': 'Number of Missions'}
            )

# Sort the x-axis categories by the number of missions
fig.update_layout(xaxis_categoryorder='total descending')

# Show the plot
fig.show()

Narrative:¶

Operations form the backbone of military strategies. The bar chart introduces us to the top-10 operations, offering a glimpse into their frequency and significance. As we delve deeper, we'll explore how these operations evolved over time, shedding light on the strategic decisions that shaped the course of the war.

Design Choices:¶

Bar Chart for Number of Missions in Top-10 Operations

  • Visualization Type: Bar graph.
  • Color Encoding: No color encoding, simple representation.
  • Annotations: None on the bars; exact counts are available on hover.
  • Axes: Ordered by the number of missions.
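
The count, sort, and top-10 steps above recur for operations, weapons, and aircraft. A hypothetical helper such as `top_n_counts` below (not part of the notebook) is one way the pattern might be factored out. It assumes that counting rows per value is equivalent to counting non-null `thor_data_viet_id` entries, which holds only if that ID is never missing.

```python
import pandas as pd


def top_n_counts(df, column, n=10, exclude=()):
    """Return the n most frequent values of `column` with their row counts."""
    counts = df[column].value_counts()  # NaN is dropped; result is sorted descending
    counts = counts.drop(labels=list(exclude), errors="ignore")
    return counts.head(n).rename_axis(column).reset_index(name="n_missions")


# Toy data; the operation names are placeholders.
toy = pd.DataFrame(
    {"operation_grp": ["UNNAMED"] * 5 + ["OP_A"] * 3 + ["OP_B"] * 2 + ["OP_C"]}
)
top_ops = top_n_counts(toy, "operation_grp", n=2, exclude=["UNNAMED"])
print(top_ops)
```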

6.5.1 Operations: How do different operations evolve over time during the war?¶

Navigate the landscape of various operations, capturing their evolution throughout the Vietnam War. Utilizing a racing bar chart, observe the dynamic shifts in the frequency of different operations, identifying key periods of strategic change. This visualization provides insights into the flow of military strategies, allowing for a nuanced understanding of how operational priorities shifted over time.

In [79]:
operations_by_year = bombings_df[['msnyear', 'operation_grp']].value_counts(dropna=False).reset_index()

operations_by_year = operations_by_year[operations_by_year['operation_grp'] != 'UNNAMED']
In [80]:
# Name the columns explicitly; the default label of the count column differs between pandas versions
operations_by_year.columns = ['msnyear', 'operation_grp', 'ops_count']
In [81]:
# Create a pivot table to fill in missing values with 0
pivot_table2 = operations_by_year.pivot_table(index='msnyear', 
                                              columns='operation_grp', 
                                              values='ops_count', 
                                              fill_value=0).reset_index()

# If needed, flatten the DataFrame
pivot_table2 = pivot_table2.melt(id_vars='msnyear', var_name='operation_grp', value_name='ops_count')

# Sort the pivot table
pivot_table2 = pivot_table2.sort_values(by=['msnyear', 'ops_count'], ascending=[True, False])
In [82]:
my_raceplot = barplot(pivot_table2, 
                      item_column='operation_grp', 
                      value_column='ops_count', 
                      time_column='msnyear',
                      top_entries = 10)

print('Animated Map: Use Play button at bottom to visualize with animation.')

my_raceplot.plot(title = 'Top Operations per Year',
                 item_label='Operation Name', 
                 value_label='Number of Operations', 
                 time_label = 'Year: ',
                 frame_duration=2000)
Animated Map: Use Play button at bottom to visualize with animation.

Narrative:¶

The racing bar chart becomes a time machine, guiding us through the evolving landscape of operations. Watch as different operations take center stage, reflecting the ebb and flow of strategic priorities. Engage with the play-pause button for an interactive journey through the dynamic evolution of operations.

Design Choices:¶

Racing Bar Chart for Evolution of Operations Over Time

  • Visualization Type: Racing bar chart.
  • Color Encoding: A different color for each operation.
  • Animation: Playful and informative racing animation for temporal evolution.
  • Labels: Clear labels for the axes and title.
  • Play-Pause Button: A play-pause button gives users control over the animation.
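
What each frame of a racing bar chart displays can be approximated with a per-year ranking. The sketch below is an illustration on toy data, not the internal implementation of the `barplot` helper used above: it ranks the entries within each year and keeps the top ones, mirroring the role of `top_entries`.

```python
import pandas as pd

# Toy yearly counts; names and numbers are made up.
ops = pd.DataFrame({
    "msnyear": [1966, 1966, 1966, 1967, 1967, 1967],
    "operation_grp": ["OP_A", "OP_B", "OP_C", "OP_A", "OP_B", "OP_C"],
    "ops_count": [10, 30, 20, 25, 5, 15],
})

top_entries = 2  # the notebook uses 10

# Rank within each year (1 = largest count), then keep the top entries per frame.
ops["rank"] = ops.groupby("msnyear")["ops_count"].rank(method="first", ascending=False)
frames = ops[ops["rank"] <= top_entries].sort_values(["msnyear", "rank"])
print(frames)
```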

6.6 Weapons and Aircraft: What types of weapons and aircraft were predominantly used during the bombings?¶

Witness the evolution of weaponry and aircraft deployment. Unearth patterns in weapon choices and aircraft types, painting a vivid picture of technological advancements and strategic adaptations throughout the Vietnam War.

In [83]:
# Calculate the number of missions for each weapon type
missions_count = bombings_df.groupby('weapontype')['thor_data_viet_id'].count().reset_index()

# Sort the DataFrame by mission count in descending order
missions_count = missions_count.sort_values(by='thor_data_viet_id', ascending=False)

# Select only the top 10 rows
top_10_missions = missions_count.head(10)

# Create a bar chart using Plotly Express
fig = px.bar(top_10_missions, 
             x='weapontype', 
             y='thor_data_viet_id', 
             title='Number of Missions per Top-10 Weapons',
             labels={'weapontype': 'Weapon Type', 'thor_data_viet_id': 'Number of Missions'}
            )

# Sort the x-axis categories by the number of missions
fig.update_layout(xaxis_categoryorder='total descending')

# Show the plot
fig.show()

Narrative:¶

The arsenal of war takes center stage as we explore the top-10 weapons. The bar chart reveals the predominant choices, reflecting technological advancements and strategic adaptations.

Design Choices:¶

Bar Chart for Number of Missions for Top-10 Weapons

  • Visualization Type: Bar graph.
  • Color Encoding: No color encoding, simple representation.
  • Annotations: None on the bars; exact counts are available on hover.
  • Axes: Ordered by the number of missions.
In [84]:
# Calculate the number of missions for each aircraft
missions_count = bombings_df.groupby('aircraft_name')['thor_data_viet_id'].count().reset_index()

# Sort the DataFrame by mission count in descending order
missions_count = missions_count.sort_values(by='thor_data_viet_id', ascending=False)

# Select only the top 10 rows
top_10_missions = missions_count.head(10)

# Create a bar chart using Plotly Express
fig = px.bar(top_10_missions, 
             x='aircraft_name', 
             y='thor_data_viet_id', 
             title='Number of Missions per Top-10 Aircraft',
             labels={'aircraft_name': 'Aircraft', 'thor_data_viet_id': 'Number of Missions'}
            )

# Sort the x-axis categories by the number of missions
fig.update_layout(xaxis_categoryorder='total descending')

# Show the plot
fig.show()

Narrative:¶

The focus now shifts from weapons to the platforms that delivered them. The bar chart of the top-10 aircraft reveals the predominant choices, reflecting technological advancements and strategic adaptations.

Design Choices:¶

Bar Chart for Number of Missions for Top-10 Aircraft

  • Visualization Type: Bar graph.
  • Color Encoding: No color encoding, simple representation.
  • Annotations: None on the bars; exact counts are available on hover.
  • Axes: Ordered by the number of missions.

6.6.1 Weapons and Aircraft: How does the choice of weapons evolve over the course of the war?¶

Witness the racing bar chart depicting the evolution of weapon choices over time. Track the rise and fall of different weapon types, providing a chronological perspective on the shifting preferences in the Vietnam War.

In [85]:
weapons_by_year = bombings_df[['msnyear', 'weapontype']].value_counts(dropna=False).reset_index()
In [86]:
# Name the columns explicitly; the default label of the count column differs between pandas versions
weapons_by_year.columns = ['msnyear', 'weapontype', 'mission_count']
In [87]:
# Create a pivot table to fill in missing values with 0
pivot_table2 = weapons_by_year.pivot_table(index='msnyear', 
                                           columns='weapontype', 
                                           values='mission_count', 
                                           fill_value=0).reset_index()

# If needed, flatten the DataFrame
pivot_table2 = pivot_table2.melt(id_vars='msnyear', var_name='weapontype', value_name='mission_count')

# Sort by year, then by mission count in descending order
pivot_table2 = pivot_table2.sort_values(by=['msnyear', 'mission_count'], ascending=[True, False])
In [88]:
my_raceplot = barplot(pivot_table2, 
                      item_column='weapontype', 
                      value_column='mission_count', 
                      time_column='msnyear',
                      top_entries = 10)

print('Animated Map: Use Play button at bottom to visualize with animation.')

my_raceplot.plot(title = 'Number of Missions by Top-10 Weapons per Year',
                 item_label='Weapon Name', 
                 value_label='Number of Missions', 
                 time_label = 'Year: ',
                 frame_duration=2000)
Animated Map: Use Play button at bottom to visualize with animation.

Narrative:¶

The racing bar chart transforms into a visual timeline, illustrating the dynamic evolution of weapon choices. Each bar traces a weapon's journey through time, offering insights into how strategic preferences shifted over the course of the war. Engage with the play-pause button to navigate through this chronological narrative.

Design Choices:¶

Racing Bar Chart for Evolution of Weapons Over Time

  • Visualization Type: Racing bar chart.
  • Color Encoding: Different colors for each Weapon.
  • Animation: Playful and informative racing animation for temporal evolution.
  • Labels: Clear labels for the axes and title.
  • Play-Pause Button: Include a user-friendly play-pause button for interactive engagement.
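
In the counting step above, the count column can also be named directly with `reset_index(name=...)`. The default label of that column differs between pandas releases (`0` in 1.x, `count` from 2.0), so naming it explicitly avoids depending on either. The sketch below shows this variant on toy records with placeholder weapon labels.

```python
import pandas as pd

# Toy records; the weapon labels are placeholders.
toy = pd.DataFrame({
    "msnyear": [1968, 1968, 1968, 1969],
    "weapontype": ["W1", "W1", "W2", "W1"],
})

# Name the count column explicitly rather than renaming the default label.
weapons_by_year = (
    toy[["msnyear", "weapontype"]]
    .value_counts(dropna=False)
    .reset_index(name="mission_count")
)
print(weapons_by_year)
```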

6.6.2 Weapons and Aircraft: How does the choice of aircraft evolve over the course of the war?¶

Dive into the racing bar chart showcasing the evolution of aircraft choices throughout the war. Uncover the changing landscape of aircraft deployment, illustrating how technological advancements and strategic considerations influenced the selection of aircraft types over time.

In [89]:
aircraft_by_year = bombings_df[['msnyear', 'aircraft_name']].value_counts(dropna=False).reset_index()
In [90]:
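# Name the columns explicitly; the default label of the count column differs between pandas versions
aircraft_by_year.columns = ['msnyear', 'aircraft_name', 'mission_count']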
In [91]:
# Create a pivot table to fill in missing values with 0
pivot_table3 = aircraft_by_year.pivot_table(index='msnyear', 
                                           columns='aircraft_name', 
                                           values='mission_count', 
                                           fill_value=0).reset_index()

# If needed, flatten the DataFrame
pivot_table3 = pivot_table3.melt(id_vars='msnyear', var_name='aircraft_name', value_name='mission_count')

# Sort by year, then by mission count in descending order
pivot_table3 = pivot_table3.sort_values(by=['msnyear', 'mission_count'], ascending=[True, False])
In [92]:
my_raceplot = barplot(pivot_table3, 
                      item_column='aircraft_name', 
                      value_column='mission_count', 
                      time_column='msnyear',
                      top_entries = 10)

print('Animated Map: Use Play button at bottom to visualize with animation.')

my_raceplot.plot(title = 'Number of Missions by Top-10 Aircraft per Year',
                 item_label='Aircraft Name', 
                 value_label='Number of Missions', 
                 time_label = 'Year: ',
                 frame_duration=2000)
Animated Map: Use Play button at bottom to visualize with animation.

Narrative:¶

The skies above Vietnam tell a story of technological prowess and strategic evolution. The racing bar chart for aircraft traces changing preferences and technological adaptations. Engage with the play-pause button to witness the chronological journey of aircraft choices, providing a unique perspective on the war's aerial dimension.

Design Choices:¶

Racing Bar Chart for Evolution of Aircraft Over Time

  • Visualization Type: Racing bar chart.
  • Color Encoding: Different colors for each Aircraft.
  • Animation: Playful and informative racing animation for temporal evolution.
  • Labels: Clear labels for the axes and title.
  • Play-Pause Button: A play-pause button gives users control over the animation.

7. Design Rationale¶

Our approach to visualizing the intricate history of the Vietnam War bombings is rooted in a strategic blend of clarity, historical context, and interactive engagement, ensuring a comprehensive exploration of this complex narrative.

Geographical Distribution:

The journey begins with "Geographical Distribution," employing an animated scatter map to dynamically capture the evolving patterns of bombings across Vietnam and its neighboring regions. The design choice of an additional static map, paralleled with the animated view, strategically provides a consolidated visualization of total bombings in each target country. This dual-mapping approach enhances the audience's understanding of the spatial dynamics while facilitating a comparative analysis.

Temporal Patterns:

"Temporal Patterns" unfolds with a bar graph illustrating yearly bombing strikes, offering a high-level temporal overview. The subsequent line charts dive deeper into specific temporal patterns, such as notable spikes or lulls in bombing intensity by country, and the changing involvement of allied nations, with the USA excluded from that view. An interactive line chart correlates temporal patterns with major historical events, letting users explore the data's temporal narrative on their own. These design choices help users follow the chronology of the Vietnam War.

Tonnage:

In the "Tonnage" section, a bubble chart vividly emphasizes the magnitude of bombs dropped, offering an immediate grasp of the scale of the conflict. Follow-up questions are addressed through a bar graph showcasing yearly tonnage, allowing users to discern periods of escalation or de-escalation. Interactive racing bar charts are introduced to explore tonnage by target country and tonnage by target type, providing granular insights into tonnage variations over time and across different contexts. These visualizations not only capture the sheer volume of bombings but also reveal nuanced patterns in targeting strategies.

Military Services:

The exploration of "Military Services" begins with bar charts revealing the number of attacks by different military forces. This sets the stage for a dynamic exploration of military service involvement through a racing bar chart. The subsequent introduction of an interactive sunburst plot dissects how military services of each country targeted different nations. This multi-layered approach offers users a detailed understanding of the evolving dynamics among military services throughout the war, fostering a nuanced comprehension of their contributions.

Operations:

"Operations" unfolds through a bar chart showing the number of missions in the top-10 operations. A racing bar chart follows, portraying the evolution of different operations over time. These visualizations allow users to identify trends or patterns in the types of operations conducted and to trace the strategic decision-making behind them. The design choices aim to provide a comprehensive view of operational priorities and shifts in strategic focus.

Weapons and Aircraft:

The exploration of "Weapons and Aircraft" is facilitated by bar charts and racing bar charts depicting the number of missions for the top-10 weapons and aircraft. These visualizations offer insights into the evolution of weaponry and aircraft deployment, painting a vivid picture of technological advancements and strategic adaptations over the course of the war. The design choices ensure that users can witness the changing landscape of weapon and aircraft choices with a chronological perspective.

This diverse set of visualizations is meticulously crafted to cater to both seasoned analysts and those approaching the Vietnam War history for the first time. The interplay of static and interactive elements fosters a nuanced and engaging exploration, allowing users to uncover the multifaceted dimensions of this pivotal historical event.

8. Development Process¶

Our project's development was meticulously structured, adhering to a well-defined timeline and efficient task distribution.

This project was developed by Satyam Shrivastava and Pritish Arora, who both played a pivotal role in its successful execution. We are both Data Science graduate students at Khoury College of Computer Sciences, Northeastern University.

Timeline:¶

Week 11: Data Acquisition

During this phase, we collaboratively searched for a suitable dataset, ensuring it aligned with our project goals. After thorough exploration, we agreed upon a dataset that provided comprehensive insights into the Vietnam War bombings.

Week 12: Data Analysis

With the dataset at hand, we independently conducted Exploratory Data Analysis (EDA) to gain a nuanced understanding of the information available. This step allowed us to identify potential visualization questions and lay the groundwork for our project's narrative.

Week 13: Peer Review

After completing the initial analysis, we shared our work in progress for the peer review process.

Week 14: Feedback Implementation

Building upon the received feedback, we implemented necessary improvements to enhance the clarity, coherence, and effectiveness of our visualizations. This phase involved refining visual encodings, addressing design considerations, and ensuring the overall flow of the project.

Week 15: Final Project Completion and Validation

In the concluding week, we collaborated to combine our individual analyses into a cohesive visualization story. This collaborative effort involved synthesizing insights, refining the narrative, and ensuring a seamless flow between different sections of the project. The final validation phase involved confirming the accuracy of visualizations, checking for any inconsistencies, and preparing the project for publication.

Task Distribution:¶

Data Acquisition and Initial Analysis:

We jointly searched for and decided on the dataset, identified project goals, and conducted independent Exploratory Data Analysis (EDA) to formulate initial visualization questions.

Visualization Question Formulation:

The six visualization questions were divided equally between Satyam and Pritish, with each team member taking ownership of three questions. This approach ensured a balanced distribution of tasks and allowed for focused exploration.

Peer Review and Feedback Implementation:

Both team members actively participated in the peer review process, providing valuable insights and suggestions for improvement. The feedback received was then collaboratively implemented to refine the visualizations.

Visualization Story and Publication:

The final stage involved combining our work to create a cohesive visualization story. We collaborated on writing the narrative, ensuring a smooth transition between different sections, and published the project for sharing with our target audience.

Tools for Analysis:¶

Python served as the primary programming language for our analysis, with Jupyter Notebooks providing an interactive and integrated workspace. Pandas was instrumental for data manipulation and preprocessing, facilitating tasks such as cleaning, transformation, and loading. For visualization, we leveraged the capabilities of Matplotlib and Plotly to create a diverse range of static and interactive plots, charts, graphs, and maps. These advanced visualization tools, combined with the skills acquired during our course, enabled us to craft a compelling and informative narrative, elevating the overall quality of our project.

This systematic and collaborative approach, coupled with the effective use of tools and a well-structured timeline, contributed to the successful development of our Vietnam War bombings visualization project.

9. Feedback Incorporation¶

Peer feedback has been instrumental in refining the visualizations of the Vietnam bombing project, providing insights that contributed significantly to the project's overall impact.

Feedback 1 Incorporation (from Mehul Jain):¶

Visual Encodings:

  • The sunburst plot received praise for effectively showcasing the breakdown of military services by country. The animated choropleth was acknowledged for dynamically illustrating the changing geographical distribution of bombing incidents over time.

  • Addressing improvement suggestions, a time series element was incorporated into the bar plot representing military services, enabling users to visualize the evolution of different military services' involvement over the course of the war.

  • The suggestion to delve deeper into the data by analyzing the breakdown of aircraft and weapons used by the top-10 contributors was implemented, offering additional insights into the bombing campaign's tactics.

  • While data on the types of targets bombed wasn't initially included, the feedback sparked consideration for its addition, recognizing its potential to add another layer of understanding to the visualization.

  • The recommendation to create a narrative alongside the visualizations was embraced, aligning historical events and key statistics to enhance the storytelling potential and provide context for viewers.

Design Quality:

  • The reviewer noted that clear titles, labels, and legends enhance the clarity of the visualizations. The suggestion to validate credibility against reliable sources was implemented.

  • The idea of adding interactive features, such as tooltips, was recognized as a potential enhancement for user engagement and information accessibility.

  • The recommendation to integrate historical events into the visualizations through text annotations, timeline representations, or interactive elements was welcomed and adopted to create a more cohesive and insightful story about the Vietnam War bombings.

Feedback 2 Incorporation (from Bishal Agrawal):¶

Visual Encodings:

  • The animated map's effectiveness in dynamically unfolding the geographical distribution of bombing incidents over time was highlighted. The utilization of bar graphs and line charts for communicating temporal patterns and tonnage was commended.

  • The idea of incorporating temporal patterns linked to significant operations through a racing bar chart animation was acknowledged, contributing to a richer visualization that contextualizes historical events.

  • Alternative visual encodings, such as heatmaps or density maps, and interactive elements like filters for time periods or target countries were considered as avenues for further exploration. However, with a dataset of this size, heatmaps and density maps consumed significant memory on the platform and rendered slowly, so they were discarded.
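
One way the memory cost of a density-style view might be reduced, which was not tried in this project, is to aggregate strikes onto a coarse grid before plotting. The sketch below is a hedged illustration on toy coordinates: `lat` and `lon` are hypothetical column names, and the 0.5-degree `cell_size` is an arbitrary choice.

```python
import pandas as pd

# Toy coordinates; `lat` and `lon` are hypothetical column names.
points = pd.DataFrame({
    "lat": [16.02, 16.04, 16.31, 21.03],
    "lon": [108.21, 108.24, 108.26, 105.84],
})

cell_size = 0.5  # grid resolution in degrees (an arbitrary choice)

# Snap each point to the lower-left corner of its grid cell.
points["lat_bin"] = (points["lat"] // cell_size) * cell_size
points["lon_bin"] = (points["lon"] // cell_size) * cell_size

# One row per occupied cell instead of one row per strike.
grid = points.groupby(["lat_bin", "lon_bin"]).size().reset_index(name="strikes")
print(grid)
```

The aggregated table has at most one row per grid cell, so its size depends on the chosen resolution rather than on the number of records.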

Design Quality:

  • The overall design's impressive guidance through various aspects of the bombing campaign was noted. The structured organization and the inclusion of appropriate legends, titles, and axis names were acknowledged.

  • The suggestion for improved consistency and cohesion in color mapping across all charts was addressed, creating a visually unified and aesthetically pleasing display.

  • Integrating visual elements like annotations or highlights to denote significant events was seen as a valuable addition to enrich the visualization.

In conclusion, the integration of peer feedback has significantly elevated the Vietnam bombing project's visualizations, ensuring a more comprehensive, engaging, and insightful exploration of the historical context and repercussions of the bombing campaign. The collaborative input has not only improved the visual appeal but also deepened the project's value as a resource for understanding the complexities of the Vietnam War.

10. Final Thoughts¶

Reflection:¶

Embarking on the Vietnam bombing project has been an enlightening journey, providing invaluable insights into the intricacies of data visualization and storytelling. As I reflect on this learning experience, several key takeaways and considerations come to the forefront.

The project offered a hands-on opportunity to apply theoretical concepts learned throughout the course, transforming abstract ideas into tangible, impactful visualizations. Navigating the complexities of the Vietnam War dataset underscored the importance of thoughtful design choices, effective data analysis, and the power of storytelling through visuals.

One of the most significant learnings was the iterative nature of the data visualization process. The incorporation of peer feedback played a pivotal role in refining the visualizations, highlighting the collaborative and dynamic nature of this field. Recognizing the impact of feedback on the project's depth and clarity emphasized the importance of seeking diverse perspectives for a more comprehensive outcome.

The utilization of various visualization techniques, such as animated maps, racing bar charts, and interactive elements, deepened my understanding of the diverse tools available for conveying complex information. This project underscored the significance of not only choosing the right visual encodings but also incorporating interactive features to enhance user engagement and comprehension.

Potential Future Improvements:¶

In future projects, a more structured approach to incorporating historical context from the outset could be considered. While the narrative was eventually enriched with historical events, weaving them seamlessly into the initial stages of the project could provide a more cohesive storyline from the start.

Moreover, exploring the integration of additional data sources could enhance the depth and context of future projects. Merging data from other wars, or comparing it with contrasting datasets, could provide a comparative perspective and reveal patterns that are not apparent in isolation.

Additionally, investigating alternative visual encodings and interactive features at an early stage might offer a broader spectrum of insights. Different visual representations can reveal nuances that the chosen visualizations do not make apparent, and this awareness could inform the initial design considerations.

Furthermore, expanding the scope to include a multi-dimensional analysis, such as the social, economic, or political impacts of bombing campaigns, could contribute to a more holistic understanding of historical events. Integrating diverse dimensions could add layers of complexity to the visual narratives, providing a richer and more nuanced exploration.

Conclusion:¶

Overall, this project has been a rewarding exploration into the fusion of data, technology, and storytelling. The iterative process, coupled with the collaborative nature of feedback incorporation, has not only refined my technical skills but also deepened my appreciation for the art and science of data visualization. As I move forward, these insights will undoubtedly shape my approach to future projects, fostering a commitment to continual learning, adaptability, and a holistic understanding of the narratives behind the data.

11. Acknowledgments¶

This project would not have been possible without the generous contribution of various data sources, tools, and inspirations that fueled our exploration into the complexities of the Vietnam War bombings. We extend our sincere gratitude to those who played a pivotal role in shaping this project.

First and foremost, we express our appreciation to the creators and maintainers of the Theater History of Operations (THOR) Vietnam War Database, the primary data source for this project. The meticulous curation and documentation of this dataset laid the foundation for our analysis, providing a comprehensive resource for understanding the historical context of the Vietnam War bombings.

Our approach to data visualization was greatly influenced by the teachings and insights gained from the data visualization course, and we acknowledge the valuable guidance provided by our instructor, Dr. Lace Padilla, and our TA, Harshini Chandrika Dasri. The knowledge imparted throughout the course empowered us to navigate the nuances of visual storytelling and effectively communicate complex information.

We would also like to thank our peers who actively participated in the peer review process. Their constructive feedback and thoughtful critiques played a crucial role in refining our visualizations and elevating the overall quality of the project. The collaborative spirit within the learning community significantly contributed to the iterative development of our visual narrative.

The tools and libraries employed in this project deserve acknowledgment for their role in translating our ideas into compelling visualizations. We express our gratitude to Python, Jupyter Notebook, Matplotlib, Plotly, and Pandas for providing a robust and flexible environment for data analysis and visualization.

Lastly, we draw inspiration from the broader community of data enthusiasts, researchers, and storytellers who continually push the boundaries of data visualization. The collective efforts of this community inspire us to explore innovative approaches and strive for excellence in our own work.

References:

  1. Visualizing the Vietnam War 1954 - 1975
  2. Bombing Missions of the Vietnam War
  3. Vietnam bombing history with data

In conclusion, we extend our thanks to all individuals and resources that have played a part in the realization of this project. Your contributions have enriched our learning experience and have been instrumental in the development of this visual narrative.