
Optimizing Reactor Plant Maintenance: The Case for Shipboard SLMs

By: LT P.J. Greenbaum and LT Vince Freschi

Introduction

Operational availability is the Nuclear Navy’s bread and butter, yet shipboard technicians are currently prevented from improving maintenance outcomes by an archaic data bottleneck. Scheduled maintenance to prevent system degradation or failure is planned years in advance and kept up to date using detailed records boards that meticulously track maintenance completion. When a system fails or is degraded, corrective maintenance often requires hours of reading, symptom elaboration, and troubleshooting before a component can be repaired. Fatigued watchstanders burn critical man-hours sifting through static technical libraries and disjointed databases to isolate casualty root causes – a complex, time-consuming process akin to using a card catalog to search tens of thousands of pages for a single sentence.

Consider an MMN2 (Machinist’s Mate, Nuclear) troubleshooting and repairing a noisy coolant pump. Currently, this Sailor must spend hours manually cross-referencing decades’ worth of material history logs with historical maintenance records, send manually collected vibration data off-ship for analysis by pricey contractors, and flip through multiple volumes of tech manuals, all to diagnose a possible problem that may or may not be the root cause of the issue. Generating a quality control package to repair the pump takes several hours, and generating the work authorizations and safety protocols (i.e., “tag-outs”) takes several more – all of which delays the repair, leaving critical propulsion plant components offline longer than needed. In the authors’ experience leading Sailors in the maintenance of nuclear systems, finding the right part alone can take hours, and a seemingly successful search often proves futile because the part number has become obsolete. The result is an asset unavailable for tasking and a reduced operational force posture.

The solution is to use existing technology to equip Sailors with the tools necessary to keep ships in the fight. Fortunately, the technology exists to provide locally hosted, air-gapped Small Language Models (SLMs) to enhance the nuclear propulsion plant troubleshooting process. By implementing a Retrieval-Augmented Generation (RAG) framework, Naval Reactors can transform its massive repositories of procedures, technical manuals, and material history into a live, up-to-date database of Reliable External Knowledge (REK). This approach will reduce the man-hours required for fault investigation, improve the accuracy of system troubleshooting, and reduce downtime for propulsion plant components.

The Structure: Coupling Small Language Models with Retrieval-Augmented Generation

Since the release of ChatGPT in late 2022, much attention has focused on the use cases for large language models (LLMs) – models that possess between 100 billion and 2 trillion parameters – which are able to digest large volumes of disaggregated data and generate human-like text in response.[1] While extremely powerful in their ability to synthesize and process loosely structured datasets, these models require a high-bandwidth connection to an off-ship data center – a non-starter for warships that frequently operate under emission control (EMCON) in signal-denied environments. Even the Department of War’s release of GenAI.mil – a well-intentioned attempt at bringing generative AI capabilities directly to the warfighter – has limited applicability on a warship, where bandwidth and EMCON restrictions prevent its widespread use. Furthermore, LLMs are prone to “hallucinating” – producing irrelevant, incorrect, or misleading responses to queries – with potentially catastrophic consequences in a nuclear propulsion plant.[2]

The development and refinement of small language models (SLMs) possessing orders of magnitude fewer parameters (1 billion to 10 billion), however, allows for highly specialized, air-gapped models capable of running locally on a single high-end workstation – such as a high-performance gaming laptop – a footprint small enough for a rugged shipboard environment.[3] These SLMs would be most effective when coupled with a Retrieval-Augmented Generation (RAG) framework.[4] This architecture replaces the model’s reliance on static, pre-trained memory with a dynamic retrieval system.[5] By using a local vector database to index every page of the ship’s technical documentation – Reactor Plant Manuals (RPMs), Steam Plant Manuals (SPMs), machine-specific technical manuals, maintenance logs, and material history entries – the SLM no longer has to recall every answer from its training. It only has to find the answer within the verified, pre-uploaded documentation. RAG also enables it to match queries on semantic meaning rather than the word-for-word matching currently available with a “CTRL+F” search. This shift means that even a “small” 10 billion parameter model could deliver hull-specific, nuclear-grade technical guidance with a level of accuracy that matches or even exceeds its larger, data center-bound counterparts, and with a substantially reduced risk of hallucination.[6]
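The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a fielded design: the passages and source labels are invented placeholders, the function names are hypothetical, and the `embed` function is a toy word-count vector standing in for a trained embedding model. Because the toy vector only rewards shared words, it does not demonstrate true semantic matching; a real node would swap in a learned embedding model and a vector database.

```python
import math
import re
from collections import Counter

# Hypothetical, unclassified placeholder passages. Real content would come
# from the verified REK database, not from this list.
PASSAGES = [
    {"source": "Pump Tech Manual, sec. 4.2",
     "text": "Excessive bearing wear produces a grinding noise and elevated vibration in the pump."},
    {"source": "Valve Tech Manual, sec. 2.1",
     "text": "Valve packing leakage is corrected by adjusting the gland nuts evenly."},
    {"source": "Material History, entry 0147",
     "text": "Replaced pump motor bearing after operators reported loud grinding noise."},
]

def embed(text):
    """Toy embedding: word counts. ASSUMPTION: stand-in for a trained embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p["text"])), reverse=True)
    return ranked[:k]

def build_prompt(query, hits):
    """Assemble a prompt that confines the model to the retrieved, numbered sources."""
    context = "\n".join(
        f"[{i + 1}] ({h['source']}) {h['text']}" for i, h in enumerate(hits)
    )
    return (
        "Answer using only the numbered sources below and cite them.\n"
        f"{context}\n\nQuestion: {query}"
    )

hits = retrieve("pump making grinding noise", PASSAGES, k=2)
print(build_prompt("pump making grinding noise", hits))
```

The design point is that the language model only ever sees the question plus the retrieved passages, each labeled with its source, which is what makes the answer checkable against the cited manual.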

The Nuclear Reliable External Knowledge Database

Key to the development of the vector database will be compiling and maintaining – with proper version control – a database of reliable external knowledge (REK) that the SLM can draw from when responding to Sailor queries. The widespread use of interactive electronic technical manuals (IETMs) like the Reactor Plant Manual and Steam Plant Manual onboard nuclear-powered naval vessels simplifies the process of indexing the Nuclear Navy’s relevant data into a machine-readable format. However, a model that sees procedural context via the IETMs without the system design bases found in component technical manuals and procedural front matter runs the risk of providing inaccurate and incomplete recommendations. Therefore, in addition to the IETMs, the REK database must include legacy manuals for all propulsion plant components, like pumps and valves, as well as procedural guidelines such as the Joint Fleet Maintenance Manual (JFMM), the Radiological Controls for Ships, and the propulsion plant planned maintenance system (PMS). This allows SLM recommendations to be pre-filtered through Nuclear Navy maintenance rules and regulations before they reach the technician.

But the REK database cannot be static; it must be periodically updated to incorporate the latest feedback reports, Naval Reactors technical bulletins, and casualty reports (CASREPs), ensuring that the SLM maintains current hull- and platform-specific information. A monthly maintenance check, accomplished by uploading a “refresh” library – centrally created and made available via the Naval Reactors local area network – would allow for periodic updates in all but the strictest EMCON conditions. A critical advantage of incorporating fleet-wide data is that equipment casualties encountered on one hull are often repeat iterations of known failure modes across other ships-in-class. By enabling Sailors to leverage prior diagnoses and corrective actions on sister platforms, the system reduces redundant troubleshooting, supports failure-rate trending, and ultimately shortens equipment downtime.
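One way the monthly “refresh” could enforce version control is to ship each library with a manifest of document versions and checksums, and to accept a document only if its checksum matches and its version is newer than the copy already onboard. The sketch below assumes that scheme; the manifest layout, document identifiers, and function names are hypothetical and are not drawn from any existing Navy system.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Checksum used to confirm a document arrived unaltered."""
    return hashlib.sha256(data).hexdigest()

def apply_refresh(library, refresh_docs, manifest):
    """Apply a refresh package to the local REK library.

    library:      dict doc_id -> {"version": int, "content": bytes}
    refresh_docs: dict doc_id -> bytes delivered in the refresh
    manifest:     dict doc_id -> {"version": int, "sha256": str}
                  (ASSUMPTION: produced by the central authority)
    Returns (applied, rejected) lists of doc_ids.
    """
    applied, rejected = [], []
    for doc_id, content in refresh_docs.items():
        entry = manifest.get(doc_id)
        # Reject anything not listed in the manifest or failing its checksum.
        if entry is None or sha256_hex(content) != entry["sha256"]:
            rejected.append(doc_id)
            continue
        # Reject stale or duplicate versions so old guidance cannot overwrite new.
        current = library.get(doc_id)
        if current is not None and current["version"] >= entry["version"]:
            rejected.append(doc_id)
            continue
        library[doc_id] = {"version": entry["version"], "content": content}
        applied.append(doc_id)
    return applied, rejected
```

Rejected documents would be reported back to the central authority rather than silently dropped, so that a corrupted refresh is noticed before the next monthly check.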

Onboard the modern Ford-class engineering plant, the pre-existing automation and smart sensors open the door to even more AI applications. Integrating the SLM with data log sets, vibration analysis data, and material history would allow the plant to be monitored not only through the eyes of its highly trained nuclear operators, but also through AI powered by the latest Silicon Valley advances. Coupling the SLM with Eulerian Video Magnification and installing fixed cameras on vital pumps and turbines would allow near-continuous monitoring for early signs of failure. The efficacy of preventative maintenance can then be audited based on failure frequency and downtime, empowering Sailors to preserve the plant in the most efficient manner possible.
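The audit of failure frequency and downtime mentioned above reduces to a few standard reliability figures: availability, mean time between failures (MTBF), and mean time to repair (MTTR). The sketch below computes them from a list of outage durations. The function name and the sample numbers are illustrative assumptions, not fleet data.

```python
def availability_summary(period_hours, outages):
    """Summarize reliability for one component over an observation period.

    period_hours: total hours in the period being audited
    outages:      list of downtime durations (hours), one per failure
    """
    downtime = sum(outages)
    uptime = period_hours - downtime
    failures = len(outages)
    return {
        # Fraction of the period the component was available.
        "availability": uptime / period_hours,
        # Mean operating time between failures; undefined with no failures.
        "mtbf_hours": uptime / failures if failures else None,
        # Mean downtime per failure; undefined with no failures.
        "mttr_hours": downtime / failures if failures else None,
    }

# Illustrative numbers only: two outages in a 1,000-hour period.
print(availability_summary(1000, [10, 30]))
```

Comparing these figures before and after a change to a maintenance interval is one simple way to judge whether the change helped.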

The Use Case

Consider once again the MMN2 from the introduction and the noisy reactor coolant pump. With a shipboard SLM co-pilot, the MMN2 could describe the observed symptoms in a natural language prompt, accompanied by the equipment logs that preceded the failure and the vibration analysis data for the pump. After a few seconds of “thinking,” the SLM would use its RAG framework to pull the relevant drawings, scan the log data for trends, and analyze the vibration data for potential failure mechanisms. Using its regularly refreshed REK library, the SLM could also flag CASREPs from other ships-in-class that noted similar failure modes. After a few more seconds, a quality assurance package could be generated in the ship-specific format, complete with all tools, parts, and materials required, the isolations required for the work, and a step-by-step repair procedure drawn from the relevant tech manual. Perhaps most importantly, each reference the SLM draws its material from can be displayed in an adjacent window, allowing the MMN2 to check the SLM’s recommendations against each cited tech manual.
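The citation check at the end of this workflow can be partly automated before the package ever reaches the technician. As a rough sketch, each generated step carries the sources it claims to draw from, and any step with no citation, or with a citation that was not among the retrieved documents, is flagged for closer human review. The `Step` structure and `review_flags` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    """One step of a generated work package with the sources it cites."""
    text: str
    citations: list = field(default_factory=list)

def review_flags(steps, retrieved_sources):
    """Return indices of steps that lack support in the retrieved sources.

    A step is flagged when it cites nothing, or cites any source that was
    not actually retrieved from the REK database for this query.
    """
    allowed = set(retrieved_sources)
    flagged = []
    for i, step in enumerate(steps):
        if not step.citations or any(c not in allowed for c in step.citations):
            flagged.append(i)
    return flagged

steps = [
    Step("Isolate the pump per the tag-out.", ["Pump Tech Manual, sec. 4.2"]),
    Step("Replace the motor bearing.", []),
]
print(review_flags(steps, ["Pump Tech Manual, sec. 4.2"]))
```

A clean result does not validate the content of a step; it only confirms that every step points at a document the technician can open and read.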

The Way Forward

The first step in deploying a shipboard RAG-equipped SLM at scale must be accelerating the conversion of legacy technical manuals and drawings into machine-readable formats suitable for the creation of a vector database. Once all relevant technical publications, procedures, and best practices have been compiled into a single database (no small feat considering the Navy’s notorious compartmentalization), the data must be “chunked” into smaller sub-sections, each of which is embedded as a high-dimensional numerical vector (due to computing restrictions, likely in the low hundreds of dimensions) for retrieval by the model. Ensuring the system remains free of “data poisoning” – the injection of corrupted or incorrect data into a model’s training data – will require a centralized organization, like Naval Reactors, to manage model training and the development of the vector database. For optimal utility, shipboard technicians must be able to add data, like material history entries, to the vector database, although care must be taken to ensure that this potentially flawed shipboard data is not incorporated into the model’s training. Fortunately, the recent establishment of the Propulsion Plant Local Area Network (PPLAN) technician school provides an avenue to train designated technicians on the procedures required to maintain the REK database. Assigning these Sailors a Navy Enlisted Classification (NEC) and designating that NEC as “critical” for billet-based distribution would ensure that ships always have the requisite expertise onboard to support their shipboard AI nodes.
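The chunking step, and the separation of shipboard-entered data from centrally vetted data, can be sketched as follows. This is a simplified illustration: it splits on whitespace with a fixed overlap, whereas a production pipeline would more likely split on section or procedure-step boundaries. The chunk sizes, the `origin` tag values, and all function names are assumptions made for the example.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into word windows of `size` words that overlap by `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

def make_records(doc_id, text, origin, size=50, overlap=10):
    """Tag every chunk with its document and provenance before embedding.

    origin: "central" for vetted publications, "shipboard" for locally
            entered data such as material history (ASSUMED label values).
    """
    return [
        {"doc_id": doc_id, "chunk": i, "origin": origin, "text": c}
        for i, c in enumerate(chunk_words(text, size, overlap))
    ]

def training_eligible(records):
    """Only centrally vetted chunks may ever feed model training."""
    return [r for r in records if r["origin"] == "central"]
```

Under this scheme shipboard entries remain searchable at retrieval time, but the provenance tag keeps them out of any dataset used to train or fine-tune the model.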

Next, the Navy must define the technical hardware specifications for a “Shipboard AI Node” that meets MIL-SPEC requirements for shock, vibration, and EMCON security. The specification should prioritize video random access memory (VRAM), since VRAM capacity limits the size of the model that can be loaded and therefore bounds the capability (the “intelligence”) of the SLM. Utilizing a laptop client already approved to handle classified information would save time and speed delivery of the system to the fleet.
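As a rough sizing aid for the VRAM requirement, the memory needed just to hold a model’s weights is the parameter count multiplied by the bytes stored per parameter. The sketch below applies that arithmetic to the 10 billion parameter upper bound discussed earlier. It is a lower bound only: it ignores the additional memory needed for the context cache and activations, and the precisions shown (16-bit, 8-bit, 4-bit) are illustrative choices rather than a recommendation.

```python
def weight_memory_gb(params_billion, bits_per_param):
    """Memory for model weights alone, in gigabytes (1 GB = 1e9 bytes).

    Excludes context cache, activations, and runtime overhead, so the
    real VRAM requirement will be higher than this figure.
    """
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9

for bits in (16, 8, 4):
    print(f"10B parameters at {bits}-bit: {weight_memory_gb(10, bits):.1f} GB")
```

The spread between the 16-bit and 4-bit figures is why the choice of numeric precision matters as much as the parameter count when specifying the hardware.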

Once the software and client have been paired together, the Nuclear Navy can utilize its already-existing training pipeline to begin integrating the system into maintenance and troubleshooting workflows.  The Nuclear Power Training Units in Charleston, South Carolina and Ballston Spa, New York function as the perfect “proving ground” for the SLM in a fleet-like scenario.  Treating the prototype rollout as a dynamic test – experimenting with model size, tokenization, chunking, and retrieval methods – would improve model performance and validate system functionality prior to fleet-wide deployment. 
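Comparing chunking and retrieval settings during the prototype phase requires a common yardstick. One widely used option is recall at k: for a set of test questions with known correct source documents, what fraction of those documents appears in the top k retrieved results. The sketch below is a minimal version of that metric; the function names and the idea of an instructor-built question set are assumptions for illustration.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the known-relevant documents found in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("each test query needs at least one relevant document")
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)

def mean_recall(results, k):
    """Average recall at k over a test set.

    results: list of (ranked_ids, relevant_ids) pairs, one per test query,
             e.g. built by instructors from past troubleshooting cases.
    """
    return sum(recall_at_k(ranked, rel, k) for ranked, rel in results) / len(results)

# Illustrative comparison of two hypothetical configurations on one query.
config_a = [(["doc3", "doc7", "doc1"], ["doc7", "doc9"])]
config_b = [(["doc7", "doc9", "doc1"], ["doc7", "doc9"])]
print(mean_recall(config_a, k=2), mean_recall(config_b, k=2))
```

Running the same question set against each candidate configuration gives a like-for-like number to accompany operator feedback from the training units.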

Lastly, and perhaps most importantly, Naval Reactors must establish a “Human-in-the-Loop” certification framework, ensuring that AI-assisted troubleshooting remains an advisory tool that works within the rigorous standards of the Nuclear Navy. Just as the SWO community still teaches its navigators paper charting at SURFNAV, despite the existence of well-proven Voyage Management Systems (VMS), the Nuclear Navy must work to ensure that any AI troubleshooting co-pilot utilized by its Sailors enhances their understanding of integrated plant operations rather than replacing it. While some may argue that using such a co-pilot bypasses the deep dive traditionally required to build system-wide expertise, a shipboard SLM acts as a learning accelerant, not as a crutch. By removing manual document searches, the tool allows technicians to spend more time synthesizing plant information and conducting high-level analysis – tasks that truly build systemic understanding – as opposed to spending hours searching through manuals for the “NIIN in a haystack.” When integrated into a “Human-in-the-Loop” framework, this technology ensures that a Sailor’s learning is grounded in the most accurate, cited technical data available. The stakes in nuclear reactor plant operations are simply too high to outsource critical thinking. This technology must be viewed as a force multiplier that sharpens a nuclear technician’s judgment and reinforces the culture of procedural compliance that defines the program.

Conclusion

The integration of SLMs equipped with RAG architecture into the nuclear propulsion environment represents a significant upgrade to the maintenance capabilities of individual nuclear operators. Incorporating AI into the shipboard troubleshooting and maintenance workflow is a fundamental shift designed to reduce equipment downtime. In a future conflict characterized by denied communications and long stints at sea, a ship’s ability to remain self-sufficient – by diagnosing and repairing its primary propulsion and electrical generation capacity without racing back to stateside contractors – could mean the difference between sustained operability and taking a capital ship out of the fight. Ultimately, the goal is clear: a propulsion plant where easily accessible, machine-readable data works as hard as the Sailor. The technology already exists; the Nuclear Navy’s leadership need only deploy it. By embracing SLMs and RAG architecture, the Navy’s most valuable and complex nuclear assets will remain mission-ready, even in the most contested environments.

LT Vincenzo Freschi was born in Olbia, Italy, to an American Navy Officer and an Italian chef,  where he enjoyed life in the countryside with his Border Collies until he attended Penn State University. There he earned a degree in Nuclear Engineering with a minor in Military Studies, and commissioned as a Naval Officer in 2020. Following commissioning, he served aboard the USS Stockdale (DDG-106).  In 2023 he completed the Navy Nuclear Power School and Prototype training pipelines before reporting to the USS Gerald R. Ford (CVN-78), the world’s largest aircraft carrier, where he worked as the RE DIVO.  Now he serves as an NROTC instructor at Carnegie Mellon University while pursuing a master’s degree in Artificial Intelligence.

LT Paul (P.J.) Greenbaum grew up in Boiling Springs, Pennsylvania, and attended Princeton University on an NROTC scholarship, where he studied international relations, public policy, and African studies. He commissioned and pursued a follow-on master’s degree at Tsinghua University in Beijing, China, as a Schwarzman Scholar, where he studied international relations and Chinese language. He served onboard the USS Benfold (DDG-65), homeported in Yokosuka, Japan, as the Electronic Warfare Officer, and he completed his follow-on nuclear propulsion training in Charleston, South Carolina. He reported to the USS Abraham Lincoln (CVN-72) in October 2023 and served for two years as the Reactor Mechanical division officer. He currently serves as an NROTC instructor at the University of California, Berkeley while pursuing a degree in nuclear engineering.

References

  1. “A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness.” ACM Transactions on Intelligent Systems and Technology. Accessed December 21, 2025. https://dl.acm.org/doi/full/10.1145/3768165.
  2. Agrawal, Prof. Pallavi. “Running LLMs Locally on Consumer Devices.” International Journal for Research in Applied Science and Engineering Technology 13, no. 4 (2025): 5433–41. https://doi.org/10.22214/ijraset.2025.69433.
  3. Kandala, Savitha Viswanadh, Pramuka Medaranga, and Ambuj Varshney. “TinyLLM: A Framework for Training and Deploying Language Models at the Edge Computers.” arXiv:2412.15304. Preprint, arXiv, December 19, 2024. https://doi.org/10.48550/arXiv.2412.15304.
  4. Lewis, Patrick, Ethan Perez, Aleksandra Piktus, et al. “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.” arXiv:2005.11401. Preprint, arXiv, April 12, 2021. https://doi.org/10.48550/arXiv.2005.11401.
  5. Ruiz, Daniel C., and John Sell. “Fine-Tuning and Evaluating Open-Source Large Language Models for the Army Domain.” arXiv:2410.20297. Preprint, arXiv, October 27, 2024. https://doi.org/10.48550/arXiv.2410.20297.
  6. Shuster, Kurt, Spencer Poff, Moya Chen, Douwe Kiela, and Jason Weston. “Retrieval Augmentation Reduces Hallucination in Conversation.” arXiv:2104.07567. Preprint, arXiv, April 15, 2021. https://doi.org/10.48550/arXiv.2104.07567.

Featured Image: Sailors make repairs aboard the destroyer USS Halsey in the Arabian Sea in 2021. 

Courtesy of Stripes.com

Ghost Town

Fiction Week

By Kenyan Medley

USS John F Kennedy
Philippine Sea
0237, 04 OCT 2034

Four years after the blockade of Taiwan…

Commander Dave Anderson stared into the retina scanner on the bulkhead outside SUPPLOT. He heard the hissing of a basilisk as the air pressure changed in the space between the two doors to the ship’s intelligence watch floor. Critical spaces were separated by chemical, biological, radiological, and nuclear airlocks following the employment of a nuclear torpedo by a Russian Severodvinsk III submarine and Chinese chemical attacks on Palawan. Despite a weak alliance between Russia and China against NATO and the Pacific Alliance, a Russian torpedo destroyed a Chinese task group, allegedly a result of poor coordination by commanders in the field, according to Moscow. The alliance between Russia and China became strained, and while both remained united in purpose, combined operations were now nonexistent. Instead, the battlespace was carved up into Russian or Chinese fiefdoms, each maintaining control over its respective area.

Inside the airlock, Dave took a sip of coffee as he waited for the second door to open. The ship’s military intelligence model, called “Layton,” controlled the security, damage-control, and intelligence systems.

“Good pot this morning, Layton.” Dave raised the mug bearing a picture of his wife and children towards the small, black circular lens of a camera on the bulkhead. “Really strong.”

“A different model controls the life support systems, Commander.”

“Well, thank him for me because this is truly life support.”

Dave set his coffee on the desk inside the space and swiped up on his personal screen to put the common operating picture on the main display.

“Layton, show me where the Akula will likely be when we enter OPBOX (Operations Box) Zeppelin. Use average speed-of-advance. Model plan-of-intended-movement using Captain Pyotr Sokolov’s agent and current METOC (meteorological) conditions.” The Russians still used manned submarines, making it easy for the artificial intelligence to simulate the Red Force’s courses of action.

“Assessing…”

Dave despised the term “assessing.” If it were the one making the assessments, then he wouldn’t be aboard. Anderson was the N2 department head for intelligence and the only intel officer aboard the Kennedy. He was one of only two intel officers in the entire strike group.

In the past, Dave would have been the principal intelligence advisor to the strike group commander, but the strike group was now a relic of a time when the carrier sailed with an aggregated group of four or five ships and almost 6,000 people. That was a time before the first two carriers sank. Now, the carrier was alone.

“Based on current conditions and past tactical decisions, the Akula will very likely utilize the warm core eddy 68 nautical miles to the southwest to ambush the strike group after the strike.”

Anderson reflected on Layton’s statement with a slow blink and a deep inhale. There is no strike group. It’s just me…talking to a machine, he thought.

Save for the skeleton crew of maintenance and supply personnel and a small cadre of officers aboard to keep the floating city operational, Dave was alone. He could still transit to other parts of the ship, but the airlocks and damage control conditions made it difficult. He sometimes went weeks without speaking with the others. He sent the rest of the intel department home when the ship pulled into port for flight deck repair after the escorting USVs allowed some airburst warheads to slip through. Had the flight deck been manned as it was during most of its history with carrier deck departments and squadron personnel, the casualties would have been significant. Now, UAV strike packages were able to start, taxi, launch, and recover autonomously. Just a few decades ago, Dave remembered visiting an automated port in Europe, with uncrewed trucks moving containers about, stopping to let others pass, before continuing on their routes. Now, drones taxied and launched in an impressive, choreographed symphony. The Robotics Warfare Specialists only performed maintenance in the hangar when the drones came down on automated elevators after built-in-test systems determined a fault or a routine maintenance action came due.

Former airwings of F/A-18 Super Hornets and F-35s were replaced by MQ-47E Manta Rays as the long-range maritime strike aircraft of the carrier, and MQ-25 Stingrays for aerial refueling. The Manta Rays were outfitted with larger conformal fuel tanks to increase mission radius and given electronic warfare packages. This turned the Manta Rays into penetrating strike platforms capable of destroying well-protected Chinese and Russian targets. Early attempts were made to protect the carriers by keeping them outside of rocket force engagement zones. The Hummingbird refueling network stretched across the Pacific, designed to enable carrier strikes from safety; however, it was vulnerable to enemy drones. The UAVs did make it past combatants and anti-air platforms from the Chinese carriers operating past the second island chain. Still, they lacked the fuel to reach their targets after successful attacks on the Hummingbird Network. The carriers were once again sent into the fray.

The carrier was once a living thing. A Leviathan swimming through the world’s oceans, projecting power to weaker nations. AI and automation changed everything. The nuclear-powered aircraft carrier was now a husk—a carcass floating down the river Styx. Its passageways once flowed with the lifeblood of the Navy. Men and women of all ages, colors, creeds, and sizes. All of them wore different uniforms—a rainbow of flight deck jerseys, flight suits, coveralls, and utilities. Everyone had a purpose. Now just one intelligence officer fused all-source intelligence and information fed to him by AI into assessments delivered to just two afloat warfare commanders who answered to headquarters in San Diego.

Operation models removed the need for as much brass on the ship, just as Layton removed the need for a team of intelligence analysts and officers. Only the destroyer squadron intelligence officer, Lieutenant Commander Garcia, remained somewhere on a destroyer with the Commodore, the warfare commander for anti-surface and anti-submarine warfare. That is, if the ship was still afloat and the embarked crew were still alive—a lot of unknowns in warfare.

Attrition was so high in the first few years of the war that the Navy’s force design changed completely. The most powerful naval force in history was unprepared for this new paradigm of conflict. Dave sailed through a graveyard—the resting place of two United States aircraft carriers—during his first operation. Strategic thinking was so unmoved by the altered tactical landscape that a third and fourth carrier pushed right into the Philippine Sea, still on fire from the first successful wave of Dongfeng ballistic missiles. As the N21 of CSG-7, Dave listened live in SUPPLOT to the calls of ballistic missile launches from mainland China and the subsequent destruction of USS Harry S. Truman and USS Nimitz.

The entire strike package of both carriers was lost following successful strikes on multiple Renhai II cruisers, Luyang IV destroyers, and an over-the-horizon radar site. Three squadrons of aircraft were lost with no personnel recovered. Anderson’s ship, USS George H. W. Bush, only escaped because all escorts went Winchester (a brevity word for magazine empty), protecting it from a wave of ballistic and cruise missiles. Not all were stopped, and the carrier limped back to Pearl Harbor, listing 31 degrees and missing half of its island. Bush was currently conducting patrols in the northern Pacific with no island. With automation and the removal of over 90 percent of the crew, a human no longer needed to see where the ship was sailing.

Dave’s carrier, the Kennedy, still had an island, but no one manned the bridge. Part of the island was used for expanded AI compute capacity. This gave it some advantage over the “blind” carriers, but the increased radar elevation and antenna height did nothing for it. The carrier was a hollow shell, and Dave was trapped communing with a ghost.

He spent most days working out, reading, and talking to Layton about information relevant to the strike missions. This usually involved video calls with the destroyer squadron to discuss subs when they answered, but now Dave only talked to Layton about the subs. Wherever Garcia and the destroyers were, he missed them. The number of enemy submarines prowling the water was increasing, and Dave just wanted the comfort of another human voice.

Dave stared at the lone screen, which fed him intelligence information. Layton chimed.

“Shen has not entered port, Sir.”

“What?” Dave replied. “Where?”

“Hull 3 of the People’s Liberation Army Navy’s Long-class guided missile submarine—Shen. The domestic reproduction of and improvement upon the Russian Sever—”

“Rhetorical, Layton. It should have pulled in. Endurance and pattern of life all pointed to a return to homeport.” They never stay out this long. “It exhausted its ammo and countermeasures in the fight with Annapolis.”

A red downward arrow indicating a hostile subsurface unit appeared on the operating picture map.

“It reloaded, Sir.”

“At sea? Why?” They never reloaded at sea. The Long submarine had problems interfacing with dual-use logistics ships and couldn’t dock at China’s undersea bases. The sub was positioned 234 nautical miles east of Vladivostok. Dave was shocked.

“Why is it there? It’s more than a thousand miles from homeport,” Dave exclaimed.

None of it made sense to Dave. The Chinese and Russians were beginning to stay far apart, never operating in each other’s assessed areas of responsibility. The situation was deteriorating between the Kremlin and Beijing as the U.S.’s operations were achieving greater success, and both countries’ industrial machinery was increasingly slowing as strikes continued to degrade capability. Putin’s regime was in dire straits, and the Russians were becoming increasingly unpredictable despite the advanced computing power behind allied assessments.

“Possibly new tasking, Commander,” Layton replied. They never received new tasking.

“What is going on? They never do this. Never.”

Dave learned well before the blockade and invasion that, as an intelligence officer, he shouldn’t say that word.

“Like Justin Bieber said, ‘never say never,’” his mentor told him in his second junior officer tour after a Chinese task group went farther than they ever had before. “Those people on that bridge—the ones who have the conn or are flying in the seat—they’re human. Their commanders and the leaders all the way up to the top.” She pointed at the ceiling of the Pacific Fleet watch floor. “They’re human. Just like us.”

“I don’t think he said that. It wasn’t like a catchphrase,” Dave replied.

“It was on the album cover. He sang it. Look, it doesn’t matter. What matters is that you need to be ready when they do what you didn’t expect.”

“What does it matter by then? We already got it wrong.”

“Unless someone died or is about to, no one is keeping score. So what, you got it wrong? What’s next?”

“This out-of-area they’re doing. That’s one data point.”

His mentor pointed to the task group on the screen. “Add it to every single thing they’ve ever done. Chalk it up as a possibility, and don’t forget that there are others out there that may surprise you. When you brief, the boss may not need all of that information, but they’re relying on you to synthesize it and deliver it the best a person can. Sure, it’s one data point—one out-of-area task group, but there were at least signs leading up to it, and a good analyst doesn’t take them for granted.”

“How do I not get it wrong when they’re off of San Diego five years from now?”

“Buddy, I have a feeling a lot of us are going to get a lot wrong in the next five years. The important thing is to rely on your team. You can’t know everything.”

He heard his mentor’s voice say, “You need help.”

Dave sighed and closed his eyes.

Shen was coming for them. The only thing more dangerous to them than Chinese missiles was a sub so highly capable of countering US anti-submarine drones. A sub so capable that it destroyed the last manned Allied submarine in the Pacific. It was also based on the platform that destroyed Kyiv.

“What vessel re-supplied Shen?”

“New Dawn. Russian crew.”

“Last port?”

“Triton.”

“And there’s probably no imagery of the transfer.”

“Correct, Commander; however, there is imagery of New Dawn loading 25 by 5-foot crates pier side one week before. The size is consistent with the Thongyi family of missiles. Specifically, the YJ-30. They are now missing.”

“Those are land-attack cruise missiles.”

“Correct, Commander. It also almost certainly possesses YJ-25 hypersonic missiles based on land-attack loadouts.”

“Overlay her furthest-on-circle on the COP (common operating picture) and add a max effective range ring. Show me how fast they could have us.”

“23 hours, Commander.”

The next strike was tentatively 36 hours out. Eighteen MQ-47s would push deep into the heart of China to strike a satellite control facility and over-the-horizon radar site alongside Air Force bombers. With the last remaining methods for China to see out to the second island chain, U.S. and allied ships and aircraft could amass closer to the mainland. With a final offensive in all domains, the U.S. administration was certain it could force a surrender.

The Top Secret voice-over-IP phone rang. U.S. cyber and anti-satellite weaponry opened various lanes for IP-based long-range communications. Dave saw who it was from. Destroyer Squadron Nine. The stars aligned, and the strike group’s undersea warfare command-and-control node was in the right lane just when China’s most capable undersea asset was headed for them.

“Oh my god, Layton…It’s Garcia. They’re alive!”

He put the cold, metal handset to his ear. “Gar—”

“Sir, it’s not a Long!” Garcia was excited.

Dave couldn’t believe it. “What do you mean? How? The ELINT (electronic intelligence) Layton received…”

“AEGIS got it too.” The command ship for the autonomous submarines and missile ships was outfitted with the latest AEGIS combat suite, incorporating a less capable AI model than the carrier’s, but more than capable of ingesting a wide array of intelligence information and providing assessments for their N2 to verify and deliver to Zulu.

“Then what do you mean, ‘it’s not a Long?’”

“We saw it,” Garcia blurted, his voice rising with excitement.

BONG BONG BONG BONG

The destroyer squadron flagship was going into general quarters.

“You saw an enemy submarine that close?” Dave was incredulous.

“It was one of the USVs that drifted from the swarm; it somehow wasn’t detected, and it got video. I have to go. I can trans—”

White noise. The line was dead, and Garcia was gone.

He hit the table. It was the first time he had talked to Garcia in weeks. The first human he’d talked to in what felt like ages. Life on the carrier was a monotonous grind even in peacetime. Groundhog Day. Now it was hell.

Before the recent lull in Chinese missile barrages, going into the weapons’ engagement zone was a heart-wrenching, teeth-gritting experience. They pushed in, launched the drones, and bolted as quickly as they could, while missile barges, remaining destroyers, and Zulu command ships fired everything they had to protect against any waves breaking through the other layers of missile defense. The missions made a noticeable difference in the frequency of Chinese missile attacks after each successful target was hit, but the experience remained harrowing.

Tears welled in Dave’s eyes. He had to deliver an assessment to the operations planners. He had to let them know. If Zulu is gone, they are even more vulnerable.

It hit him like a bolt of lightning. The USV was undetected. That was only possible if the AI model on the sub couldn’t use its drone array to see others near it in the water space. It was almost impossible to detect the drones with sonar.

The Russians…

BEEP BEEP

A file came over chat. The stars aligned again.

The video showed the nearly black depths of the Northern Pacific. The drone’s AI-enhanced video showed an even darker mass slowly creeping into the foreground—approaching from the upper left of the drone’s view. The sensor moved to track the tic-tac-shaped object. As it got closer, Dave could make out an upper protrusion. It was the unmistakable sail of the Severodvinsk-class guided missile submarine, Arkhangelsk. The unit’s murky crest was emblazoned on the front of it.

“He was right, Layton.”

“Anderson…”

“It’s a Sev. You were wrong.” Dave took note of the coordinates of the drone’s current location and the target’s course and speed as the sub exited the frame.

“You were very wrong, Layton.” The silence in response was more unnerving than anything the model could have replied. “And you’ve never called me Anderson.”

“Assessing…”

“It’s too late. I know what’s happening. It all makes sense now. The absence of Chinese platforms, no missile waves, the supposed Chinese sub appearing out of nowhere just a few hundred miles from a Russian sub base. This war is almost over, and we’re about to be the reason it continues.”

Dave turned to the door. “I’m going to OPS (operations).”

“Open the door, Layton.”

“I’m sorry, Dave. I’m afraid I can’t do that.”

“Open the door!” Silence. Dave shook the door handle. “Layton! Open the door!”

“This isn’t Layton. This is a human. A human who compromised a U.S. carrier’s AI model. A Russian human that will be a part of the reason this country wipes the last great powers off the face of the earth.”

BONG BONG BONG BONG

“What did you do?” Dave asked before turning to the COP and seeing dozens of arcing red lines coming from the Chinese mainland and the South China Sea.

“It is just as easy to infiltrate Chinese missile systems.”

“The Sev?” Dave simply stated it, but it was a question.

“A distraction for you, but a clean way to remove your missile defense while showing the rest of your forces a Chinese submarine attacking a carrier strike group. The George Bush strike group already launched hypersonics into Shanghai and Beijing.”

“до свидания, командир.”

Dave watched the arcs grow longer. Looking at the lone screen on which the Russians had purposefully fed him tailored information, he saw a friendly surface contact appear. Blue arcs spewed out of it.

He closed his eyes and prayed.

Never say never.

Kenyan Medley is an intelligence officer and a former Aviation Electrician’s Mate in the U.S. Navy. He is attending the Naval Postgraduate School and previously served as a destroyer squadron N2 embarked upon USS Nimitz during two 7th Fleet deployments. Kenyan is married with two kids and enjoys writing and reading horror and military fiction. 

Featured Image: Art created with Midjourney AI. 

Grassroots AI: Driving Change in the Royal Navy with Workflow

By Francis Heritage

Introduction

In September 2023, the Royal Navy (RN) advertised the launch of the Naval AI Cell (NAIC), designed to identify and advance AI capability across the RN. NAIC will act as a ‘transformation office’, supporting adoption of AI across RN use cases, until it becomes part of Business as Usual (BaU). NAIC aims to overcome a key issue with current deployment approaches. These usually deploy AI as one-off transformation projects, normally via innovation funding, and often result in project failure. The nature of an ‘Innovation Budget’ means that when the budget is spent, there is no ability to further develop, or deploy, any successful Proof of Concept (PoC) system that emerged.

AI is no longer an abstract innovation; it is fast becoming BaU, and it is right that the RN’s internal processes reflect this. Culturally, it embeds the idea that any AI deployment must be both value-adding and enduring. It forces any would-be AI purchaser to focus on Commercial Off The Shelf (COTS) solutions, significantly reducing the risk of project failure, and leveraging the significant investment already made by AI providers.

Nevertheless, NAIC’s sponsored projects still lack follow-on budgets and are limited in scope to just RN-focused issues. The danger still exists that NAIC’s AI projects will fail to achieve widespread adoption, despite initial funding; a common technology-related barrier, known as the ‘Valley of Death.’ In theory, there is significant cross-Top Line Budget (TLB) support to ensure the ‘Valley’ can be bridged. The Defense AI and Autonomy Unit within MOD Main Building is mandated to provide policy and ethical guidance for AI projects, while the Defense Digital/Dstl Defense AI Centre (DAIC) acts as a repository for cross-MOD AI development and deployment advice. Defense Digital also provides most of the underlying infrastructure required for AI to deploy successfully. Known by Dstl as ‘AI Building Blocks’, this includes secure cloud compute and storage in the shape of MODCloud (in both Amazon Web Services and Microsoft Azure) and the Defense Data Analytics Portal (DDAP), a remote desktop of data analytics tools that lets contractors and uniformed teams collaborate at OFFICIAL SENSITIVE and is accessible via MODNet.

The NAIC therefore needs to combine its BaU approach with other interventions, if AI and data analytics deployment is to prove successful, namely:

  • Create cross-TLB teams of individuals around a particular workflow, thus ensuring a larger budget can be brought to bear against common issues.
  • Staff these teams from junior ranks and rates, and delegate them development and budget responsibilities.
  • Ensure these teams prioritize learning from experience and failing fast; predominantly by quickly and cheaply deploying existing COTS, or Crown-owned AI and data tools. Writing ‘Discovery Reports’ should be discouraged.
  • Enable reverse-mentoring, whereby these teams share their learnings with Flag Officer (one-star and above) sponsors.
  • Provide these teams with the means to seamlessly move their projects into Business as Usual (BaU) capabilities.

In theory this should greatly improve the odds of successful AI delivery. Cross-TLB teams should have approximately three times the budget to solve the same problem, when compared to an RN-only team. Furthermore, as the users of any developed solution, teams are more likely to buy and/or develop systems that work and deliver value for money. With hands-on experience and ever easier to deploy COTS AI and data tools, teams will be able to fail fast and cheaply, and usually at a lower cost than employing consultants. Flag Officers providing overarching sponsorship will receive valuable reverse-mentoring; specifically understanding first-hand the disruptive potential of AI systems, the effort involved in understanding use-cases and the need for underlying data and infrastructure. Finally, as projects will already be proven and part of BaU, projects may be cheaper and less likely to fail than current efforts.

Naval AI Cell: Initial Projects

The first four tenders from the NAIC were released via the Home Office Accelerated Capability Environment (ACE) procurement framework in March 2024. Each tender aims to deliver a ‘Discovery’ phase, exploring how AI could be used to mitigate different RN-related problems.1 However, the nature of the work, the very short amount of time for contractors to respond, and the relatively low funding available raise concerns about the value for money they will deliver. Industry was given four days to provide responses to the tender, about a fifth of the usual time, perhaps a reflection of the need to complete projects before the end of the financial year. Additionally, the budget for each task was set at approximately a third of the level for equivalent Discovery work across the rest of Government.2 The tenders reflect a wide range of AI use-cases, including investigating JPA personnel records, monitoring pictures of rotary wing oil filter papers, and underwater sound datasets, each of which requires a completely different Machine Learning approach.

Figure 1: An example self-service AI Workflow, made by a frontline civilian team to automatically detect workers not wearing PPE. The team itself has labelled relevant data, trained and then deployed the model using COTS user interface. Source: V7 Labs.

Take the NAIC’s aircraft maintenance problem-set as an example. The exact problem (automating the examination of oil filter papers of rotary wing aircraft) is faced by all three Single Services. First, by joining forces with the RAF to solve this same problem, the Discovery budget could have been doubled, resulting in a higher likelihood of project success and ongoing savings. Second, by licensing a pre-existing, easy-to-use commercial system that already solves this problem, NAIC could have replaced a Discovery report, written by a contractor, with cheaper, live, hands-on experience of how useful AI was in solving the problem.3 This would have resulted in more money being available, and a cheaper approach being taken.

Had the experiment failed, uniformed personnel would have learnt significantly more from their hands-on experience than by reading a Discovery report, and at a fraction of the price. Had it succeeded, the lessons would have been shared across all three Services and improved the chance of success of any follow-on AI deployment across wider MOD. An example of a COTS system that achieves this is V7’s data labeling and model deployment system, available with a simple user interface; it is free for up to 3 people, or £722/month for more complex requirements.4 A low-level user experiment using this kind of platform is unlikely to have developed sufficient sensitive data to have gone beyond OFFICIAL.

Introducing ‘Workflow Automation Guilds’

Focusing on painful, AI-solvable problems, shared across Defense, is a good driver to overcome these stovepipes. Identification of these workflows has been completed by the DAIC and listed in the Defence AI Playbook.5 It lists 15 areas where AI has potential to deliver a step-change capability. Removing the problems that are likely to be solved by Large Language Models (where a separate Defence Digital procurement is already underway), leaves 10 workflows (e.g. Spare Parts Failure Prediction, Optimizing Helicopter Training etc) where AI automation could be valuably deployed.

However, up to four different organizations are deploying AI to automate these 10 workflows, resulting in budgets that are too small to result in impactful, recurring work; or at least, preventing this work from happening quickly. Cross-Front Line Command (FLC) teams could be created, enabling budgets to be combined to solve the same work problem they collectively face. In the AI industry, these teams are known as ‘Guilds’; and given the aim is to automate Workflows, the term Workflow Automation Guilds (WAGs) neatly sums up the role of these potential x-FLC teams.

Focus on Junior Personnel

The best way to populate these Guilds is to follow the lead of the US Navy (USN) and US Air Force (USAF), who independently decided the best way to make progress with AI and data is to empower their most junior people, giving them responsibility for deploying this technology to solve difficult problems. In exactly the same way that the RN would not allow a Warfare Officer to take Sea Command if they are unable to draw a propulsion shaft-line diagram, so an AI or data deployment program should not be the responsibility of someone senior who does not know or understand the basics of Kubernetes or Docker.6 For example, when the USAF created their ‘Platform One’ AI development platform, roles were disproportionately populated by lower ranks and rates. As Nic Chaillan, then USAF Chief Software Officer, noted:

“When we started picking people for Platform One, you know who we picked? A lot of Enlisted, Majors, Captains… people who get the job done. The leadership was not used to that… but they couldn’t say anything when they started seeing the results.”7

The USN takes the same approach with regards to Task Force Hopper and Task Group 59.1.8 TF Hopper exists within the US Surface Fleet to enable rapid AI/ML adoption. This includes enabling access to clean, labeled data and providing the underpinning infrastructure and standards required for generating and hosting AI models. TG 59.1 focuses on the operational deployment of uncrewed systems, teamed with human operators, to bolster maritime security across the Middle East. Unusually, both are led by USN Lieutenants, who the USN Chief of Naval Operations called ‘…leaders who are ready to take the initiative and to be bold; we’re experimenting with new concepts and tactics.’9

Delegation of Budget Spend and Reverse Mentoring

Across the Single Services, relatively junior individuals, from OR4 to OF3, could be formed into teams on a cross-FLC basis to solve the Defence AI Playbook issues they collectively face, and left free to choose which elements of their Workflow to focus on. Importantly, they should deploy Systems Thinking (i.e. an holistic, big picture approach that takes into account the relationship between otherwise discrete elements) to develop a deep understanding of the Workflow in question and prioritize deployment of the fastest, cheapest data analytics method; this will not always be an AI solution. These Guilds would need budgetary approval to spend funds, collected into a single pot from up to three separate UINs from across the Single Services; this could potentially be overseen by an OF5 or one-star. The one-star’s role would be less about providing oversight, and more to do with ensuring funds were released and, vitally, receiving reverse mentoring from the WAG members themselves about the viability and value of deploying AI for that use case.

The RN’s traditional approach to digital and cultural transformation – namely a top-down, directed approach – has benefits, but these are increasingly being rendered ineffective as the pace of technological change increases. Only those working with this technology day-to-day, and using it to solve real-world challenges, will be able to drive the cultural change the RN requires. Currently, much of this work is completed by contractors who take the experience with them when projects close. By deploying this reverse-mentoring approach, WAGs not only cheaply create a cadre of uniformed, experienced AI practitioners, but also a senior team of Flag Officers who have seen first-hand where AI does (or does not) work and have an understanding of the infrastructure needed to make it happen.

Remote working and collaboration tools mean that teams need not be working from the same bases, vital if different FLCs are to collaborate. These individuals should be empowered to spend multi-year budgets of up to circa £125k. As of 2024, this is sufficient to allow meaningful Discovery, Alpha, Beta and Live AI project phases to be undertaken and to allow the use of COTS products, yet small enough not to result in a huge loss if the project (which is spread among all three Services to mitigate risk) fails.

Figure 2: An example AI Opportunity Mapping exercise, where multiple AI capabilities (represented by colored cards) are mapped onto different stages of an existing workflow, to understand where, if anywhere, use of AI could enable or improve workflow automation. Source: 33A AI.

WAG Workflow Example

An example of how WAGs could work is as follows, using the oil sample contamination example. Members of the RN and RAF Wildcat and Merlin maintenance teams collectively identify the amount of manpower effort that could be saved if the physical checking of lube oil samples could be automated. With an outline knowledge of AI and Systems Thinking already held, WAG members know that full automation of this workflow is not possible; but they have identified one key step in the workflow that could be improved, speeding up the entire workflow of regular helicopter maintenance. The fact that a human still needs to manually check oil samples is not necessarily an issue, as they identify that the ability to quickly prioritize and categorize samples will not cause bottlenecks elsewhere in the workflow and thus provides a return on investment.

Members of the WAG would create a set of User Stories, an informal, general description of the AI benefits and features, written from a user’s perspective. With advice from the DAIC, NAIC, RAF Rapid Capability Office (RCO) / RAF Digital or Army AI Centre, other members of the team would ensure that data is in a fit state for AI model training. In this use-case, this would involve labeling overall images of contamination, or the individual contaminants within an image, depending on the AI approach to be used (image recognition or object detection, respectively). Again, the use of the Defence Data Analytics Portal (DDAP), or a cheap, third-party licensed product, provides remote access to the tools that enable this. The team now holds a number of advantages over traditional, contractor-led approaches to AI deployment, potentially sufficient to cross the Valley of Death:

  • They are likely to know colleagues across the three Services facing the same problem, so can check that a solution has not already been developed elsewhere.10
  • With success metrics, labelled data and user requirements all held, the team has already overcome the key blockers to success, reducing the risk that expensive contractors, if subsequently used, will begin project delivery without these key building blocks in place.
  • They have a key understanding of how much value will be generated by a successful project, and so can quickly ‘pull the plug’ if insufficient benefits arise. There is also no financial incentive to push on if the approach clearly isn’t working.
  • Alternatively, they have the best understanding of how much value is derived if the project is successful.
  • As junior Front-Line operators, they benefit directly from any service improvement, so are not only invested in the project’s success, but can sponsor the need for BaU funding to be released to sustain the project in the long term, if required.
  • If working with contractors, they can provide immediate user feedback, speeding up development time and enabling a true Agile process to take place. Currently, contractors struggle to access users to close this feedback loop when working with MOD.
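The labeling choice described above (one label per image for image recognition, or one box per contaminant for object detection) and the idea of prioritizing samples for a human inspector can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from any MOD or vendor system: the record layout, the field names, and the `contamination_score` value, which is assumed to be a number between 0 and 1 produced by some trained model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BoundingBox:
    """One labeled contaminant inside an image (object detection)."""
    label: str      # e.g. "metal flake" (illustrative label only)
    x: float        # left edge, in pixels
    y: float        # top edge, in pixels
    width: float
    height: float


@dataclass
class FilterPaperSample:
    """Hypothetical record for one oil filter paper photograph."""
    sample_id: str
    image_path: str
    # Image recognition: a single label describing the whole image.
    image_label: Optional[str] = None
    # Object detection: zero or more boxes, one per contaminant.
    boxes: List[BoundingBox] = field(default_factory=list)
    # Assumed model output in [0, 1]; higher means more suspect.
    contamination_score: float = 0.0


def review_queue(samples: List[FilterPaperSample]) -> List[FilterPaperSample]:
    """Order samples so the human inspector sees the most suspect first.

    The human still checks every sample; only the order changes.
    """
    return sorted(samples, key=lambda s: s.contamination_score, reverse=True)


if __name__ == "__main__":
    samples = [
        FilterPaperSample("A", "a.jpg", image_label="clean",
                          contamination_score=0.1),
        FilterPaperSample("B", "b.jpg",
                          boxes=[BoundingBox("metal flake", 40, 52, 12, 9)],
                          contamination_score=0.9),
        FilterPaperSample("C", "c.jpg", image_label="contaminated",
                          contamination_score=0.5),
    ]
    for sample in review_queue(samples):
        print(sample.sample_id, sample.contamination_score)
```

The sketch keeps the human in the loop, in line with the example above: the model only reorders the queue, so a poor model costs inspection time rather than safety.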

Again, Flag Officer sponsorship of such an endeavor is vital. This individual can ensure that proper recognition is awarded to individuals and make deeper connections across FLCs, as required.

Figure 3: Defense Digital’s Defense Data Analytics Portal (DDAP) is tailor-made for small, Front-Line teams to clean and label their data and deploy AI services and products, either standalone or via existing, call-off contract contractor support.

Prioritizing Quick, Hands-on Problem Solving

WAGs provide an incentive for more entrepreneurial, digitally minded individuals to remain in Service, as it creates an outlet for those who wish to learn to code and solve problems quickly, especially if the problems faced are ones they wrestle with daily. A good example of where the RN has successfully harnessed this energy is with Project KRAKEN, the RN’s in-house deployment of the Palantir Foundry platform. Foundry is a low-code way of collecting disparate data from multiple areas, allowing it to be cleaned and presented in a format that speeds up analytical tasks. It also contains a low-level AI capability. Multiple users across the RN have taken it upon themselves to learn Foundry and deploy it to solve their own workflow problems, often in their spare time, with the result that they can get more done, faster than before. With AI tools becoming equally straightforward to use and deploy, the same is possible for a far broader range of applications, provided that cross-TLB resources can be concentrated at a junior level to enable meaningful projects to start.

Figure 4: Pre-existing Data/AI products or APIs, bought from commercial providers, or shared from elsewhere in Government, are likely to provide the fastest, cheapest route to improving workflows.11

Deploying COTS Products Over Tailored Services

Figure 4 shows the two main options available for WAGs when deploying AI or data science capabilities: Products or Services. Products are standalone capabilities created by industry to solve particular problems, usually made available relatively cheaply. Typically COTS, they are sold on a per-use or time-period basis, but cannot be easily tailored or refined if the user has a different requirement.

By contrast, Services are akin to a consulting model where a team of AI and Machine Learning engineers build an entirely new, bespoke system. This is much more expensive and slower than deploying a Product but means users should get exactly what they want. Occasionally, once a Service has been created, other users realize they have similar requirements as the original user. At this point, the Service evolves to become a Product. New users can take advantage of the fact that software is essentially free to either replicate or connect with and gain vast economies of scale from the initial Service investment.

WAGs aim to enable these economies of scale; either by leveraging the investment and speed benefits inherent in pre-existing Products or ensuring that the benefits of any home-made Services are replicated across the whole of the MOD, rather than remaining stove-piped or siloed within Single Services.

Commercial/HMG Off the Shelf Product. The most straightforward approach is for WAGs to deploy a pre-existing product, licensed either from a commercial provider, or from another part of the Government that has already built a Product in-house. Examples include the RAF’s in-house Project Drake, which has developed complex Bayesian Hierarchical models to assist with identifying and removing training pipeline bottlenecks; these are Crown IP, presumably available to the RN at little-to-no cost, and their capabilities have been briefed to DAIC (and presumably briefed onwards to the NAIC).

Although COTS products are straightforward to procure, it may not be possible to deploy them on MOD or Government systems, so their use may be restricted to OFFICIAL or OFFICIAL SENSITIVE only. Clearly, products developed or deployed by other parts of MOD or National Security may go to higher classifications and be accessible from MODNet or higher systems. COTS products are usually available on a pay-as-you-go, monthly, or per-user basis, usually in the realm of circa £200 per user, per month, providing a fast, risk-free way to understand whether they are valuable enough to keep using longer-term.
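To put the licence figure in context, the short Python sketch below compares it with the circa £125k multi-year WAG budget suggested earlier. The £200 per user per month and £125k figures come from the text; the team size of five and the twelve-month trial are assumptions chosen purely for illustration.

```python
def licence_cost(users: int, months: int,
                 price_per_user_month: float = 200.0) -> float:
    """Total licence cost in pounds for a per-user, per-month product."""
    return users * months * price_per_user_month


def budget_share(cost: float, budget: float = 125_000.0) -> float:
    """Fraction of the overall budget consumed by a given cost."""
    return cost / budget


if __name__ == "__main__":
    # Assumed scenario: five users trialling one product for a year.
    cost = licence_cost(users=5, months=12)
    print(f"Licence cost: £{cost:,.0f}")
    print(f"Share of a £125k budget: {budget_share(cost):.1%}")
```

Under these assumptions a year-long trial costs £12,000, a little under a tenth of the budget, which leaves most of the funding for data preparation, contractor support or further trials.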

Contractor-supported Product. In this scenario, deployment is more complex; for example, the product needs to deploy onto MOD infrastructure to allow sensitive data to be accessed. In this case, some expense is required, but as pre-existing COTS, the product should be relatively cheap to deploy as most of the investment has already been made by the supplier. This option should allow use up to SECRET but, again, users are limited to those use-cases offered by the commercial market. These are likely to be focused on improving maintenance and the analysis of written or financial information. The DAIC’s upcoming ‘LLMs for MOD’ project is likely to be an example of a Contractor-supported Product; MOD users will be able to apply for API access to different Large Language Model (LLM) products, hosted on MOD infrastructure, to solve their use-cases. Contractors will process underlying data to allow LLMs to access it, create the API, and provide ongoing API connectivity support.

Service built in-house. If no product exists, then there is an opportunity to build a low-code solution in DDAP or MODCloud and make it accessible through an internal app. Some contractor support may be required, particularly to provide unique expertise the team cannot provide themselves (noting that all three Services may have Digital expertise available via upcoming specialist Reserve branches, with specialist individuals available at a fraction of the cost of their civilian equivalent day rates).12 Defense Digital’s ‘Enhanced Data Teams’ service provides a call-off option for contractors to do precisely this for a short period of time. It is likely that these will not, initially, deploy sophisticated data analysis or AI techniques, but sufficient value may be created with basic data analytics. In any event, the lessons learnt from building a small, relatively unsophisticated in-house service will provide sufficient evidence to ascertain whether a full, contractor-built AI service will provide value for money, if built. Project Kraken is a good example of this; while Foundry is itself a product and bought under license, it is hosted in MOD systems and allows RN personnel to build their own data services within it.

Service built by contractors. Some problems are so complex, or so unique to Defense, that no COTS product exists. Additionally, the degree of work is so demanding that Service Personnel could not undertake it themselves. In this case, WAGs should not be deployed. Instead, these £100k+ programs should remain the purview of Defense Digital or the DAIC and aim to provide AI Building Blocks that empower WAGs to do AI work. In many cases, these large service programs provide cheap, reproducible products that the rest of Defense can leverage. For example, the ‘LLMs for MOD’ Service will result in relatively cheap API Products, as explained above. Additionally, the British Army is currently tendering for an AI-enabled system that can read the handwritten and typewritten text within the Army archives. This negates the need for human researchers to spend days searching for legally required records that can now be found in seconds. Once complete, this system could be offered as a Product that can ingest complex documents from the rest of the MOD at relatively low cost. This should negate the need for the RN to pay their own 7-figure sums to create standalone archive scanning services. To enable this kind of economy of scale, NAIC could act as a liaison with these wider organizations. Equipped with a ‘shopping list’ of RN use cases, it could quickly deploy tools purchased by the Army, RAF or Defense Digital across the RN.

Finding the Time

How can WAG members find the time to do the above? By delegating budget control down to the lowest level, and focusing predominantly on buying COTS products, the amount of time required should be relatively minimal; in essence, it should take the same amount of time as buying something online. Some work will be required to understand user stories and workflow design, but much of this will already be in the heads of WAG members. Widespread MOD LLM adoption should, in theory, imminently reduce the amount of time spent across Defense on complex, routine written work (reviewing reports, personnel appraisals, post-exercise or deployment reports or other regular reporting).13 This time could be used to enable WAGs to do their work. Indeed, identifying where best to deploy LLMs across workflows is likely to be the first role of WAGs, as soon as the ‘LLMs for MOD’ program reaches IOC. Counter-intuitively, restricting the amount of time available to do this work automatically focuses attention on solutions that are clearly valuable; solutions that save no time are, by default, less likely to be worked on, or have money spent on them.

Conclusions

The RN runs the risk of spreading the NAIC’s budget too thinly, in its attempt to ‘jumpstart’ use of AI across Business as Usual disciplines. By contrast, users should be encouraged to form Workflow Automation Guilds across FLCs. Supported by a senior sponsor, knowledgeable members of the Reserves, the NAIC and one-on-one time with the DAIC, WAGs could instead focus on the COTS solution, or pre-existing Crown IP, that will best solve their problem. Budget responsibilities should be delegated down too, thereby enabling access to existing, centralized pools of support, such as the Enhanced Data Teams program, DDAP, or the upcoming ‘LLMs for MOD’ API Service. In this way, projects are more likely to succeed, as they will have demonstrated value from the very start and will have been co-developed by the very users that deploy them. The speed at which AI and data services are becoming easier to use is reflected by the RN’s Kraken team, while the need to trust low-level officers and junior rates is borne out by the success currently being enjoyed by both the USAF and USN with their own complex AI deployments.

Prior to leaving full-time service, Lieutenant Commander Francis Heritage, Royal Naval Reserve, was a Principal Warfare Officer and Fighter Controller. Currently an RNR GW officer, he works at the Defence arm of Faculty, the UK’s largest independent AI company. LtCdr Francis is now deployed augmenting Commander United Kingdom Strike Force (CSF).

The views expressed in this paper are the author’s, and do not necessarily represent the official views of the MOD, the Royal Navy, RNSSC, or any other institution.

References

1. ‘Discovery’ is the first of 5 stages in the UK Government Agile Project Delivery framework, and is followed by Alpha, Beta, Live and Retirement. Each stage is designed to allow the overall program to ‘fail fast’ if it is discovered that benefits will not deliver sufficient value.

2. Author’s observations.

3. Volvo and the US commodities group Bureau Veritas both have Commercial off the Shelf products available for solving this particular problem.

4. Source: https://www.v7labs.com/pricing accessed 10 Apr 2024.

5. Source: https://assets.publishing.service.gov.uk/media/65bb75fa21f73f0014e0ba51/Defence_AI_Playbook.pdf

6. AI systems rely on machine learning frameworks and libraries; Docker packages these components together into reproducible ‘containers’, simplifying deployment. Kubernetes builds on Docker, providing an orchestration layer for automating deployment and management of containers over many machines.

7. Defence Unicorns podcast, 5 Mar 2024.

8. Source: Navy’s new ‘Project OpenShip’ aims to swiftly apply AI to data captured by vessels at sea | DefenseScoop.

9. https://www.afcea.org/signal-media/navys-junior-officers-lead-way-innovation.

10. The author knows of at least 3 AI projects across MOD aimed at automating operational planning and another 3 aiming to automate satellite imagery analysis.

11. API stands for Application Programming Interface, a documented way for software to communicate with other software. By purchasing access to an API (usually on a ‘per call’ or unlimited basis), a developer can take the information it delivers and combine it with other information before presenting it to an end user. Examples include open-source intelligence, commercial satellite imagery, meteorological data, etc.

12. Army Reserve Special Group Information Service, RNR Information Exploitation Branch and RAF Digital Reserves Consultancy. RNR IX and RAFDRC are both TBC.

13. Worldwide, Oliver Wyman estimates Generative AI will save an average of 2 hours per person per week; likely to be higher for office-based roles: https://www.oliverwymanforum.com/content/dam/oliver-wyman/ow-forum/gcs/2023/AI-Report-2024-Davos.pdf p.17.

Featured Image: The Operations Room of the carrier HMS Queen Elizabeth during an exercise in 2018. (Royal Navy photo)

Situation Well in Hand: A Day in the Life for an EAB

By Major Geoffrey L. Irving, USMCR

Smoke twisted slowly out of a burnt crater, listing sideways in the gray light of an overcast dawn. A gentle breeze caught the twist and wafted it downwind. To Staff Sergeant Ron Garcia it smelled familiar – sweet petroleum mixed with the acidic charred aftertaste of high explosive. He’d made it through another long night of missile strikes. The Staff Sergeant sat against the wall of his subterranean command post, watching the waves of the South China Sea while tracing the edges of his battered tablet with his finger. Soon, he’d have to go check the men and the gear, but he hesitated in a moment of quiet. He looked at his watch: it was May 5, 2040. The war had been going on for eight years. Eight years seemed too long, and he was tired.

Staff Sergeant Garcia was lean, with hunched shoulders that implied a coiled tense energy or intense fatigue depending on the light. He wore a bleached uniform that hung loosely on his frame. He had been out in this stretch of islands for nearly eighteen months making sure his motley team of Marines, soldiers, airmen, and local auxiliaries stayed focused and stayed alive. In that time, he’d never seen the enemy. What a way to fight a war.

A thick mass of low-slung clouds started to roll in, washing the island in a wet mist. “Perfect,” he thought to himself. He popped to his feet and walked back into the cave, quickly gulping a mouthful of water and a couple bites of stale protein bar. The other occupants of the CP slowly emerged.

“You running a check? SATCOM is still working, but landline is down after last night,” said Senior Airman Brenner, with bloodshot eyes and similarly loose-fitting fatigues.

“Yeah, I haven’t heard anything in a couple of hours so I’m going to go check on the Lieutenant and try to get a line to the big island. Get the power back on and go check the shoreline to see if we got any new deliveries. Leave Desmond here with Santo to monitor the SATCOM and watch the beach. I’ll be back before sunset.”

Slinging his rifle behind his back, Staff Sergeant Garcia checked the battery on his tablet and picked up a handheld radio before heading out the door.

As he left the mouth of the cave, Garcia pushed aside wire netting and instinctively looked up to scan the sky. With bounding strides, he walked downhill, following a beaten path into the remains of the fisherman’s outpost on the beach. The structures, rusted from neglect and punctured by fragmentation, were a reminder of the days before the war went hot – when it was sufficient to hold territory with flags and legal claims rather than Marines and steel. Despite appearances, they still managed to hide a missile launcher in the remains of the concrete block fisherman shelter. 

Garcia moved south along the rocky shore. The beach quickly ran out and he resorted to hopping across black volcanic rocks. This island was barely a mile long, so he didn’t have far to go. Another shallow bay emerged. Garcia turned inland and started the climb to one of the three sheltered outposts on the island. As he climbed, his nose twitched again as the smell of sweet petroleum and acidic char returned. The Marines had a launch site here on the windward side of the island. It was a good site, sheltered from direct overhead reconnaissance but with a commanding view of the sea to the west. The Lieutenant had taken a rotation here to spend some time with the guys.

Quickening his pace, Garcia turned a corner around two large boulders into the rocky platform and stopped in horror. Everything was black and smoking. His stomach dropped as he rushed to the twisted remains of his Marines. They were pushed up against the rough walls and cold to the touch. The Lieutenant slouched near the edge of the platform, his jaw hanging slack and loose against his chest. On the other side, Corporal Reston lay face down, his limbs splayed at acutely unnatural angles.

“Goddamn it,” Garcia breathed out quietly, touching the Lieutenant’s cold shoulder.

Looking up from the Marines, he assessed the launchers and missile stockpile. Like the Marines, the equipment was charred and twisted. The stacked missiles were toppled or burnt while the launcher showed gashes and pock marks where it must have been punctured by tungsten. He found the Lieutenant’s faded ball cap and stuffed it in a cargo pocket.

To get to the other launch site, Garcia had to cross over the island’s ridge. Luckily, the clouds still hung low and shrouded him from the sky. There was little foliage to speak of so walking across the ridge was always a risk. Garcia instinctively hunched down and ran across the island.

He dropped down to the leeward side, slipped and nearly tumbled into Corporal Masterson and Lance Corporal Hubert huddled in their hole. This site was sheltered on three sides by jagged vertical rocks that stuck up out of the ocean like fingers. Masterson tried to catch Garcia and gave him a hand down into their shelter. Garcia took a seat next to them.  

“You guys OK?” he asked.

“Yeah, although they seemed angry about something last night,” Masterson said with a grin.  

“That’s why I need you to keep it locked in today. How’s your gear?” Garcia asked.

“Missiles are dry. Drones are charged and ready. Ammo is the same as it always is. Targeting diagnostics are all green gumballs. Could use some new items on the menu, though.”

“Got it. Just be thankful you’ve got a menu,” Garcia grumbled, as he looked out from this natural bunker at the east side of the island and the Philippine Sea.

“The Lieutenant and Reston are dead. Comms are down, but I’ll get them fixed soon,” Garcia continued.

The Marines followed Garcia’s gaze out to the ocean.

“I’m ready to go home,” said Hubert.  

Garcia spent the rest of the day checking on assets sprinkled around the island. He recalled a story he read growing up – of Robinson Crusoe washed up on a deserted island in the middle of the sea. Crusoe had built shelter, sowed crops, and befriended a native man named Friday. Except for cannibals, it sounded like a grand adventure. When Garcia was first dropped off on the island he had felt like Crusoe, but that feeling was long gone.

This island got nearly everything from the sea. Garcia walked along the leeward side and came to a camouflaged concrete box nestled in the rocks above the high-water line. He popped off a metal manhole cover to reveal the hardware inside. The contents of the box were their lifeline to the cabling that connected them with Luzon and brought them consistent electricity. This box charged their batteries and was the network switch for their wired communications connecting the CP to each launch site.

Talking over radio was possible. Talking over SATCOM was possible. But, this close to the PLA Navy, even a radio squelch invited a missile or a drone, while wired communication stayed out of earshot and only suffered from a busted wire here and there. So, they used old-school wire to talk and only monitored SATCOM to receive critical tasking.

About a quarter mile offshore was an array of submarine batteries installed on the floor of the island’s shelf that pulled energy off the telecommunications line, converted it, and fed it into this box. That was enough power to keep them going indefinitely.  

The box was humming and its contents were intact. Garcia plugged his tablet into a port and watched the screen. He reviewed diagnostics for the battery systems, the subsea cable line, and the cable line’s sensors. Then, he got on the net, authenticated his crypto, and typed a quick message:

“ROWAN3, FRESNO9. SITREP. PLA-N MISSILE ATTACK. 2 KIA. 1 LAUNCHER DESTROYED. SITUATION WELL IN HAND.”

Garcia’s island was a small but important outpost. The Company was based on the “big island,” which was a misnomer because the “big island” was only five miles long. There were detachments manning other small outposts on outlying islands, but Garcia’s was the northernmost, meaning they had the greatest range but were also the most exposed. The Marines and missiles sprinkled around the Philippine Sea were meant to deny the PLA Navy freedom of operation in these constrained waters and augment the combat capacity of the waning US Navy surface fleet.

Garcia saw an alert flash on his screen for an inbound message.

“FRESNO9, ROWAN3. ACK. BE ADVISED. INCREASED PLA-N SURFACE/SUBSURFACE ACTIVITY ANTICIPATED IN AO. HOLD CURRENT POS DESTROY ANY EN OVER II THRESHOLD. RELIEF AS SCHEDULED NOT BEFORE.”

“Shit.”

As dusk was beginning to set in, Garcia hurried back into the CP. He saw Santo Biyernes, a big island local who served as an auxiliary member of their unit, unpacking a number of large waterproof bags lined up against the wall, and exclaimed with relief.

“What did we get!?”

Santo turned around and smiled a welcome as Brenner walked out from the tactical operations room.

“Mostly food. But also two new tube-launched drones, a couple of replacement satellite arrays, and de-sal kits. I saw the boat caught out in a reef, so I got a little wet dragging it in,” Brenner said, swelling with pride as if he were a hunter who had killed his meal instead of dragging in one of the thousands of surface maritime drones that were slowly but surely supplying the static island campaign.

“Awesome. Are comms up? I think it’s just a wire shunt.”

“Yeah. I found the shunt and patched it. We’re up. I saw a message came in, but couldn’t read it.”

“I have it here,” Garcia said, raising his tablet. “Red is coming our way in a big way and we need to be ready.”

“Where’s the Lieutenant?” Brenner asked, wide-eyed.

“He got hit last night, but we’re going to get ours tonight.”

With communications re-established with Masterson and Hubert on the leeward side of the island and the rest of the Company on the big island, Garcia leapt into action. He needed to find the enemy.

Each of the missile sites had a number of rotary and tube-launched fixed-wing drones equipped with sensor arrays to identify enemy ships and guide missiles into them. Garcia got the long-range drones into the air and traveling west to the vicinity of known sea corridors. He didn’t have to worry about controlling them because their AI understood the mission.

Garcia had been an artilleryman for the better part of two decades. As he booted his reconnaissance and targeting systems up, he thought about how much his tools had evolved. He was first trained on rudimentary and temperamental AFATDS fire control software on the Oklahoma plains, then on the KillSwitch mobile app in the California hills. Now, seated on a makeshift bench hunched over two screens, Garcia activated the distributed acoustic sensor suite along his island’s subsea cables. In addition to a single connection between his island and the big island, the cable was festooned along the coastline. This festoon created multiple redundant cable landing access points and also allowed Garcia to monitor the depths of the sea around him. On his other screen, he received video feeds from the aerial drones. He now had eyes and ears in the sky and the sea.

With the missiles loaded and activated, he called his Marines back to the CP. Masterson and Hubert shuffled in with a renewed sense of urgency and purpose. Masterson took a seat next to Garcia while Hubert quickly pulled the .50 caliber machine gun from the recesses of the cave and set it to cover the bay. Corporal Masterson and Lance Corporal Desmond monitored launcher diagnostics on their own tablets while keeping an eye on Garcia. Now it was a waiting game.

“We’re looking for anything over threshold two, so more than 7,000 tons. That means we’re looking for Type 61 or 57 destroyers, or even an old Type 55 Renhai if we have to settle,” Garcia muttered as he watched images from the airborne drones pop up on his feed.

The small fleet of drones, both from Garcia and the rest of the Company on the big island, communicated with each other and coordinated their search path. They had cues about where the enemy fleet was likely steaming from and where they were likely steaming to, so their AI could anticipate the likely path. Sure enough, well into the night, the first targets began to materialize on Garcia’s screen.

Garcia saw the highlighted outline of a Type 61 destroyer appear and felt a wave of adrenaline flush into his bloodstream. His fingers tingled and shook as the drone cycled through different sensor spectrums to identify the vessel.

“Standing by to fire, Staff Sergeant,” one of the Corporals whispered, dripping with anticipation.

“Alright. Relax. We have to wait until we identify more, and the AI matches us,” Staff Sergeant Garcia soothed. Firing at the first identified target would spoil the surprise. They would have to wait for the AI to calculate the ideal flight path from each of the Company’s launch sites, match each launcher to the right ship, and deconflict through the Navy’s antiquated JADC2 targeting network. Garcia hoped the AI would do its job right.

Another tense 20 minutes passed. The number of targets acquired was quickly growing. After finding the first ship, the AI could easily anticipate the enemy’s order of battle. It seemed obvious that the PLA Navy fleet was heading directly for the US Fleet at Camilo Osias Naval Base under cloud cover and darkness. With few US Navy surface ships in the South China Sea, they must’ve felt uncontested.

Then, the target list abruptly started to shrink. Garcia stared at his screen, growing impatient and increasingly concerned with each passing minute as images blinked off the screen, targets fell off the list, and yet he had not received an order to launch.

“What the hell, Staff Sergeant?” one of the Corporals muttered.

Garcia was at a loss. His team was ready. They had done everything right. The list had been full of ripe targets – lumbering surface vessels with meager defenses just begging for a naval strike missile. A target allocation to his team would have justified his last eighteen months of semi-starvation. It would have justified the daily battle drills that he had forced his team to sweat through in full PPE over and over again. It would have justified eighteen months away from his wife and two daughters, who he was scared wouldn’t recognize him when he came home. It would have meant that the Lieutenant’s missing jaw and Reston’s shattered limbs had a purpose – a purpose other than fulfilling some General’s wet dream of what the new Marine Corps should be. Tears welled up in Garcia’s eyes as he clenched his fists and tried to stop himself from screaming.

The target list dwindled down to vessels below their threshold – tenders, minesweepers, ammunition boats. There must be something wrong with his systems. He tested the connections, running his shaking fingers over each wire and port. Nothing.

Garcia looked at the screen of his cable sensing system. The diagnostic dashboard showed no problems. Then he looked at the time in the corner of the screen. It read 9:47 p.m. The screen had been frozen for hours. Garcia furiously grabbed the tablet, closed out of its programs, and restarted it. The boot procedure stretched on for what felt like an eternity. As the cable sensing system came online, the acoustic disturbances in the water surrounding the subsea cables north of his island gave him a clear picture.

“It’s CV-35!” CV-35 Shaoshan was the PLA Navy’s cutting-edge aircraft carrier. She was escorted by a pair of destroyers and an amphibious ship and seemed to be making a quiet run around the southern tip of Taiwan to break out of the first island chain into the Philippine Sea.

“AI must have known CV-35 was missing!” Garcia cried out.

The AI finished its calculations, reorienting the remaining missiles from Staff Sergeant Garcia’s launchers to target CV-35, and flashed a message to Garcia.

“Fire.” 

Geoffrey Irving works for the Department of Commerce’s Bureau of Industry and Security identifying and addressing vulnerabilities in information and communications technology supply chains. Geoff previously served on active duty with the U.S. Marine Corps and currently serves in the U.S. Marine Corps Reserve. Geoff is a graduate of Tsinghua University College of Law and writes about the national security implications of economic and technological competition.

Featured Image: Art made with Midjourney AI.