By: LT P.J. Greenbaum and LT Vince Freschi
Introduction
Operational availability is the Nuclear Navy’s bread and butter, yet shipboard technicians are currently prevented from improving maintenance outcomes by an archaic data bottleneck. Scheduled maintenance to prevent system degradation or failure is planned years in advance and kept up to date using detailed records boards that meticulously track maintenance completion. When a system fails or is degraded, corrective maintenance often requires hours of reading, symptom elaboration, and troubleshooting before a component can be repaired. Fatigued watchstanders burn critical man-hours sifting through static technical libraries and disjointed databases to isolate casualty root causes – a complex, time-consuming process akin to using a card catalog to search tens of thousands of pages for a single sentence.
Consider an MMN2 (Machinist’s Mate, Nuclear) troubleshooting and repairing a noisy coolant pump. Currently, this Sailor must spend hours manually cross-referencing decades’ worth of material history logs with historical maintenance records, send manually collected vibration analysis data off-ship for analysis by pricey contractors, and flip through multiple volumes of tech manuals, all to diagnose a problem that may or may not be the root cause of the issue. Generating a quality control package to repair the pump takes several hours, and generating the work authorizations and safety protocols (i.e., “tag-outs”) takes several more – all of which lengthens the time to repair, leaving critical propulsion plant components offline longer than needed. In the authors’ experience leading Sailors in the maintenance of nuclear systems, finding the right part alone can take hours, and a successful search is often rendered futile when the part number turns out to be obsolete. The result is an asset unavailable for tasking and a reduced operational force posture.
The solution is to use existing technology to equip Sailors with the tools necessary to keep ships in the fight. Fortunately, the technology exists to provide locally hosted, air-gapped Small Language Models (SLMs) to enhance the nuclear propulsion plant troubleshooting process. By implementing a Retrieval-Augmented Generation (RAG) framework, Naval Reactors can transform its massive repositories of procedures, technical manuals, and material history into a live, up-to-date database of Reliable External Knowledge (REK). This approach will reduce the man-hours required for fault investigation, improve the accuracy of system troubleshooting, and reduce downtime for propulsion plant components.
The Structure: Coupling Small Language Models with Retrieval-Augmented Generation
The use cases for large language models (LLMs) – models that possess between 100 billion and 2 trillion parameters and are able to digest large volumes of disaggregated data and generate human-like text in response – have garnered much attention since the release of ChatGPT in late 2022.1 While extremely powerful in their ability to synthesize and process loosely structured datasets, these machines require a high-bandwidth connection to an off-ship data center – a non-starter for warships that frequently operate at EMCON in signal-denied environments. Even the Department of War’s release of GenAI.mil – a well-intentioned attempt at bringing generative AI capabilities directly to the warfighter – has limited applicability on a warship, where bandwidth and EMCON restrictions prevent its widespread use. Furthermore, LLMs are prone to “hallucinating” – producing irrelevant, incorrect, or misleading responses to queries – with potentially catastrophic consequences in a nuclear propulsion plant.2
The development and refinement of small language models (SLMs) possessing orders of magnitude fewer parameters (1 billion to 10 billion), however, allows for highly specialized, air-gapped models capable of running locally on a single high-end workstation – like a high-performance gaming laptop – which would be well suited to a rugged shipboard environment.3 These SLMs would be most effective when coupled with a Retrieval-Augmented Generation (RAG) framework.4 This architecture replaces the model’s reliance on static, pre-trained memory with a dynamic system.5 By using a local vector database to index every page of the ship’s electronic technical manuals – Reactor Plant Manuals (RPMs), Steam Plant Manuals (SPMs), machine-specific technical manuals, maintenance logs, and material history entries – the SLM no longer has to recall every answer from its training. All it has to do is find the answer within the verified, pre-uploaded documentation. RAG enables it to return responses based on the semantic meaning of a query rather than the word-for-word match currently available with a “CTRL+F” search. This shift means that even a “small” 10 billion parameter model could deliver hull-specific, nuclear-grade technical guidance with a level of accuracy that matches or even exceeds its larger, data center-bound counterparts, with a substantially reduced risk of hallucination.6
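The retrieval step described above can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the `embed` function is a hypothetical stand-in (a hashed bag-of-words) so the example runs without a model, whereas a fielded system would use a trained embedding model, which is what provides matching on meaning rather than on shared words. The passages and source names are invented.

```python
import hashlib
import math
import re

DIM = 256  # toy dimensionality, chosen only for illustration


def embed(text):
    """Hypothetical stand-in embedding: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    """Cosine similarity of two already-normalized vectors (a dot product)."""
    return sum(x * y for x, y in zip(a, b))


class VectorIndex:
    """In-memory index of (source, text, vector) entries."""

    def __init__(self):
        self.entries = []

    def add(self, source, text):
        self.entries.append((source, text, embed(text)))

    def search(self, query, k=2):
        """Return the k highest-scoring (score, source, text) tuples."""
        q = embed(query)
        scored = [(cosine(q, vec), source, text)
                  for source, text, vec in self.entries]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:k]


index = VectorIndex()
index.add("pump-manual", "Coolant pump bearing wear produces elevated vibration and noise")
index.add("tagout-guide", "Tag-out procedure for isolating electrical breakers")
index.add("valve-manual", "Steam valve packing replacement steps")

for score, source, text in index.search("noisy pump with high vibration at the bearing"):
    print(f"{score:.2f}  {source}: {text}")
```

Only the retrieved passages, not the whole library, would then be handed to the SLM as context for its answer.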
The Nuclear Reliable External Knowledge Database
Key to the development of the vector database will be compiling and maintaining – with proper version control – a database of reliable external knowledge (REK) that the SLM can draw from when responding to Sailor queries. The widespread use of interactive electronic technical manuals (IETMs) like the Reactor Plant Manual and Steam Plant Manual onboard nuclear-powered naval vessels simplifies the process of indexing the Nuclear Navy’s relevant data into a machine-readable format. However, a machine that understands procedural context (via the IETMs) without a thorough understanding of the system design bases (found in component technical manuals and procedural front matter) runs the risk of providing inaccurate and incomplete recommendations. Therefore, in addition to the IETMs, the REK database must include legacy manuals for all propulsion plant components, like pumps and valves, as well as procedural guidelines such as the Joint Fleet Maintenance Manual (JFMM), the Radiological Controls for Ships, and the propulsion plant planned maintenance system (PMS). This allows SLM recommendations to be pre-filtered through Nuclear Navy maintenance rules and regulations before they reach the technician.
But the REK database cannot be static; it must be periodically updated to incorporate the latest feedback reports, Naval Reactors technical bulletins, and CASREPs, ensuring that the SLM maintains current hull- and platform-specific information. A monthly maintenance check, accomplished by uploading a “refresh” library – centrally created and made available via the Naval Reactors local area network – would allow for periodic updates in all but the strictest EMCON conditions. A critical advantage of incorporating fleet-wide data is that equipment casualties encountered on one hull are often repeat iterations of known failure modes across other ships-in-class. By enabling Sailors to leverage prior diagnoses and corrective actions on sister platforms, the system reduces redundant troubleshooting, supports failure-rate trending, and ultimately shortens equipment downtime.
Onboard the modern Ford-class engineering plant, the pre-existing automation and smart sensors open the door to even more AI applications. Integrating the SLM with data log sets, vibration analysis data, and material history would allow the plant to be monitored not only through the eyes of its highly trained nuclear operators, but also through AI powered by the latest Silicon Valley advances. Coupling the SLM with Eulerian Video Magnification and installing fixed cameras on vital pumps and turbines would allow for early detection of failure on a minute-by-minute basis. The efficacy of preventative maintenance can be audited based on failure frequency and downtime, empowering Sailors to continuously preserve the plant in the most efficient manner possible.
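One way to picture the monthly “refresh” is as a versioned manifest that the shipboard node checks before accepting new documents. The sketch below is an assumption-laden illustration, not a description of any existing Naval Reactors process: the manifest layout, the `apply_refresh` function, and the document names are all hypothetical. It rejects a refresh that is not newer than the installed library or whose contents do not match their recorded SHA-256 hashes.

```python
import hashlib


def sha256_hex(data):
    """Hex SHA-256 digest of a bytes payload."""
    return hashlib.sha256(data).hexdigest()


def apply_refresh(library, manifest, payloads):
    """Return an updated library, or raise ValueError if the refresh is bad.

    library:  {"version": int, "documents": {name: bytes}}  (hypothetical layout)
    manifest: {"version": int, "sha256": {name: hex digest}}
    payloads: {name: bytes} delivered alongside the manifest
    """
    if manifest["version"] <= library["version"]:
        raise ValueError("refresh is not newer than the installed library")
    for name, expected in manifest["sha256"].items():
        if name not in payloads:
            raise ValueError(f"missing document: {name}")
        if sha256_hex(payloads[name]) != expected:
            raise ValueError(f"hash mismatch: {name}")
    # Only after every document verifies is the library copied and updated.
    documents = dict(library["documents"])
    for name in manifest["sha256"]:
        documents[name] = payloads[name]
    return {"version": manifest["version"], "documents": documents}


installed = {"version": 1, "documents": {"pump-manual": b"rev A"}}
bulletin = b"technical bulletin text"
manifest = {"version": 2, "sha256": {"bulletin-001": sha256_hex(bulletin)}}
updated = apply_refresh(installed, manifest, {"bulletin-001": bulletin})
print(updated["version"], sorted(updated["documents"]))
```

A hash check of this kind only detects corruption or mismatch; establishing that a refresh genuinely came from the central authority would additionally require a digital signature, which is outside this sketch.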
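As a rough illustration of auditing maintenance by failure frequency and downtime, the sketch below computes three common figures from a component’s history. The function name and the sample numbers are hypothetical, and the availability formula is the simplified uptime / (uptime + downtime) form rather than any official Navy definition.

```python
def maintenance_metrics(operating_hours, downtime_hours_per_failure):
    """Summarize failure frequency and downtime for one component.

    operating_hours: hours the component was running (must be positive)
    downtime_hours_per_failure: one downtime duration per recorded failure
    """
    if operating_hours <= 0:
        raise ValueError("operating_hours must be positive")
    failures = len(downtime_hours_per_failure)
    total_down = sum(downtime_hours_per_failure)
    return {
        # Simplified availability: uptime / (uptime + downtime)
        "availability": operating_hours / (operating_hours + total_down),
        # Mean time between failures; infinite if nothing has failed
        "mtbf_hours": operating_hours / failures if failures else float("inf"),
        # Average hours offline per failure
        "mean_downtime_hours": total_down / failures if failures else 0.0,
    }


# Invented sample: 900 running hours and two failures costing 40 and 60 hours.
print(maintenance_metrics(900, [40, 60]))
```

Comparing these figures before and after a change to a maintenance interval is one simple way to judge whether the change helped.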
The Use Case
Consider once again the MMN2 from the introduction and the noisy reactor coolant pump. With a shipboard SLM co-pilot, the MMN2 could input the observed symptoms in a natural language prompt, accompanied by the equipment logs that preceded the failure and the vibration analysis data for the pump. After a few seconds of “thinking,” the SLM would use its RAG framework to pull the relevant drawings, scrape the log data for trends, and analyze the vibration data for potential failure mechanisms. By drawing on its regularly updated REK library, the SLM could also flag CASREPs from other ships-in-class that noted similar failure modes. After a few more seconds, a quality assurance package could be generated in the ship-specific format, complete with all tools, parts, and materials required, the isolations required for the work to be conducted, and a step-by-step repair procedure drawn from the relevant tech manual. But perhaps most importantly, each reference the SLM pulls its material from can be displayed in an adjacent window, allowing the MMN2 to check the SLM’s recommendations against each cited tech manual.
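The citation display described above depends on the prompt carrying source labels through to the answer. A minimal sketch of that idea follows; the `build_prompt` function, the instruction wording, and the source names are all hypothetical, and the call to the SLM itself is deliberately left out.

```python
def build_prompt(question, passages):
    """Assemble a grounded prompt and a citation map.

    passages: list of (source, text) pairs already returned by retrieval.
    Returns (prompt_text, {excerpt_number: source}).
    """
    lines = [
        "Answer using only the numbered excerpts below.",
        "Cite excerpt numbers in brackets. If the excerpts do not answer "
        "the question, say so.",
        "",
    ]
    citations = {}
    for number, (source, text) in enumerate(passages, start=1):
        lines.append(f"[{number}] ({source}) {text}")
        citations[number] = source
    lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines), citations


prompt, citations = build_prompt(
    "What could cause elevated pump vibration?",
    [
        ("Pump Tech Manual, sec. 4.2", "Bearing wear produces elevated vibration."),
        ("CASREP summary (sister ship)", "Similar noise traced to a worn bearing."),
    ],
)
print(prompt)
print(citations)
```

Because the citation map is kept alongside the prompt, the interface can open each cited source next to the answer for the technician to verify.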
The Way Forward
The first step in deploying a shipboard RAG-equipped SLM at scale must be accelerating the conversion of legacy technical manuals and drawings into machine-readable formats suitable for the creation of a vector database. Once all relevant technical publications, procedures, and best practices have been compiled into a single database (no small feat considering the Navy’s notorious compartmentalization), the data must be “chunked” into smaller sub-sections, with each chunk embedded as a high-dimensional numerical vector (likely in the low hundreds of dimensions, due to computing restrictions) for retrieval by the model. Ensuring the system remains free of “data poisoning” – the injection of corrupted or incorrect data into a model’s training data or, in a RAG system, its retrieval corpus – will require a centralized organization, like Naval Reactors, to manage the curation and development of the vector database. For optimal utility, shipboard technicians must be able to add data – like material history entries – into the vector database, although care must be taken to ensure that this potentially flawed shipboard data is not incorporated into any model training. Fortunately, the recent establishment of the Propulsion Plant Local Area Network (PPLAN) technician school provides an avenue to train designated technicians on the procedures required to maintain the REK database. Designating these Sailors with a Naval Enlisted Classification (NEC) and designating those NECs as “critical” for billet-based distribution would ensure that ships always have the requisite knowledge to support their shipboard AI nodes.
Next, the Navy must define the technical hardware specifications for a “Shipboard AI Node” that meets MIL-SPEC requirements for shock, vibration, and EMCON security. The Navy must prioritize maximizing video random access memory (VRAM), since available VRAM limits the size of the model – and therefore the “intelligence” of the SLM – that the node can run. Utilizing a laptop client already approved to handle classified information would save time and speed delivery of the system to the fleet.
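The “chunking” step can be sketched as a sliding window over a document’s words. This is one common approach and not necessarily the one a fielded system would use; the function name and the default `size` and `overlap` values are assumptions. The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.

```python
def chunk_words(text, size=200, overlap=40):
    """Split text into word windows of `size` words sharing `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(sample, size=4, overlap=1):
    print(chunk)
# w0 w1 w2 w3
# w3 w4 w5 w6
# w6 w7 w8 w9
```

Each resulting chunk would then be embedded and stored in the vector database alongside a pointer back to its source page.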
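A back-of-the-envelope calculation shows why memory drives the hardware specification. The sketch below estimates the memory needed just to hold a model’s weights as parameters multiplied by bytes per parameter. The function names are hypothetical, and the 20 percent overhead allowance for working memory is an assumed placeholder, not a measured figure.

```python
def weight_memory_gb(params_billion, bits_per_param):
    """Memory for the weights alone, in GB (1 GB = 10**9 bytes)."""
    return params_billion * bits_per_param / 8


def fits(params_billion, bits_per_param, vram_gb, overhead_fraction=0.2):
    """Rough check: weights plus an assumed overhead allowance vs. VRAM."""
    needed = weight_memory_gb(params_billion, bits_per_param) * (1 + overhead_fraction)
    return needed <= vram_gb


print(weight_memory_gb(10, 16))  # 10B parameters at 16 bits each -> 20.0 GB
print(weight_memory_gb(10, 4))   # same model quantized to 4 bits -> 5.0 GB
print(fits(10, 4, vram_gb=8))    # True under the assumed 20% overhead
print(fits(10, 16, vram_gb=16))  # False: 16-bit weights alone exceed 16 GB
```

Under these assumptions, reducing numerical precision (quantization) is what brings a 10 billion parameter model within reach of laptop-class hardware.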
Once the software and client have been paired together, the Nuclear Navy can use its existing training pipeline to begin integrating the system into maintenance and troubleshooting workflows. The Nuclear Power Training Units in Charleston, South Carolina, and Ballston Spa, New York, would function as the perfect “proving ground” for the SLM in a fleet-like scenario. Treating the prototype rollout as a dynamic test – experimenting with model size, tokenization, chunking, and retrieval methods – would improve model performance and validate system functionality prior to fleet-wide deployment.
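Experimenting with chunking and retrieval methods requires a yardstick. One simple option, sketched below, is the hit rate at k: the fraction of test questions for which the known-correct source appears among the top k retrieved results. The keyword retriever, corpus, and questions here are toy stand-ins invented for illustration; a real trial would substitute each candidate configuration for `retrieve`.

```python
def hit_rate_at_k(retrieve, labeled_queries, k=3):
    """Fraction of queries whose expected source is in the top-k results.

    retrieve: callable (query, k) -> list of source names
    labeled_queries: list of (query, expected_source) pairs
    """
    if not labeled_queries:
        raise ValueError("need at least one labeled query")
    hits = sum(1 for query, expected in labeled_queries
               if expected in retrieve(query, k))
    return hits / len(labeled_queries)


def make_keyword_retriever(corpus):
    """Toy retriever that ranks sources by words shared with the query."""
    def retrieve(query, k):
        q = set(query.lower().split())
        ranked = sorted(
            corpus,
            key=lambda src: len(q & set(corpus[src].lower().split())),
            reverse=True,
        )
        return ranked[:k]
    return retrieve


corpus = {
    "pump": "coolant pump bearing vibration",
    "valve": "steam valve packing leak",
    "breaker": "electrical breaker trip",
}
questions = [("pump vibration", "pump"), ("valve leak", "valve"),
             ("reactor scram", "breaker")]
retriever = make_keyword_retriever(corpus)
print(hit_rate_at_k(retriever, questions, k=1))
print(hit_rate_at_k(retriever, questions, k=3))
```

Running the same labeled question set against each combination of chunk size and retrieval method gives a like-for-like comparison before anything reaches the fleet.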
Lastly, and perhaps most importantly, Naval Reactors must establish a “Human-in-the-Loop” certification framework, ensuring that AI-assisted troubleshooting remains an advisory tool that works within the rigorous standards of the Nuclear Navy. Just as the SWO community still teaches its navigators paper charting at SURFNAV, despite the existence of well-proven Voyage Management Systems (VMS), the Nuclear Navy must work to ensure that any AI troubleshooting co-pilot used by its Sailors enhances their understanding of integrated plant operations rather than replacing it. While some may argue that using such a co-pilot bypasses the deep dive traditionally required to build system-wide expertise, a shipboard SLM acts as a learning accelerant, not as a crutch. By removing manual document searches, the tool allows technicians to spend more time synthesizing plant information and conducting high-level analysis – tasks that truly build systemic understanding – as opposed to spending hours searching through manuals for the “NIIN in a haystack.” When integrated into a “Human-in-the-Loop” framework, this technology ensures that a Sailor’s learning is grounded in the most accurate, cited technical data available. The stakes in nuclear reactor plant operations are simply too high to outsource critical thinking. This technology must be viewed as a force multiplier that sharpens a nuclear technician’s judgment and reinforces the culture of procedural compliance that defines the program.
Conclusion
The integration of SLMs equipped with RAG architecture into the nuclear propulsion environment represents a significant upgrade to the maintenance capabilities of individual nuclear operators. Incorporating AI into the shipboard troubleshooting and maintenance workflow is a fundamental shift designed to reduce equipment downtime. In a future conflict characterized by denied communications and long stints at sea, a ship’s ability to remain self-sufficient – by diagnosing and repairing its primary propulsion and electrical generation capacity without racing back to stateside contractors – could mean the difference between sustained operability and taking a capital ship out of the fight. Ultimately, the goal is clear: a propulsion plant where easily accessible, machine-readable data works as hard as the Sailor. The technology already exists; the Nuclear Navy’s leadership need only deploy it. By embracing SLMs and RAG architecture, the Navy can keep its most valuable and complex nuclear assets mission-ready, even in the most contested environments.
References
- “A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness | ACM Transactions on Intelligent Systems and Technology.” Accessed December 21, 2025. https://dl.acm.org/doi/full/10.1145/3768165.
- Agrawal, Prof. Pallavi. “Running LLMs Locally on Consumer Devices.” International Journal for Research in Applied Science and Engineering Technology 13, no. 4 (2025): 5433–41. https://doi.org/10.22214/ijraset.2025.69433.
- Kandala, Savitha Viswanadh, Pramuka Medaranga, and Ambuj Varshney. “TinyLLM: A Framework for Training and Deploying Language Models at the Edge Computers.” arXiv:2412.15304. Preprint, arXiv, December 19, 2024. https://doi.org/10.48550/arXiv.2412.15304.
- Lewis, Patrick, Ethan Perez, Aleksandra Piktus, et al. “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.” arXiv:2005.11401. Preprint, arXiv, April 12, 2021. https://doi.org/10.48550/arXiv.2005.11401.
- Ruiz, Daniel C., and John Sell. “Fine-Tuning and Evaluating Open-Source Large Language Models for the Army Domain.” arXiv:2410.20297. Preprint, arXiv, October 27, 2024. https://doi.org/10.48550/arXiv.2410.20297.
- Shuster, Kurt, Spencer Poff, Moya Chen, Douwe Kiela, and Jason Weston. “Retrieval Augmentation Reduces Hallucination in Conversation.” arXiv:2104.07567. Preprint, arXiv, April 15, 2021. https://doi.org/10.48550/arXiv.2104.07567.
Featured Image: Sailors make repairs aboard the destroyer USS Halsey in the Arabian Sea in 2021.
Courtesy of Stripes.com



