Category Archives: Future Tech

What is coming down the pipe in naval and maritime technology?

Grassroots AI: Driving Change in the Royal Navy with Workflow

October 23, 2024 Guest Author

By Francis Heritage

Introduction

In September 2023, the Royal Navy (RN) advertised the launch of the Naval AI Cell (NAIC), designed to identify and advance AI capability across the RN. NAIC will act as a ‘transformation office’, supporting adoption of AI across RN use cases, until it becomes part of Business as Usual (BaU). NAIC aims to overcome a key issue with current deployment approaches. These usually deploy AI as one-off transformation projects, normally via innovation funding, and often result in project failure. The nature of an ‘Innovation Budget’ means that when the budget is spent, there is no ability to further develop, or deploy, any successful Proof of Concept (PoC) system that emerged.

AI is no longer an abstract innovation; it is fast becoming BaU, and it is right that the RN’s internal processes reflect this. Culturally, it embeds the idea that any AI deployment must be both value-adding and enduring. It forces any would-be AI purchaser to focus on Commercial Off The Shelf (COTS) solutions, significantly reducing the risk of project failure, and leveraging the significant investment already made by AI providers.

Nevertheless, NAIC’s sponsored projects still lack follow-on budgets and are limited in scope to just RN-focused issues. The danger still exists that NAIC’s AI projects fail to achieve widespread adoption, despite initial funding; a common technology-related barrier, known as the ‘Valley of Death.’ In theory, there is significant cross-Top Line Budget (TLB) support to ensure the ‘Valley’ can be bridged. The Defense AI and Autonomy Unit within MOD Main Building is mandated to provide policy and ethical guidance for AI projects, while the Defense Digital/Dstl Defense AI Centre (DAIC) acts as a repository for cross-MOD AI development and deployment advice. Defense Digital also provides most of the underlying infrastructure required for AI to deploy successfully. Known by Dstl as ‘AI Building Blocks’, this includes secure cloud compute and storage in the shape of MODCloud (in both Amazon Web Services and Microsoft Azure) and the Defense Data Analytics Portal (DDAP), a remote desktop of data analytics tools, accessible via MODNet, that lets contractors and uniformed teams collaborate at OFFICIAL SENSITIVE.

The NAIC therefore needs to combine its BaU approach with other interventions, if AI and data analytics deployment is to prove successful, namely:

1. Create cross-TLB teams of individuals around a particular workflow, thus ensuring a larger budget can be brought to bear against common issues.

2. Staff these teams from junior ranks and rates, and delegate them development and budget responsibilities.

3. Ensure these teams prioritize learning from experience and failing fast; predominantly by quickly and cheaply deploying existing COTS, or Crown-owned AI and data tools. Writing ‘Discovery Reports’ should be discouraged.

4. Enable reverse-mentoring, whereby these teams share their learnings with Flag Officer (one-star and above) sponsors.

5. Provide these teams with the means to seamlessly move their projects into Business as Usual (BaU) capabilities.

In theory this should greatly improve the odds of successful AI delivery. Cross-TLB teams should have approximately three times the budget to solve the same problem, when compared to an RN-only team. Furthermore, as the users of any developed solution, teams are more likely to buy and/or develop systems that work and deliver value for money. With hands-on experience and ever easier to deploy COTS AI and data tools, teams will be able to fail fast and cheaply, and usually at a lower cost than employing consultants. Flag Officers providing overarching sponsorship will receive valuable reverse-mentoring; specifically understanding first-hand the disruptive potential of AI systems, the effort involved in understanding use-cases and the need for underlying data and infrastructure. Finally, as projects will already be proven and part of BaU, projects may be cheaper and less likely to fail than current efforts.

Naval AI Cell: Initial Projects

The first four tenders from the NAIC were released via the Home Office Accelerated Capability Environment (ACE) procurement framework in March 2024. Each tender aims to deliver a ‘Discovery’ phase, exploring how AI could be used to mitigate different RN-related problems.¹ However, the nature of the work, the very short amount of time for contractors to respond, and the relatively low funding available raise concerns about the value for money they will deliver. Industry was given four days to provide responses to the tender, about a fifth of the usual time, perhaps a reflection of the need to complete projects before the end of the financial year. Additionally, the budget for each task was set at approximately a third of the level for equivalent Discovery work across the rest of Government.² The tenders reflect a wide range of AI use-cases, including investigating JPA personnel records, monitoring pictures of rotary wing oil filter papers, and analyzing underwater sound datasets, each of which requires a completely different Machine Learning approach.

Figure 1: An example self-service AI Workflow, made by a frontline civilian team to automatically detect workers not wearing PPE. The team itself has labelled relevant data, trained and then deployed the model using COTS user interface. Source: V7 Labs.

Take the NAIC’s aircraft maintenance problem-set as an example. The exact problem (automating the examination of oil filter papers of rotary wing aircraft) is faced by all three Single Services. First, by joining forces with the RAF to solve this same problem, the Discovery budget could have been doubled, resulting in a higher likelihood of project success and ongoing savings. Second, by licensing a pre-existing, easy-to-use commercial system that already solves this problem, NAIC could have replaced a Discovery report, written by a contractor, with cheaper, live, hands-on experience of how useful AI was in solving the problem.³ This would have resulted in more money being available, and a cheaper approach being taken.

Had the experiment failed, uniformed personnel would have learnt significantly more from their hands-on experience than by reading a Discovery report, and at a fraction of the price. Had it succeeded, the lessons would have been shared across all three Services and improved the chance of success of any follow-on AI deployment across wider MOD. An example of a COTS system that achieves this is V7’s data labeling and model deployment system, available with a simple user interface; it is free for up to 3 people, or £722/month for more complex requirements.⁴ A low-level user experiment using this kind of platform is unlikely to have developed sufficient sensitive data to have gone beyond OFFICIAL.

Introducing ‘Workflow Automation Guilds’

Focusing on painful, AI-solvable problems, shared across Defense, is a good driver to overcome these stovepipes. Identification of these workflows has been completed by the DAIC and listed in the Defence AI Playbook.⁵ It lists 15 areas where AI has potential to deliver a step-change capability. Removing the problems that are likely to be solved by Large Language Models (where a separate Defence Digital procurement is already underway), leaves 10 workflows (e.g. Spare Parts Failure Prediction, Optimizing Helicopter Training etc) where AI automation could be valuably deployed.

However, up to four different organizations are deploying AI to automate these 10 workflows, resulting in budgets that are too small to result in impactful, recurring work; or at least, preventing this work from happening quickly. Cross-Front Line Command (FLC) teams could be created, enabling budgets to be combined to solve the same work problem they collectively face. In the AI industry, these teams are known as ‘Guilds’; and given the aim is to automate Workflows, the term Workflow Automation Guilds (WAGs) neatly sums up the role of these potential x-FLC teams.

Focus on Junior Personnel

The best way to populate these Guilds is to follow the lead of the US Navy (USN) and US Air Force (USAF), who independently decided the best way to make progress with AI and data is to empower their most junior people, giving them responsibility for deploying this technology to solve difficult problems. In exactly the same way that the RN would not allow a Warfare Officer to take Sea Command if they are unable to draw a propulsion shaft-line diagram, so an AI or data deployment program should not be the responsibility of someone senior who does not know or understand the basics of Kubernetes or Docker.⁶ For example, when the USAF created their ‘Platform One’ AI development platform, roles were disproportionately populated by lower ranks and rates. As Nic Chaillan, then USAF Chief Software Officer, noted:

“When we started picking people for Platform One, you know who we picked? A lot of Enlisted, Majors, Captains… people who get the job done. The leadership was not used to that… but they couldn’t say anything when they started seeing the results.”⁷

The USN takes the same approach with regard to Task Force Hopper and Task Group 59.1.⁸ TF Hopper exists within the US Surface Fleet to enable rapid AI/ML adoption. This includes enabling access to clean, labeled data and providing the underpinning infrastructure and standards required for generating and hosting AI models. TG 59.1 focuses on the operational deployment of uncrewed systems, teamed with human operators, to bolster maritime security across the Middle East. Unusually, both are led by USN Lieutenants, whom the USN Chief of Naval Operations called ‘…leaders who are ready to take the initiative and to be bold; we’re experimenting with new concepts and tactics.’⁹

Delegation of Budget Spend and Reverse Mentoring

Across the Single Services, relatively junior individuals, from OR4 to OF3, could be formed into teams on a cross-FLC basis to solve the Defence AI Playbook issues they collectively face, and be free to choose which elements of their Workflow to focus on. Importantly, they should deploy Systems Thinking (i.e. an holistic, big picture approach that takes into account the relationship between otherwise discrete elements) to develop a deep understanding of the Workflow in question and prioritize deployment of the fastest, cheapest data analytics method; this will not always be an AI solution. These Guilds would need budgetary approval to spend funds, collected into a single pot from up to three separate UINs from across the Single Services; this could potentially be overseen by an OF5 or one-star. The one-star’s role would be less about providing oversight, and more to do with ensuring funds were released and, vitally, receiving reverse mentoring from the WAG members themselves about the viability and value of deploying AI for that use case.

The RN’s traditional approach to digital and cultural transformation – namely a top-down, directed approach – has benefits, but these are increasingly being rendered ineffective as the pace of technological change increases. Only those working with this technology day-to-day, and using it to solve real-world challenges, will be able to drive the cultural change the RN requires. Currently, much of this work is completed by contractors who take the experience with them when projects close. By deploying this reverse-mentoring approach, WAGs not only cheaply create a cadre of uniformed, experienced AI practitioners, but also a senior team of Flag Officers who have seen first-hand where AI does (or does not) work and have an understanding of the infrastructure needed to make it happen.

Remote working and collaboration tools mean that teams need not be working from the same bases, which is vital if different FLCs are to collaborate. These individuals should be empowered to spend multi-year budgets of up to circa £125k. As of 2024, this is sufficient to allow meaningful Discovery, Alpha, Beta and Live AI project phases to be undertaken and to allow the use of COTS products, yet small enough not to result in a huge loss if the project (which is spread among all three Services to mitigate risk) fails.

Figure 2: An example AI Opportunity Mapping exercise, where multiple AI capabilities (represented by colored cards) are mapped onto different stages of an existing workflow, to understand where, if anywhere, use of AI could enable or improve workflow automation. Source: 33A AI.

WAG Workflow Example

An example of how WAGs could work is as follows, using the oil sample contamination example. Members of the RN and RAF Wildcat and Merlin maintenance teams collectively identify the amount of manpower effort that could be saved if the physical checking of lube oil samples could be automated. With an outline knowledge of AI and Systems Thinking already held, WAG members know that full automation of this workflow is not possible; but they have identified one key step in the workflow that could be improved, speeding up the entire workflow of regular helicopter maintenance. The fact that a human still needs to manually check oil samples is not necessarily an issue, as they identify that the ability to quickly prioritize and categorize samples will not cause bottlenecks elsewhere in the workflow and thus provides a return on investment.
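The prioritize-and-categorize step described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any fielded system: the `Sample` record, the `contamination_score` field (assumed to be a model output between 0 and 1) and the 0.8 threshold are all hypothetical. A human inspector still checks every sample; the sketch only changes the order in which samples are seen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One oil filter paper image awaiting inspection (hypothetical record)."""
    sample_id: str
    aircraft: str
    contamination_score: float  # assumed model output in [0, 1]


def triage(samples, urgent_threshold=0.8):
    """Rank samples by score and split them into urgent and routine queues."""
    ranked = sorted(samples, key=lambda s: s.contamination_score, reverse=True)
    urgent = [s for s in ranked if s.contamination_score >= urgent_threshold]
    routine = [s for s in ranked if s.contamination_score < urgent_threshold]
    return urgent, routine


# Illustrative data only; the identifiers and scores are made up.
samples = [
    Sample("S-001", "Wildcat", 0.12),
    Sample("S-002", "Merlin", 0.91),
    Sample("S-003", "Merlin", 0.47),
    Sample("S-004", "Wildcat", 0.85),
]

urgent, routine = triage(samples)
print("Inspect first:", [s.sample_id for s in urgent])
print("Inspect later:", [s.sample_id for s in routine])
```

Nothing is discarded automatically in this sketch, which matches the point made above: the return on investment comes from ordering the human's work, not from replacing it.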

Members of the WAG would create a set of User Stories, an informal, general description of the AI benefits and features, written from a user’s perspective. With the advice from the DAIC, NAIC, RAF Rapid Capability Office (RCO) / RAF Digital or Army AI Centre, other members of the team would ensure that data is in a fit state for AI model training. In this use-case, this would involve labeling overall images of contamination, or the individual contaminants within an image, depending on the AI approach to be used (image recognition or object detection, respectively). Again, the use of the Defence Data Analytics Portal (DDAP), or a cheap, third-party licensed product, provides remote access to the tools that enable this. The team now holds a number of advantages over traditional, contractor-led approaches to AI deployment, potentially sufficient to cross the Valley of Death:

They are likely to know colleagues across the three Services facing the same problem, so can check that a solution has not already been developed elsewhere.¹⁰
With success metrics, labelled data and user requirements all held, the team has already overcome the key blockers to success, reducing the risk that expensive contractors, if subsequently used, will begin project delivery without these key building blocks in place.
They have a key understanding of how much value will be generated by a successful project, and so can quickly ‘pull the plug’ if insufficient benefits arise. There is also no financial incentive to push on if the approach clearly isn’t working.
Alternatively, they have the best understanding of how much value is derived if the project is successful.
As junior Front-Line operators, they benefit directly from any service improvement, so are not only invested in the project’s success, but can sponsor the need for BaU funding to be released to sustain the project in the long term, if required.
If working with contractors, they can provide immediate user feedback, speeding up development time and enabling a true Agile process to take place. Currently, contractors struggle to access users to close this feedback loop when working with MOD.
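The two labeling approaches mentioned earlier in this section (one label per image for image recognition, or one label per contaminant for object detection) imply different data structures. The sketch below is illustrative only; the class names, field names and the example file name are assumptions, not the format of DDAP or of any commercial labeling tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImageLabel:
    """Image recognition: a single label describes the whole image."""
    image_file: str
    label: str  # e.g. "contaminated" or "clean"


@dataclass
class BoundingBox:
    """Object detection: one contaminant located within an image (pixels)."""
    label: str
    x: int
    y: int
    width: int
    height: int


@dataclass
class DetectionLabel:
    """Object detection: zero or more boxes per image."""
    image_file: str
    boxes: List[BoundingBox] = field(default_factory=list)


def to_image_label(detection: DetectionLabel) -> ImageLabel:
    """Collapse box-level labels into a single image-level label."""
    status = "contaminated" if detection.boxes else "clean"
    return ImageLabel(detection.image_file, status)


# Hypothetical example: one metal flake marked on one filter paper image.
detection = DetectionLabel(
    "filter_0001.png",
    [BoundingBox("metal_flake", x=40, y=52, width=12, height=9)],
)
print(to_image_label(detection))
```

Box-level labels take longer to produce but can always be collapsed into image-level labels, as `to_image_label` shows; the reverse is not possible. That asymmetry is one reason a team might want to settle on the AI approach before it starts labeling.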

Again, Flag Officer sponsorship of such an endeavor is vital. This individual can ensure that proper recognition is awarded to individuals and make deeper connections across FLCs, as required.

Figure 3: Defense Digital’s Defense Data Analytics Portal (DDAP) is tailor-made for small, Front-Line teams to clean and label their data and deploy AI services and products, either standalone or via existing, call-off contract contractor support.

Prioritizing Quick, Hands-on Problem Solving

WAGs provide an incentive for more entrepreneurial, digitally minded individuals to remain in Service, as it creates an outlet for those who wish to learn to code and solve problems quickly, especially if the problems faced are ones they wrestle with daily. A good example of where the RN has successfully harnessed this energy is with Project KRAKEN, the RN’s in-house deployment of the Palantir Foundry platform. Foundry is a low-code way of collecting disparate data from multiple areas, allowing it to be cleaned and presented in a format that speeds up analytical tasks. It also contains a low-level AI capability. Multiple users across the RN have taken it upon themselves to learn Foundry and deploy it to solve their own workflow problems, often in their spare time, with the result that they can get more done, faster than before. With AI tools becoming equally straightforward to use and deploy, the same is possible for a far broader range of applications, provided that cross-TLB resources can be concentrated at a junior level to enable meaningful projects to start.

Figure 4: Pre-existing Data/AI products or APIs, bought from commercial providers, or shared from elsewhere in Government, are likely to provide the fastest, cheapest route to improving workflows.¹¹

Deploying COTS Products Over Tailored Services

Figure 4 shows the two main options available for WAGs when deploying AI or data science capabilities: Products or Services. Products are standalone capabilities created by industry to solve particular problems, usually made available relatively cheaply. Typically COTS, they are sold on a per-use or time-period basis, but cannot be easily tailored or refined if the user has a different requirement.

By contrast, Services are akin to a consulting model where a team of AI and Machine Learning engineers build an entirely new, bespoke system. This is much more expensive and slower than deploying a Product but means users should get exactly what they want. Occasionally, once a Service has been created, other users realize they have similar requirements as the original user. At this point, the Service evolves to become a Product. New users can take advantage of the fact that software is essentially free to either replicate or connect with and gain vast economies of scale from the initial Service investment.

WAGs aim to enable these economies of scale; either by leveraging the investment and speed benefits inherent in pre-existing Products or ensuring that the benefits of any home-made Services are replicated across the whole of the MOD, rather than remaining stove-piped or siloed within Single Services.

Commercial/HMG Off the Shelf Product. The most straightforward approach is for WAGs to deploy a pre-existing product, licensed either from a commercial provider, or from another part of the Government that has already built a Product in-house. Examples include the RAF’s in-house Project Drake, which has developed complex Bayesian Hierarchical models to assist with identifying and removing training pipeline bottlenecks; these are Crown IP, presumably available to the RN at little-to-no cost, and their capabilities have been briefed to DAIC (and presumably briefed onwards to the NAIC).

Although straightforward to procure, COTS products may not be deployable on MOD or Government systems, and so their use may be restricted to data at OFFICIAL or OFFICIAL SENSITIVE only. Clearly, products developed or deployed by other parts of MOD or National Security may go to higher classifications and be accessible from MODNet or higher systems. COTS products are usually available on a pay-as-you-go, monthly, or per-user basis, usually in the realm of circa £200 per user, per month, providing a fast, risk-free way to understand whether they are valuable enough to keep using longer-term.
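To put the circa £200 per user, per month figure in context against the circa £125k delegated budget proposed earlier, a rough calculation helps. The team size (six users) and trial lengths (three and twelve months) below are illustrative assumptions, not figures from any real WAG.

```python
def licence_cost(users: int, months: int, per_user_per_month: float = 200.0) -> float:
    """Total licence cost in pounds for a per-user, per-month COTS product."""
    return users * months * per_user_per_month


BUDGET = 125_000.0  # the circa £125k delegated multi-year budget

# Assumed scenario: six WAG members trial a product, then keep it for a year.
pilot = licence_cost(users=6, months=3)
full_year = licence_cost(users=6, months=12)

print(f"3-month pilot: £{pilot:,.0f} ({pilot / BUDGET:.1%} of budget)")
print(f"12-month use:  £{full_year:,.0f} ({full_year / BUDGET:.1%} of budget)")
```

Under these assumptions a three-month pilot costs £3,600, under 3% of the budget, and a full year costs £14,400, which leaves most of the pot available for contractor support or follow-on phases if the pilot proves its value.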

Contractor-supported Product. In this scenario, deployment is more complex; for example, the product needs to deploy onto MOD infrastructure to allow sensitive data to be accessed. In this case, some expense is required, but as pre-existing COTS, the product should be relatively cheap to deploy as most of the investment has already been made by the supplier. This option should allow use up to SECRET but, again, users are limited to those use-cases offered by the commercial market. These are likely to be focused on improving maintenance and the analysis of written or financial information. The DAIC’s upcoming ‘LLMs for MOD’ project is likely to be an example of a Contractor-supported Product; MOD users will be able to apply for API access to different Large Language Model (LLM) products, hosted on MOD infrastructure, to solve their use-cases. Contractors will process underlying data to allow LLMs to access it, create the API, and provide ongoing API connectivity support.

Service built in-house. If no product exists, then there is an opportunity to build a low-code solution in DDAP or MODCloud and make it accessible through an internal app. Some contractor support may be required, particularly to provide unique expertise the team cannot provide themselves (noting that all three Services may have Digital expertise available via upcoming specialist Reserve branches, with specialist individuals available at a fraction of the cost of their civilian equivalent day rates).¹² Defense Digital’s ‘Enhanced Data Teams’ service provides a call-off option for contractors to do precisely this for a short period of time. It is likely that these will not, initially, deploy sophisticated data analysis or AI techniques, but sufficient value may be created with basic data analytics. In any event, the lessons learnt from building a small, relatively unsophisticated in-house service will provide sufficient evidence to ascertain whether a full, contractor-built AI service will provide value for money, if built. Project Kraken is a good example of this; while Foundry is itself a product and bought under license, it is hosted in MOD systems and allows RN personnel to build their own data services within it.

Service built by contractors. Here, the problems are so complex, or unique to Defense, that no COTS product exists. Additionally, the degree of work is so demanding that Service Personnel could not undertake this work themselves. In this case, WAGs should not be deployed. Instead, these £100k+ programs should remain the purview of Defense Digital or the DAIC and aim to provide AI Building Blocks that empower WAGs to do AI work. In many cases, these large service programs provide cheap, reproducible products that the rest of Defense can leverage. For example, the ‘LLMs for MOD’ Service will result in relatively cheap API Products, as explained above. Additionally, the British Army is currently tendering for an AI-enabled system that can read the many hand- and type-written texts within the Army archives. This negates the need for human researchers to spend days searching for legally required records that can now be found in seconds. Once complete, this system could offer itself as a Product that can ingest complex documents from the rest of the MOD at relatively low cost. This should negate the need for the RN to pay their own 7-figure sums to create standalone archive scanning services. To enable this kind of economy of scale, NAIC could act as a liaison with these wider organizations. Equipped with a ‘shopping list’ of RN use cases, it could quickly deploy tools purchased by the Army, RAF or Defense Digital across the RN.

Finding the Time

How can WAG members find the time to do the above? By delegating budget control down to the lowest level, and focusing predominantly on buying COTS products, the amount of time required should be relatively minimal; in essence, it should take the same amount of time as buying something online. Some work will be required to understand user stories and workflow design, but much of this will already be in the heads of WAG members. Imminent widespread MOD LLM adoption should, in theory, quickly reduce the amount of time spent across Defense on complex, routine written work (reviewing reports, personnel appraisals, post-exercise or deployment reports or other regular reporting).¹³ This time could be used to enable WAGs to do their work. Indeed, identifying where best to deploy LLMs across workflows is likely to be the first role of WAGs, as soon as the ‘LLMs for MOD’ program reaches IOC. Counter-intuitively, restricting the amount of time available to do this work automatically focuses attention on solutions that are clearly valuable; solutions that save no time are, by default, less likely to be worked on, or have money spent on them.

Conclusions

The RN runs the risk of spreading the NAIC’s budget too thinly, in its attempt to ‘jumpstart’ use of AI across Business as Usual disciplines. By contrast, users should be encouraged to form Workflow Automation Guilds across FLCs. Supported by a senior sponsor, knowledgeable members of the Reserves, the NAIC and one-on-one time with the DAIC, WAGs could instead focus on the COTS solution, or pre-existing Crown IP, that will best solve their problem. Budget responsibilities should be delegated down too, thereby enabling access to existing, centralized pools of support, such as the Enhanced Data Teams program, DDAP, or the upcoming ‘LLMs for MOD’ API Service. In this way, projects are more likely to succeed, as they will have demonstrated value from the very start and will have been co-developed by the very users that deploy them. The speed at which AI and data services are becoming easier to use is reflected by the RN’s Kraken team, while the need to trust low-level officers and junior rates is borne out by the success currently being enjoyed by both the USAF and USN with their own complex AI deployments.

Prior to leaving full-time service, Lieutenant Commander Francis Heritage, Royal Naval Reserve, was a Principal Warfare Officer and Fighter Controller. Currently an RNR GW officer, he works at the Defence arm of Faculty, the UK’s largest independent AI company. Lt Cdr Heritage is now deployed augmenting Commander United Kingdom Strike Force (CSF).

The views expressed in this paper are the author’s, and do not necessarily represent the official views of the MOD, the Royal Navy, RNSSC, or any other institution.

References

1. ‘Discovery’ is the first of 5 stages in the UK Government Agile Project Delivery framework, and is followed by Alpha, Beta, Live and Retirement. Each stage is designed to allow the overall program to ‘fail fast’ if it is discovered that benefits will not deliver sufficient value.

2. Author’s observations.

3. Volvo and the US commodities group Bureau Veritas both have Commercial off the Shelf products available for solving this particular problem.

4. Source: https://www.v7labs.com/pricing accessed 10 Apr 2024.

5. Source: https://assets.publishing.service.gov.uk/media/65bb75fa21f73f0014e0ba51/Defence_AI_Playbook.pdf

6. AI systems rely on machine learning frameworks and libraries; Docker packages these components together into reproducible ‘containers’, simplifying deployment. Kubernetes builds on Docker, providing an orchestration layer for automating deployment and management of containers over many machines.

7. Defence Unicorns podcast, 5 Mar 2024.

8. Source: Navy’s new ‘Project OpenShip’ aims to swiftly apply AI to data captured by vessels at sea | DefenseScoop.

9. https://www.afcea.org/signal-media/navys-junior-officers-lead-way-innovation.

10. The author knows of at least 3 AI projects across MOD aimed at automating operational planning and another 3 aiming to automate satellite imagery analysis.

11. API stands for Application Programming Interface, a documented way for software to communicate with other software. By purchasing access to an API (usually on a ‘per call’ or unlimited basis) a user can take information delivered by an API and combine it with other information before presenting it to a user. Examples include open-source intelligence, commercial satellite imagery, meteorological data, etc.

12. Army Reserve Special Group Information Service, RNR Information Exploitation Branch and RAF Digital Reserves Consultancy. RNR IX and RAFDRC are both TBC.

13. Worldwide, Oliver Wyman estimates Generative AI will save an average of 2 hours per person per week; likely to be higher for office-based roles: https://www.oliverwymanforum.com/content/dam/oliver-wyman/ow-forum/gcs/2023/AI-Report-2024-Davos.pdf p.17.

Featured Image: The Operations Room of the carrier HMS Queen Elizabeth during an exercise in 2018. (Royal Navy photo)


Alexa, Write my OPORD: Promise and Pitfalls of Machine Learning for Commanders in Combat

September 20, 2023 Guest Author

By Jeff Wong

Introduction

Jump into the time machine and fast forward a few years into the future. The United States is at war, things are not going well, and the brass wants to try something new. John McCarthy,¹ a Marine lieutenant colonel whose knowledge of technology is limited to the Microsoft Office applications on his molasses-slow government laptop, mulls over his tasks, as specified by his two-star boss, the commander of Joint Task Force 58:

1. Convene a joint planning group to develop a plan for the upcoming counteroffensive. (Check.)

2. Leverage representatives from every staff section and subject-matter experts participating virtually from headquarters in Hawaii or CONUS. (Roger.)

3. Use an experimental machine-learning application to support the planning and execution of the operation. (We’re screwed.)

Nearly 7,000 miles from a home he might never see again, McCarthy considered two aphorisms. The first was from Marcus Aurelius, the second-century Roman emperor and stoic thinker: “Never let the future disturb you. You will meet it, if you have to, with the same weapons of reason which today arm you against the present.”² The second was from Mike Tyson, the fearsome boxer, convicted felon, and unlikely philosopher: “Everybody has a plan until they get punched in the mouth.”³

Artificial intelligence (AI), including large-language models (LLMs) such as ChatGPT, is driving a societal revolution that will impact all aspects of life, including how nations wage war and secure their economic prosperity. “The ability to innovate faster and better – the foundation on which military, economic, and cultural power now rest – will determine the outcome of great-power competition between the United States and China,” notes Eric Schmidt, the former chief executive officer of Google and chair of the Special Competitive Studies Project.⁴ The branch of AI that uses neural networks to mimic human cognition — machine learning (ML) — offers military planners a powerful tool for planning and executing missions with greater accuracy, flexibility, and speed. Senior political and military leaders in China share this view and have made global technological supremacy a national imperative.⁵

Through analyzing vast amounts of data and applying complex algorithms, ML can enhance situational awareness, anticipate threats, optimize resources, and adapt to generate more successful outcomes. However, as ML becomes more widespread and drives technological competition against China, American military thinkers must develop frameworks to address the role of human judgment and accountability in decision-making and the potential risks and unintended consequences of relying on automated systems in war.

To illustrate the promise and pitfalls of using ML to support decision-making in combat, imagine the pages of a wall calendar flying to some point when America must fight a war against a peer adversary. Taking center stage in this fictional journey are two figures: The first is McCarthy, an officer of average intelligence who only graduated from the Eisenhower School for National Security and Resource Strategy thanks to a miraculous string of B+ papers at the end of the academic year. The second is “MaL,” a multimodal LLM that is arguably the most capable – yet most immature – member of McCarthy’s planning team. This four-act drama explores how McCarthy and his staff scope the problems they want MaL to help them solve, how they leverage ML to support operational planning, and how they use MaL to support decision-making during execution. The final act concludes by offering recommendations for doctrine, training and education, and leadership and policies to better harness this capability in the wars to come.

Act One: “What Problems Do We Want This Thing to Solve?”

The task force was previously known as “JTF-X,” an experimental formation that blended conventional legacy platforms with AI and autonomous capabilities. As the tides of war turned against the United States and its allies, the Secretary of Defense (SecDef) pressured senior commanders to expand the use of AI-enabled experimental capabilities. Rather than distribute its capabilities across the rest of the joint force, the SecDef ordered JTF-X into duty as a single unit to serve as the theater reserve for a geographic combatant commander. “Necessity breeds innovation… sometimes kicking and screaming,” she said.

Aboard the JTF command ship in a cramped room full of maps, satellite imagery, and charts thick with unit designators, McCarthy stared at a blinking cursor on a big-screen projection of MaL. Other members of his OPT looked over his shoulder as he impulsively typed out, “Alexa, write my OpOrd.” Undulating dots suggested MaL was formulating a response before it replied, “I’m not sure what you’re asking for.”

The JPG chief, an Air Force senior master sergeant, voiced what the humans in the room were thinking: “Sir, what problems do we want this thing to solve?”

The incredible capacity of ML tools to absorb, organize, and generate insights from large volumes of data suggests that they hold great promise to support operational planning. Still, leaders, planners, and ML tool developers must determine how best to harness the capability to solve defined military problems. For instance, the Ukrainian military uses AI to collect and assess intelligence, surveillance, and reconnaissance (ISR) data from numerous sources in the Russia-Ukraine conflict.⁶ But as Ukrainian planners are probably discovering today, they must do more than throw current ML tools at problem sets. Current LLMs fall short of desired capabilities to help human planners infer and make sense within the operating environment. Military professionals must tailor the problem sets to match the capabilities and limitations of the ML solutions.

Although tools supporting decision advantage comprised a small fraction of the 685 AI-related projects and accounts within the DoD as of 2021, existing efforts align with critical functions such as the collection and fusion of data from multiple domains (akin to the DoD’s vision for Joint All-Domain Command and Control (JADC2)); multidomain decision support for a combatant command headquarters; automated analysis of signals across the electromagnetic spectrum; and location of bots to support defensive cyber operations.⁷ There are numerous tools with various tasks and functions, but the crux of the problem will be focusing the right tool or set of tools on the appropriate problem sets. Users need to frame planning tasks and precisely define the desired outputs for a more immature LLM capability, such as the fictional MaL.

McCarthy mapped out a task organization for the JPG to align deliverables with available expertise. The team chief scribbled dates and times for upcoming deliverables on a whiteboard, including the confirmation brief for the commander in 24 hours. An Army corporal sat before a desktop computer to serve as the group’s primary interface with MaL. To help the group develop useful queries and troubleshoot, MaL’s developers in Hawaii participated via a secure video teleconference.

MaL was already able to access volumes of existing data – operations and contingency plans, planning artifacts from previous operations, ISR data from sensors ranging from national assets to stand-in forces in theater, fragmentary orders, and mountains of open-source information.

Act Two: MaL Gets Busy to the ‘Left of the Bang’

Some observers believe that ML capabilities can flatten the “orient” and “decide” steps of the late military theorist John Boyd’s observe-orient-decide-act (OODA) decision loop, expanding a commander’s opportunity to understand context, gain an appreciation of the battlespace, frame courses of action, and explore branches and sequels to inform decisions.⁸ Nevertheless, the greater capacity that ML tools provide does not eliminate the need for leaders to remain intimately involved in the planning process and work with the platform to define decision points as they weigh options, opportunities, risks, and possible outcomes.

Planners should examine frameworks such as the OODA loop, intelligence preparation of the battlespace (IPB), and the Joint Operation Planning Process (JOPP) to guide where they could apply ML to maximum effect. To support IPB, ML can automate aspects of collection and processing, such as identifying objects, selecting potential targets for collection, and guiding sensors. ML capabilities for image and audio classification and natural language processing are already in commercial use. They support essential functions for autonomous vehicles, language translation, and transit scheduling. These commercial examples mirror military use cases, as nascent platforms fuse disparate data from multiple sources in all five warfighting domains.⁹

MaL’s digital library included the most relevant intelligence reports; adversary tactics, techniques, and procedures summaries; videos of possible target locations taken by uncrewed systems; raw signals intelligence; and assessments of the enemy orders of battle and operational tendencies from the early months of the conflict. The corpus of data also included online news stories and social media postings scraped by an all-source intelligence aggregator.

McCarthy said, “As a first step, let’s have MaL paint the picture for us based on the theater-level operational context, then create an intelligence preparation of the battlespace presentation with no more than 25 PowerPoint slides.” After the clerk entered the query, the graphic of a human hand drumming its fingers on a table appeared.

Two minutes later, MaL saved a PowerPoint file on the computer’s desktop and announced in a metallic voice, “Done, sir, done!” McCarthy and his J-2 reviewed the IPB brief, which precisely assessed the enemy, terrain, weather, and civil considerations. MaL detailed the enemy’s most likely and dangerous courses of action, discussed adversary capabilities and limitations across all domains, and provided a target-value analysis aligning with the most recent intelligence. The J-2 reviewed the product and said, “Not bad.” She added, “Should I worry about losing my job?”

“Maybe I should worry about losing mine,” McCarthy said. “Let’s go through the planning process with MaL and have it generate three COAs based on the commander’s planning guidance and intent statement.”

American military planning frameworks – JOPP and its nearly identical counterparts across the services – are systematic processes that detail the operational environment, the higher commander’s intent, specified and implied tasks, friendly and enemy COAs, and estimates of supportability across all warfighting functions. They benefit the joint force by establishing uniform expectations about the information needed to support a commander’s estimate of the situation. However, current planning frameworks may hinder decision-making because a commander and his staff may become too focused on the process instead of devoting their energies and mental bandwidth to quickly orienting themselves to a situation and making decisions. Milan Vego, a professor of operational art at the U.S. Naval War College, writes of “a great temptation to steadily expand scope and the amount of information in the (commander’s) estimate. All this could be avoided if commanders and staffs are focused on the mental process and making a quick and good decision.”¹⁰

An ML-enabled decision-support capability could help planners stay above process minutiae by suggesting options for matching available weapon systems to targets, generating COAs informed by real-time data, and assessing the likelihood of success for various options in a contemporary battlespace, which features space and cyberspace as contested warfighting domains.¹¹

MaL developed three unacceptable COAs, which variously exceeded the unit’s authorities as outlined in the JTF’s initiating directive or extended kinetic operations into the adversary’s mainland, risking escalation.

McCarthy rubbed his face and said, “Time for a reboot. Let’s better define constraints and restraints, contain operations in our joint operational area, and have it develop assessments for risks to mission and force.” He added, “And this time, we’ll try not to provoke a nuclear apocalypse.”

The planning team spent several more hours refining their thoughts, submitting prompts, and reviewing the results. Eventually, MaL generated three COAs that were feasible, acceptable, complete, and supportable. MaL tailored battlespace architecture, fire support coordination measures, and a detailed sustainment plan for each COA and mapped critical decision points throughout the operation. MaL also helped the JTF air officer develop three distinct aviation support plans for each COA.

The planning team worked with MaL to develop staff estimates for each COA. The logistics and communications representatives were surprised at how quickly MaL produced coherent staff estimates following a few hours of queries. The fires, intelligence, and maneuver representatives similarly worked with MaL to develop initial fire support plans to synchronize with the group’s recommended COA.

McCarthy marveled at MaL’s ability to make sense of large amounts of data, but he was also surprised at the ML platform’s tendency to misinterpret words. For instance, it continually conflated the words “delay,” “disrupt,” and “destroy,” which were distinct tactical tasks with different effects on enemy formations. The planning team reviewed MaL’s COA overviews and edited the platform’s work. The staff estimates were detailed and insightful but still required corrections.

During the confirmation brief, the JTF commander debated several details about MaL’s outputs and risk assessments of the planning team’s recommended COA. McCarthy said, “Respectfully, Admiral, this is your call. MaL is one tool in your toolbox. We can tweak the COA to suit your desires. We can also calibrate the level of automation in the kill chain based on your intent.”

After a moment, the admiral said, “I’ll follow the planning team’s recommendation. Notice that I didn’t say MaL’s recommendation because MaL is just one part of your team.”

Act Three: MaL Lends a Hand to the Right of the ‘Bang’

Contemporary military thinkers believe that ML-enabled tools could improve decision-making during the execution of an operation, leveraging better situational awareness to suggest more effective solutions for problems that emerge in a dynamic battlespace. However, critics argue that developing even a nascent ML-enabled capability is impossible because of the inherent limits of ML-enabled platforms to generate human-like reasoning and infer wisdom from incomplete masses of unstructured data emanating from a 21st-century battlefield. Some are also concerned about the joint force’s ability to send and receive data from a secure cloud subject to possible malicious cyber activities or adversarial ML. Prussian military thinker Carl von Clausewitz reminds practitioners of the operational art that “War is the realm of uncertainty; three-quarters of the factors on which action in war is based are wrapped in a fog of greater or lesser uncertainty.”¹² Technological solutions such as ML-enabled capabilities can temporarily lift parts of this fog for defined periods. Still, users must understand the best use of these capabilities and be wary of inferring certainty from their outputs.

Emerging capabilities suggest that ML can augment and assist human decision-making in several ways. First, ML systems can enhance situational awareness by establishing and maintaining a real-time common operational picture derived from multiple sensors in multiple domains. Greater battlespace awareness provides context to support better decision-making by commanders and more accurate assessments of events on the battlefield. Second, ML can improve the effectiveness and efficiency of kill-chain analytics by quickly matching available sensors and shooters with high-value or high-payoff targets.¹³ This efficiency is essential in the contemporary battlespace, where ubiquitous sensors can quickly detect a target based on a unit’s emissions in the electromagnetic spectrum or signatures from previously innocuous activities like an Instagram post or a unit’s financial transaction with a host-nation civilian contractor.

Indeed, some AI observers in the U.S. defense analytic community argue that warfighters must quickly adopt ML to maintain a competitive edge against the People’s Liberation Army, which some observers believe will favor the possible gains in warfighting effectiveness and efficiency over concerns about ethical issues such as implementing human-off-the-loop AI strategies.¹⁴ ML-enabled feedback constructs will enhance the control aspects of command and control to employ a large, adaptable, and complex multidomain force.¹⁵

It was now D+21. JTF-58 achieved its primary objectives, but the campaign did not unfold as intended. During the shaping phase of the operation, several high-value targets, including mobile anti-ship cruise missile batteries, escaped kinetic targeting efforts, living to fight another day and putting U.S. naval shipping at risk for the latter phases of the campaign. MaL failed to update the developers’ advertised “dynamic operating picture” despite attempts by forward-deployed sensors and reconnaissance teams to report their movement. Incredibly, the DoD still did not have access to data from some weapon systems due to manufacturer stipulations.¹⁶

MaL’s developers insisted that the forward-deployed sensors should have had enough computing power to push edge data to the JTF’s secure cloud. The CommO speculated that environmental conditions or adversary jamming could have affected connectivity. McCarthy shook his head and said, “We need to do better.”

MaL performed predictably well in some areas to the right of the bang. The task force commander approved using MaL to run networked force protection systems, including a Patriot battery that successfully intercepted an inbound missile and electronic-warfare (EW) systems that neutralized small unmanned aerial systems targeting a fuel farm. MaL’s use in these scenarios did not stretch anyone’s comfort level since these employment methods were barely different from the automation of systems like the U.S. Navy’s Phalanx close-in weapon system (CIWS), which has detected, evaluated, tracked, engaged, and conducted kill assessments of airborne threats for more than four decades.¹⁷

MaL’s communications and logistics staff estimates were precise and valuable for the staff. The CommO adjusted the tactical communications architecture based on MaL’s predictions about enemy EW methods and the effects of weather and terrain on forward maneuver elements. Similarly, the LogO worked with the OpsO to establish additional forward-arming and refueling points (FARPs) based on MaL’s fuel and munitions consumption projections.

In the middle of the operation, the task force commander issued a fragmentary order to take advantage of an unfolding opportunity. MaL leveraged open-source data from host-nation news websites and social media postings by enemy soldiers to inform battle damage assessment of kinetic strikes. Some of that information was fake and skewed the assessment until the intelligence officer corrected it by comparing it with satellite imagery and human intelligence reporting.

As with any emerging capability, commanders and their staffs must consider the risks of integrating ML into the planning, execution, and assessment of operations. One of the risks is inherent in forecasting, as the ML platform artificially closes the feedback loop to a decision-maker sooner than one would expect during real-world operations. Retired U.S. Navy Captain Adam Aycock and Naval War College professor William Glenney IV assert that this lag might make ML outputs moot when a commander makes a decision. “The operational commander might not receive feedback, and the effects of one move might not be recognized until several subsequent moves have been made,” Aycock and Glenney write. “Furthermore, a competent enemy would likely attempt to mask or manipulate this feedback. Under such circumstances … it is difficult to ‘learn’ from a first move in order to craft the second.”¹⁸

Another risk is that the data used by ML platforms are, in some way, inaccurate, incomplete, or unstructured. Whether real or training, flawed data will lead to inaccurate outputs and likely foul an ML-enabled tool’s assessment of the environment and COA development. “Unintentional failure modes can result from training data that do not represent the full range of conditions or inputs that a system will face in deployment,” write Wyatt Hoffman and Heeu Millie Kim, researchers with the Center for Security and Emerging Technology at Georgetown University. “The environment can change in ways that cause the data used by the model during deployment to differ substantially from the data used to train the model.”¹⁹
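The mismatch between training and deployment data that Hoffman and Kim describe can be watched for with simple statistics. The following is a minimal sketch, not a method from the source: it flags a feature whose deployment mean has moved far from its training mean, measured in training standard deviations. The function names, the threshold of 2.0, and the sample values are illustrative assumptions.

```python
from statistics import mean, stdev

def mean_shift_score(train, deployed):
    """Gap between deployment and training means, in training standard deviations."""
    spread = stdev(train)
    if spread == 0:
        raise ValueError("training data has no variation")
    return abs(mean(deployed) - mean(train)) / spread

def has_drifted(train, deployed, threshold=2.0):
    """Flag drift when the shift exceeds the (assumed) threshold."""
    return mean_shift_score(train, deployed) > threshold

# Hypothetical readings of one input feature.
train = [10.0, 11.0, 9.0, 10.5, 9.5]
similar = [10.2, 9.8, 10.1]    # deployment data that resembles training
shifted = [14.0, 15.0, 13.5]   # deployment data after conditions changed

print(has_drifted(train, similar))
print(has_drifted(train, shifted))
```

A check like this only catches a shift in the average of a single feature. It says nothing about whether the model's outputs are still correct, so it would supplement human review rather than replace it.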

The corollary to inaccurate data is adversarial ML, in which an enemy manipulates data to trick an ML system, degrade or disrupt optimal performance, and erode users’ trust in the capability. Adversarial ML techniques can trick an ML model into misidentifying potential targets or mischaracterizing terrain and weather. In one notable demonstration of adversarial ML, researchers at the Chinese technology giant Tencent placed stickers on a road to trick the lane recognition system of a Tesla semi-autonomous car, causing it to swerve into the wrong lane.²⁰ Just the possibility of a so-called “hidden failure mode” could exacerbate fears about the reliability of any ML-enabled system. “Operators and military commanders need to trust that ML systems will operate reliably under the realistic conditions of a conflict,” Hoffman and Kim write. “Ideally, this will militate against a rush to deploy untested systems. However, developers, testers, policymakers, and commanders within and between countries may have very different risk tolerances and understandings of trust in AI.”²¹
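The fragility behind adversarial ML can be shown on a toy model. The sketch below is a textbook-style illustration, not a reconstruction of the Tencent demonstration: it nudges each input of a small linear classifier by a bounded amount in the direction that lowers its score, which flips the output. The weights, bias, input values, and epsilon are all made-up assumptions.

```python
def score(weights, bias, features):
    """Linear classifier score; a positive score means class 'A'."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def sign(value):
    return (value > 0) - (value < 0)

def perturb(weights, features, epsilon):
    """Move each feature by at most epsilon in the direction that lowers the score."""
    return [x - epsilon * sign(w) for w, x in zip(weights, features)]

# Hypothetical model and input.
weights = [2.0, -1.0, 0.5]
bias = -0.5
clean = [0.6, 0.2, 0.4]

altered = perturb(weights, clean, epsilon=0.25)

print(score(weights, bias, clean) > 0)    # classification of the clean input
print(score(weights, bias, altered) > 0)  # classification after the small change
```

No feature changes by more than 0.25, yet the classification flips. Deployed models are far more complex, but the same sensitivity to small, deliberate input changes is the concern the authors raise.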

Act Four: Hotwash

McCarthy took advantage of an operational pause to conduct a hotwash. Over lukewarm coffee and Monsters, the conversation gravitated toward how they could use MaL better. The group scribbled a few recommendations concerning integrating ML into doctrine, training and education, and leadership and policies until the ship’s 1-MC sounded general quarters.

Doctrine: To realize the utility of ML, military leaders should consider two changes to existing doctrine. First, doctrine developers and the operational community should consider the concept of “human command, machine control,” in which ML would use an auction-bid process akin to ride-hailing applications to advertise and fulfill operational requirements across the warfighting functions. Under this construct, a commander publishes or broadcasts tasks, including constraints, priorities, metrics, and objectives. “A distributed ML-enabled control system would award winning bids to a force package that could execute the tasking and direct the relevant forces to initiate the operation,” write naval theorists Harrison Schramm and Bryan Clark. Forces or platforms that accept the commander’s bid conduct (or attempt to conduct) the mission and report the results “when communications availability allows.”²² This concept supports mission-type orders/mission command and allows C2 architectures to flex to instances and areas subject to low-bandwidth constraints.²³
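The auction-bid construct can be illustrated with a toy allocation routine. This is a minimal sketch under stated assumptions, not Schramm and Clark's design: the `Task` and `Bid` types, the capability and cost fields, and the lowest-cost award rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    name: str
    required_capability: str
    max_cost: float  # the commander's constraint on acceptable cost

@dataclass(frozen=True)
class Bid:
    bidder: str
    capability: str
    cost: float

def award(task, bids):
    """Return the cheapest bid that meets the task's constraints, or None."""
    eligible = [
        b for b in bids
        if b.capability == task.required_capability and b.cost <= task.max_cost
    ]
    return min(eligible, key=lambda b: b.cost, default=None)

# Hypothetical task broadcast by the commander, and the bids received.
task = Task("resupply forward site", "airlift", max_cost=10.0)
bids = [
    Bid("unit_a", "airlift", 8.0),
    Bid("unit_b", "airlift", 6.5),
    Bid("unit_c", "sealift", 3.0),   # cheapest, but wrong capability
    Bid("unit_d", "airlift", 12.0),  # right capability, exceeds the constraint
]

winner = award(task, bids)
print(winner.bidder if winner else "no eligible bid")
```

The commander's role is confined to defining the task and its constraints, while the award logic runs without further input. That division is the essence of “human command, machine control.” A fielded system would also need to handle priorities, competing tasks, and delayed reporting.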

Second, doctrine developers should adjust joint, service, and domain-centric planning processes to account for additional planning aids, such as LLMs, modeling and simulation, and digital twins, which can more deeply explore COAs, branches, and sequels and accelerate understanding of threats and the operating environment. Explicitly changing planning doctrine to account for these emerging capabilities will encourage their use and emphasize their criticality to effective planning.

Training and Education: Tactical units must train and continually develop ML technical experts capable of conducting on-the-spot troubleshooting and maintenance of the tool. Meanwhile, the services should develop curricula for budding junior leaders (corporals, sergeants, lieutenants, ensigns, and captains) that introduce them to machine-learning tools applicable to their warfighting domains, provide best practices for generating productive outputs, and articulate risks – and risk mitigations – due to skewed data and poor problem framing.

Best practices should also be documented and shared across the DoD. Use of ML capabilities should become part of a JPG’s battle drill, featuring a designated individual whose primary duty is to serve as the human interface with a decision-support tool such as MaL. Rather than work from scratch at the start of every planning effort, JPGs should have a list of queries readily available for a range of scenarios that can inform a commander’s estimate of the situation, subsequent planning guidance, and formulation of intent based on an initial understanding of the operating environment. Prompts that solicit ML-enabled recommendations on task organization, force composition and disposition, risks to force or mission, targeting, and other essential decisions should be ready for immediate use to speed the JPG’s planning tempo and, ultimately, a unit’s speed of action as well. The information management officer (IMO), who in some headquarters staffs is relegated to updating SharePoint portals, should be the staff’s subject matter expert for managing ML capabilities. IMOs would be the military equivalent of prompt engineers, coaxing and guiding AI/ML models to generate relevant, coherent, and consistent outputs to support the unit’s mission.²⁴
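A ready-made list of queries could be as simple as a set of named templates that a planner fills in at the start of an effort. The following is a minimal sketch of that idea. The template names, wording, and placeholder fields are hypothetical, and it builds only the prompt text, without calling any model.

```python
from string import Template

# Hypothetical library of reusable planning prompts, keyed by purpose.
PROMPT_LIBRARY = {
    "task_organization": Template(
        "Given the commander's intent: $intent\n"
        "Recommend a task organization for $force within $area."
    ),
    "risk_assessment": Template(
        "List the top $count risks to mission and risks to force for $coa, "
        "with one mitigation for each."
    ),
}

def build_prompt(name, **fields):
    """Fill a stored template; raises KeyError if the name or a field is missing."""
    return PROMPT_LIBRARY[name].substitute(**fields)

prompt = build_prompt("risk_assessment", count=3, coa="COA 2")
print(prompt)
```

Because `substitute` refuses to leave a placeholder unfilled, an incomplete query fails before it reaches the tool. That supports the precise problem framing the essay calls for.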

Leadership and Policies: There are implications for senior leaders for warfighting and policy. Within a warfighting context, senior defense leaders must identify, debate, and develop frameworks for how commanders might use ML to support decision-making in wartime scenarios. It seems intuitive to use a multimodal LLM tool such as the fictitious MaL to support IPB, targeting, and kill chain actions, in the same way that campaign models are used to inform combatant commander planning for crises and contingencies.

However, leaders and their staffs must also understand the limitations of such tools to support a commander’s decision-making. “Do not ask for an AI-enabled solution without first deciding what decision it will influence and what the ‘left and right limits’ of the decision will be,” Schramm and Clark warn.²⁵ Likewise, AI might not be the appropriate tool to solve all tactical and operational problems. “Large data-centric web merchants such as Amazon are very good at drawing inferences on what people may be interested in purchasing on a given evening because they have a functionally infinite sample space of previous actions upon which to build the model,” they write. “This is radically different from military problems where the amount of data on previous interactions is extremely small and the adversary might have tactics and systems that have not been observed previously. Making inference where there is little to no data is the domain of natural intelligence.”²⁶

Meanwhile, future acquisition arrangements with defense contractors must provide the DoD with data rights – particularly data generated by purchased weapon systems and sensors – to optimize the potential of ML architecture in a warfighting environment.²⁷ Such a change would require the DoD to work with firms in the defense industrial base to adjudicate disagreements over the right to use, licensing, and ownership of data – each of which might bear different costs to a purchaser.

Epilogue

Technologists and policy wonks constantly remind the defense community that the Department must “fail fast” to mature emerging technologies and integrate them into the joint force as quickly as possible. The same principle should guide the development of AI/ML-enabled warfighting solutions. Commanders and their staffs must understand that this is a capable tool that, if used wisely, can significantly enhance the joint force’s ability to absorb data from disparate sources, make sense of that information, and close kill chains based on an ML tool’s assessment.

If used unwisely, without a solid understanding of what decisions ML will support, the joint force may be playing a rigged game against a peer adversary. ML-enabled capabilities can absorb large amounts of data, process and organize it, and generate insights for humans who work at a relative snail’s pace. However, these nascent tools cannot reason and interpret words or events as a competent military professional can. As strategic competition between the United States and China intensifies over Taiwan, the South China Sea, the Russia-Ukraine war, and other geopolitical issues, American political and military leaders must develop a better understanding of when and how to use ML to support joint force planning, execution, and assessment in combat, lest U.S. service members pay an ungodly sum of the butcher’s bill.

Lieutenant Colonel Jeff Wong, a U.S. Marine Corps reserve infantry officer, studied artificial intelligence at the Eisenhower School for National Security and Resource Strategy, National Defense University in the 2022-2023 academic year. In his civilian work, he plans wargames and exercises for various clients across the Department of Defense.

The views expressed in this paper are those of the author and do not necessarily reflect the official policy or position of the National Defense University, the Department of Defense, or the U.S. Government.

References

1. The fictional hero of this story, John McCarthy, is named after the Massachusetts Institute of Technology researcher who first coined the term “artificial intelligence.” Gil Press, “A Very Short History of Artificial Intelligence,” Forbes, December 30, 2016, https://www.forbes.com/sites/gilpress/2016/12/30/a-very-short-history-of-artificial-intelligence-ai/?sh=51ea3d156fba.

2. Marcus Aurelius, Meditations, audiobook.

3. Mike Berardino, “Mike Tyson Explains One of His Most Famous Quotes,” South Florida Sun-Sentinel, November 9, 2012, https://www.sun-sentinel.com/sports/fl-xpm-2012-11-09-sfl-mike-tyson-explains-one-of-his-most-famous-quotes-20121109-story.html.

4. Eric Schmidt, “Innovation Power: Why Technology Will Define the Future of Geopolitics,” Foreign Affairs, March/April 2023.

5. “Military-Civil Fusion and the People’s Republic of China,” U.S. Department of State, May 2020.

6. Eric Schmidt, “Innovation Power: Why Technology Will Define the Future of Geopolitics,” Foreign Affairs, March/April 2023.

7. Wyatt Hoffman and Heeu Millie Kim, “Reducing the Risks of Artificial Intelligence for Military Decision Advantage,” Center for Security and Emerging Technology Policy Paper (Washington, D.C.: Georgetown University, March 2023), 12.

8. James Johnson, “Automating the OODA Loop in the Age of AI,” Center for Strategic and International Studies, July 25, 2022, https://nuclearnetwork.csis.org/automating-the-ooda-loop-in-the-age-of-ai/.

9. Hoffman and Kim, “Reducing the Risks of Artificial Intelligence for Military Decision Advantage,” 7.

10. Milan Vego, “The Bureaucratization of the U.S. Military Decision-making Process,” Joint Force Quarterly 88, January 9, 2018, https://ndupress.ndu.edu/Publications/Article/1411771/the-bureaucratization-of-the-us-military-decisionmaking-process/.

11. Hoffman and Kim, 7.

12. Carl von Clausewitz, On War, ed. and trans. Michael Howard and Peter Paret (Princeton: Princeton University Press, 1976), 101.

13. Hoffman and Kim, 7.

14. Elsa Kania, “‘AI Weapons’ in China’s Military Innovation,” Brookings Institution, April 2020.

15. Harrison Schramm and Bryan Clark, “Artificial Intelligence and Future Force Design,” in AI at War (Annapolis, Md.: Naval Institute Press, 2021), 240-241.

16. Josh Lospinoso, Testimony on the State of Artificial Intelligence and Machine Learning Applications to Improve Department of Defense Operations before the Subcommittee on Cybersecurity, U.S. Senate Armed Services Committee, April 19, 2023, https://www.armed-services.senate.gov/hearings/to-receive-testimony-on-the-state-of-artificial-intelligence-and-machine-learning-applications-to-improve-department-of-defense-operations. “Today, the Department of Defense does not have anywhere near sufficient access to weapon system data. We do not – and in some cases, due to contractual obligations, the Department cannot — extract this data that feeds and enables the AI capabilities we will need to maintain our competitive edge.”

17. MK15 – Phalanx Close-In Weapon System (CIWS), U.S. Navy, September 20, 2021, https://www.navy.mil/resources/fact-files/display-factfiles/article/2167831/mk-15-phalanx-close-in-weapon-system-ciws/.

18. Adam Aycock and William Glenney IV, “Trying to Put Mahan in a Box,” in AI at War, 269-270.

19. Hoffman and Kim, 8.

20. Ibid., 8-9.

21. Ibid., 11.

22. Schramm and Clark, “Artificial Intelligence and Future Force Design,” 239-241.

23. AI at War, 241.

24. Craig S. Smith, “Mom, Dad, I Want To Be A Prompt Engineer,” Forbes, April 5, 2023, https://www.forbes.com/sites/craigsmith/2023/04/05/mom-dad-i-want-to-be-a-prompt-engineer/amp/.

25. AI at War, 247-248.

26. AI at War, 248.

27. Heidi M. Peters, “Intellectual Property and Technical Data in DoD Acquisitions,” Congressional Research Service In-Focus, IF12083, April 22, 2022, https://crsreports.congress.gov/product/pdf/IF/IF12083.

Featured Image: PHILIPPINE SEA (Sept. 22, 2020) Cpl. Clayton A. Phillips, a network administrator with Marine Air Control Group 18 Detachment, 31st Marine Expeditionary Unit (MEU), and a native of Beech Bluff, Tennessee, tests the connectivity of the Networking On-the-Move Airborne (NOTM-A) communications system during flight operations from the amphibious assault ship, USS America (LHA 6). (U.S. Marine Corps photo by Lance Cpl. Brienna Tuck)


@Channel – A Dialogue Concerning Kill Webs

February 3, 2022 Guest Author

By The Naval Constellation

The Naval Constellation is an online, unofficial forum resident on the team communication application Slack. The group includes Navy, Marine Corps, and Coast Guard officers, enlisted, and civilians, and serves as a place to break down organizational silos and facilitate conversation on topics ranging from innovation to strategy, emerging technology, and more. While it has existed and grown for six years, we are now partnering with CIMSEC on an enduring public series, “@Channel,” to be published as the conversation warrants it. Contact information to join future conversations like this can be found at the bottom of the article.

The below is a discussion from the Constellation between 11 participants identified by first name. It has been lightly edited for clarity. All content is submitted with the participants’ consent.

Shane: @channel, Kyle and I are discussing how the Joint Force ought to think about warfare as disabling or breaking down kill webs rather than kill chains; that these webs should be thought of almost as complex adaptive systems (CASs), not multiple reducible chains of effects. What’s the best mental model for attacking or defending against kill webs in war?

Jason: Kill chains are single path, unidirectional, and fairly fragile. Kill webs are multipath, multi-directional, and resilient. To take out a web you have to affect the node and surrounding nodes… That means you need to affect proximal relationships, not just specific targets. Counterintuitively, precision fires don’t work on kill webs. You need an area of effect and less precise methods.

Throw a rock through the spider web, don’t clip individual strands.
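Jason's distinction can be illustrated with a toy graph model. The sketch below is only an illustration under stated assumptions: the node names and both topologies are hypothetical, and "area of effect" is modeled simply as removing a node together with its immediate neighbors. It checks whether a sensor can still reach a shooter after each kind of attack.

```python
from collections import deque


def make_graph(edges):
    """Build an undirected adjacency map from a list of (a, b) links."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def reachable(adj, start, goal, removed):
    """Breadth-first search from start to goal, skipping removed nodes."""
    if start in removed or goal in removed:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in adj[node]:
            if nxt not in removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Hypothetical kill chain: one path from sensor to shooter.
chain = make_graph([("sensor", "relay"), ("relay", "c2"), ("c2", "shooter")])

# Hypothetical kill web: three parallel paths with cross-links in the middle.
web = make_graph([
    ("sensor", "r1"), ("sensor", "r2"), ("sensor", "r3"),
    ("r1", "h1"), ("r2", "h2"), ("r3", "h3"),
    ("h1", "h2"), ("h2", "h3"),
    ("h1", "w1"), ("h2", "w2"), ("h3", "w3"),
    ("w1", "shooter"), ("w2", "shooter"), ("w3", "shooter"),
])

# Precision strike: remove one node only.
chain_survives_precision = reachable(chain, "sensor", "shooter", {"relay"})
web_survives_precision = reachable(web, "sensor", "shooter", {"h2"})

# Area effect: remove the same node plus its proximal relationships.
area = {"h2"} | web["h2"]
web_survives_area = reachable(web, "sensor", "shooter", area)

print(chain_survives_precision, web_survives_precision, web_survives_area)
```

In this toy case the chain breaks when a single node is lost, the web routes around the same loss, and the web only breaks when the node and its neighbors go down together. Real networks would need far richer modeling of bandwidth, latency, and reconstitution.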

Figure 1 – Depiction of DARPA’s Adapting Cross-Domain Kill-Webs (ACK) Program.

Chris: I think of kill chains as being the decision process. We are hindered by a chain of command and decision process that is fragile and does have a single path. Kill webs are what Jason described above, but they are aspirational right now. I’d argue that most of our systems are not part of kill webs and are still really fragile. Are we that adaptive? Do our systems actually work that way outside of a very structured exercise?

And if they are, will our Command and Control (C2) process actually let us operate like that?

Jason: Kill webs take a lot of energy to maintain. They are resilient, but if disrupted are more difficult to restore. I recommend a hybrid approach.

Kyle: Would you describe China’s A2AD networks as kill webs? Is there any difference?

Jason: Certainly. Highly resilient, taking out a single node won’t affect the web, but a significant enough, broad disruption, and the system would be hard to reconstitute.¹

*Figure 2 – An interpretation of the kill web-centric Mosaic Warfare concept for a Chinese audience, published in the April 2021 edition of the PLA journal Aero Weaponry.*

Chris: Jason, do you design a kill web so that it fails to a chain? Or do you choose what systems are part of a web and which are a chain?

Jason: It needs to fail to chains. That’s essentially what graceful degradation is… The reduction of nodal complexity until the issue is solved. Isolate the system, identify the problem, and reconstitute the system. For example, we should actually practice moving from battlegroup, coordinated operations to a unit, disconnected independent ops, and back again. The back again shouldn’t be “the network is back.” That’s unrealistic.

Instead, it’s independent ops, becoming two ships talking, becoming a group of ships coordinating, becoming battlegroup operations.

We actually do this in damage control. Think about engineering redundancy. We make the system more complicated than it has to be by building redundant systems. Two of the same system with a series of crossovers and disconnects. If something leaks in the system, you can isolate a portion without fully shutting down the system. You may have some penalties to efficiency… Can only operate with a 50% flow rate for example… But the system keeps working and furthermore… The degradation makes the system simpler to troubleshoot and operate.

Shifting back to the more complex operations should be deliberate – just because you fixed one thing, doesn’t mean there aren’t secondary issues you will find when you bring it back up to a higher level of complexity… Open the isolation valves too fast and you end up with 100% flow against potential unknown issues you haven’t fixed yet.

Shane: Do we do enough to train the Information Warfare Community (IWC) in how to understand and potentially disrupt, degrade, deny or destroy complex systems like kill webs?

Ryan: We absolutely do not do enough to train the IWC in this.

That’s the issue with webs and warfare at the liminal edge. We believe that shooting a few multimillion-dollar surface-to-air missiles (SAM) at an ISR (Intelligence, Surveillance and Reconnaissance) drone is Distributed Maritime Operations (DMO), or that mixing up a CVW (Carrier Air Wing) composition enables more resiliency. Neither are webs – they are chains – and we think of warfare in a chain mentality.

The Navy fundamentally lacks the ability to see outside the cave and assess how cyber or info ops might result in degraded C2 from a geographic node in the web. Or space-based effects. Nor would JFMCCs (Joint Force Maritime Component Commanders) know how to employ that kind of stuff at the Fleet or AOR (Area of Responsibility) level.

Kurt: I’d argue no one in the USG (United States Government) can assess cyber ops, and no one in the US military (except maybe the bubbas at CYBERCOM) trusts that cyber can reliably deliver the right effect at the right time for cyber to be selected before a traditional kinetic option is selected.

Ryan: I think breaking up kill webs requires a truly joint effort. That’s not something we can expect our CSGs (Carrier Strike Groups) or single DDGs (Guided Missile Destroyers) to determine and execute alone.²

China, on the other hand, does this really well with its joint structure and national technical means. We would do well to think on how that can be broken down, and how a similar construct “with American characteristics” might be developed to serve attacking CASs.³

Kyle: So as much as I understand the concepts we’re applying here, what I struggle with is assigning the WHAT to invest in and WHO needs WHAT training.

Is this Fleet staffs? Individual ships or units? Everyone? What are we buying that would turn our chains into webs, and is that even feasible with our current acquisition process? Isn’t a “Cloud of Fire Control Data” sort of what you want?⁴

Kurt: I think the Navy can’t do it alone.

Alex: Kurt, because the Navy doesn’t have the assets to populate a web, i.e. “this is inherently ‘Joint’”? Or because culturally the Navy is resistant to implementing the organizational changes to exploit a web should it exist?

Ryan: Yes to both.

Kurt: Maybe I don’t know what you all mean by “web.” I’m assuming you mean some ultra-resilient system (like a mesh network) exists that can absorb significant damage before a worrisome degradation of capability occurs. And if that damage isn’t adequately and appropriately applied, the system would shrug off the attack with very little actual impact to the overall system.

Kurt: I said the Navy can’t do it alone because the Navy almost assuredly doesn’t have the fielded capabilities to unilaterally destroy what I (perhaps incorrectly) am assuming is a multi-domain, multi-modal “kill web.” Ignoring everything else, the way the DoD (Department of Defense) allocates (or doesn’t, as it happens) cyber forces would preclude Navy cyber from having a major role in fights occurring outside certain geographic boundaries.

Alex: Ah, the Navy can’t attack/dismantle an adversary kill web alone. I think there’s a parallel discussion about implementing and using blue force kill webs.

Louis: I like the term “ecosystem”, which brings to mind perhaps “full-spectrum ecological attack” or something. Military eco-collapse.

Matt: I think that the idea of webs will require different C2 and ways to think about sensor/weapon/target pairings. We can’t allocate a set of assets for a single or small set of targets unless we want a very narrow web.

I think that chains can be a way to test and think about paths through the web but they’ll always need to have follow-up analysis as a web. In aerospace engineering (AE) we used to have a joke that you know you’re an AE if you’ve represented a wing as a plane, the plane as a line, and the line as a point (to simplify the calculation). The same holds, you can test paths through the web exquisitely using analysis and might have to run exercises as exquisitely planned chains or small webs, but we should war game and do larger [Modelling and Simulation] as webs. I worry about webs though, if we need the always-on comms to make them work. That means they’re fragile and our already targeted comms infrastructure will be an easy point of failure.

Mike: In my mind, having multiple comms channels is part of what would make a web a ‘web’.

Matt: I agree, but those might be very low bandwidth or short duration or one way rather than bi-directional. We don’t just turn on “Uber comm path 1” and know that will link everyone with the same protocols.

Mike: It worked in World War Two, some of the naval battles in the Slot [during the Guadalcanal campaign] came down to real-time unencrypted bridge-to-bridge radio to share info across the Fleet rapidly.

Matt: Absolutely think of them as complex adaptive systems, but if you do then you have to accept that any given path through the web might not give you the same performance twice, or that the assets you thought you’d have for a given mission might be taken over by another use of the web. This is where the joint piece comes in, the bigger the web the more possible elements in any given role/roles.

Practicing as a web will be hard though. We need MMO (massively multiplayer online) wargaming that is always on. I should be able to log in with the rest of my DDG or on my own, and should be able to find other players online. We would also need some way to figure out classification so we could play red as accurately as possible without giving up methods and sources. We should have stats that we can download and leader boards and tutorial sessions that are always available. Have practice where people can try out ideas free of criticism and then have more serious competitions where people are graded and that data-rich environment is plumbed to find out how to do better.

This is going to be expensive, in terms of dollars and resources, but would be well worth it. We could/should have layers of detail, so someone can play a simpler game against others in one-on-one or play in very detailed, complex many-on-many battles.

Nick: Agreed, and you can’t benefit from AI-enabled reinforcement learning (think AlphaGo)⁵ to make tactical and operational recommendations without building a robust modeling and simulation environment first.

So, this virtual environment has to essentially mirror the real-world COP (Common Operational Picture), or at least have exposed APIs (Application Programming Interfaces) that are formatted similarly, so you can train a model in a simulated environment and immediately deploy that same model into an operational context.

Matt: Use peoples’ play to train the AI as you go?

Nick: You could do it either way. You have a rewards/punishments system, where let’s say people play China, and fight to China’s tactics and capabilities. A model can then iterate through millions of blue force responses to find the one with the highest probability of success and make those as recommendations. Another method is you can program China’s tactics in as broad rules, but have it be an AI instead of people, so the two models learn from each other. The reason for machine learning is that there are so many exponentially complex scenarios in the responses, you can’t necessarily try all of them every time, so you train a model that can estimate the best response without needing to try every scenario. That was the significance of the AlphaGo achievement in beating the world champion at Go.⁶

Once a model is trained though, you can deploy it to look at real-world situations and make recommendations for blue force actions based on its training in the simulated environment.
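The "iterate through blue force responses to find the one with the highest probability of success" step can be sketched as a simple Monte Carlo evaluation. This is a toy under loud assumptions: the response names and their underlying success probabilities are invented, and a single random draw stands in for what would really be a full run of a modeling and simulation environment against red tactics. It does not implement reinforcement learning itself, only the estimate-and-recommend loop that such a model would approximate.

```python
import random


def estimate_success(win_prob, trials, rng):
    """Run `trials` simulated engagements and return the observed win rate.

    A single Bernoulli draw is a placeholder for one full simulation run.
    """
    wins = sum(1 for _ in range(trials) if rng.random() < win_prob)
    return wins / trials


def recommend(responses, trials=5000, seed=0):
    """Estimate each response's success rate and return the best one."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    estimates = {
        name: estimate_success(prob, trials, rng)
        for name, prob in responses.items()
    }
    best = max(estimates, key=estimates.get)
    return best, estimates


# Hypothetical blue force responses with made-up true success probabilities.
responses = {
    "hold_position": 0.35,
    "disperse": 0.55,
    "screen_and_strike": 0.70,
}

best, estimates = recommend(responses)
print(best, estimates)
```

When the space of responses is too large to simulate exhaustively, a trained model takes the place of the `estimates` table, predicting success rates for responses it has not explicitly tried.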

*Figure 3 – Google’s AlphaGo beats Go master Lee Se-dol in 2016.*

Alex: I think this is the end state that LVC (Live, Virtual, Constructive) training should aim for. Imagine if off each coast there was a persistent virtual battlespace that you could “log” into, either with a ship, aircraft, or submarine (not sure how LVC is thought to work for subs, if at all).

Nick: Also, having no experience or interaction with JADC2 (Joint All-Domain Command and Control), would someone mind explaining how that initiative relates to the kill web concept?⁷ I was under the impression that the JADC2 initiative was trying to help solve that problem.

Shane: JADC2 should, in theory, give you the technical ability to link together currently non-compatible networks, sensors, and weapons systems. But from what I can gather JADC2 is pretty much only about that technical ability. There’s no attempt to get people to conceptualize attacking webs as opposed to just attacking a series of chains. That’s not really a technical thing, it’s more a doctrine, training, and TTPs (Tactics, Techniques, and Procedures) thing.

That half seems to be sorely lacking.

If you would like to join the conversation at the Naval Constellation, please email: [email protected].

Endnotes

1. “Military and Security Developments Involving the People’s Republic of China 2020, Annual Report to Congress.” Office of the Secretary of Defense. https://media.defense.gov/2020/Sep/01/2002488689/-1/-1/1/2020-DOD-CHINA-MILITARY-POWER-REPORT-FINAL.PDF

2. Captain Carmen Degeorge, U.S. Coast Guard, Commander Nathaniel Schick, U.S. Navy, and Lieutenant Colonel Jimmy Wilson and Majors Chad Buckel and Brian Jaquith, U.S. Marine Corps. “Naval Integration Requires A New Mind-Set”. 2021. U.S. Naval Institute. https://www.usni.org/magazines/proceedings/2021/october/naval-integration-requires-new-mind-set-0.

3. “Military and Security Developments Involving the People’s Republic of China 2020, Annual Report to Congress.” Office of the Secretary of Defense. https://media.defense.gov/2020/Sep/01/2002488689/-1/-1/1/2020-DOD-CHINA-MILITARY-POWER-REPORT-FINAL.PDF

4. Shelbourne, Mallory. 2020. “Navy’s ‘Project Overmatch’ Structure Aims To Accelerate Creating Naval Battle Network – USNI News”. USNI News. https://news.usni.org/2020/10/29/navys-project-overmatch-structure-aims-to-accelerate-creating-naval-battle-network.

5. “Google AI Defeats Human Go Champion”. 2017. BBC News. https://www.bbc.com/news/technology-40042581.

6. Ibid.

7. “Joint All-Domain Command and Control (JADC2)”. Congressional Research Service. July 1, 2021. https://crsreports.congress.gov/product/pdf/IF/IF11493

Featured Image: PHILIPPINE SEA (Jan. 22, 2022) An F-35C Lightning II, assigned to the “Black Knights” of Marine Fighter Attack Squadron (VMFA) 314, and an F/A-18E Super Hornet, assigned to the “Tophatters” of Strike Fighter Squadron (VFA) 14, fly over the Philippine Sea. (U.S. Navy photo by Mass Communication Specialist 2nd Class Haydn N. Smith)

Solving Communications Gaps in the Arctic with Balloons

August 23, 2021 Guest Author

Emerging Technologies Topic Week

By Walker D. Mills

Defined by their remoteness and extreme climate, the polar regions present an array of tactical and operational challenges to US forces as sea icing, repeated thawing and freezing cycles, permafrost, and frequent storms can complicate otherwise simple operations. However, often overlooked are the challenges to communications, which are critical to Navy and Coast Guard vessels operating in the polar regions. Perhaps once possible to ignore, these challenges are becoming more pressing as the Marines, Navy, and Coast Guard increase their operations at higher latitudes and place more emphasis on the arctic, and as more arguments are made for sending Marines and soldiers to the arctic for training and presence. In order for US naval forces to compete in the polar regions and fight if needed, the military needs to invest in persistent and reliable communications capabilities. One solution is high-altitude balloons.

Arctic experts have long understood the difficulty of communicating in the arctic, noting that “While communicating today might be easier than it was for Commodore Perry 111 years ago, it’s not that much better.” Arctic communications are especially difficult for a number of reasons. Satellite-based options are limited or nonexistent because the vast majority of satellites maintain equatorial orbits, which means the polar region’s extreme latitudes fall outside satellite range. Though a few satellites follow non-equatorial orbits, there are simply not enough to provide continuous connectivity at the bandwidth needed for modern operations.

There are also natural barriers to communications in the arctic. The ionosphere covering the polar regions has a high level of electron precipitation, which is the same characteristic that produces the Northern Lights. This precipitation interferes with and degrades the high-frequency (HF) radios that the military normally uses for long-range communications in the absence of satellites. Additionally, the extreme climate and cold weather in the arctic present another challenge to communications infrastructure such as antennas and ground stations. Arctic conditions make it harder to access and maintain ground arrays, batteries expire faster in colder temperatures, and equipment can easily be buried by falling snow and lost.

Finally, the near complete lack of civilian infrastructure complicates arctic communications. The polar regions comprise about eight percent of the earth’s surface, accounting for over 10 million square miles of land on which only about 4 million people live. Most are clustered in small communities, resulting in sparse commercial communications infrastructure across the region. However, persistent and reliable communications are absolutely essential for the successful employment of maritime forces in the arctic.

One solution is for naval forces to use high-altitude balloons that provide temporary communications capabilities. Balloons are far cheaper than satellites and much more responsive. They can be quickly deployed where coverage is needed and fitted with communications payloads specific to the mission. They are also low-cost and effective enough that they can be used not only in operations but also in training at austere locations.

Balloons offer a degree of flexibility critical for operations in remote environments like the arctic. Differently sized balloons can be fitted with specific capacities for mission-tailored requirements and priorities. The size of payload, loiter time, and capabilities are primarily a function of balloon size. Large balloons and stratospheric airships can stay aloft for months, while smaller “zero pressure” balloons might last hours or a few days. Given their diverse uses and capabilities, high-altitude balloons have already been used to provide communications in hard-to-access environments by organizations such as NASA, the US Air Force, and Google. For example, researchers at the Southwest Research Institute and NASA have supported atmospheric balloon flights over the poles that lasted up to a month – more than enough time to meet operational needs.
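Part of the appeal of a high-altitude relay is simple geometry: the higher the antenna, the farther its line of sight to the horizon. The sketch below is a rough illustration only. The 20 km balloon altitude and the 30 m mast height are assumptions chosen for the example, not figures from this article, and the calculation is purely geometric, ignoring atmospheric refraction, terrain, and link budget.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def horizon_distance_km(height_km):
    """Geometric line-of-sight distance to the horizon from a given height.

    Derived from the right triangle formed by the Earth's center, the
    observer, and the horizon point: d = sqrt(2*R*h + h^2).
    """
    return math.sqrt(2.0 * EARTH_RADIUS_KM * height_km + height_km ** 2)


# Assumed heights for illustration only.
balloon_altitude_km = 20.0   # hypothetical stratospheric balloon
ship_mast_height_km = 0.030  # hypothetical 30 m shipboard antenna

balloon_horizon = horizon_distance_km(balloon_altitude_km)
mast_horizon = horizon_distance_km(ship_mast_height_km)

print(f"Balloon horizon: {balloon_horizon:.0f} km")
print(f"Ship mast horizon: {mast_horizon:.1f} km")
```

Under these assumptions the balloon sees roughly 500 km to the horizon against roughly 20 km for the mast, which is why a single balloon could in principle relay between units spread over a wide area.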

Though there are various ways to launch and lift high-altitude balloons, recent advances show that hydrogen gas is the best candidate. Researchers at the Massachusetts Institute of Technology’s Lincoln Laboratory recently discovered a new way to generate hydrogen with aluminum and water. With this new ‘MIT process,’ researchers have already demonstrated the ability to fill atmospheric balloons with hydrogen in just minutes – a fraction of the time it takes using other methods. The MIT process promises to be not just faster, but also cheaper and safer than other methods of hydrogen generation. It also means that units can generate hydrogen at the point of use – obviating the need to store or transport the volatile gas or other compressed gasses. The researchers have demonstrated effective hydrogen generation with scrap and recycled aluminum and with non-purified water including coffee, urine, and seawater.

The deployment of balloons utilizing this new hydrogen generation process would be extremely simple. A balloon system could conceivably be developed where the system is simply dropped into the ocean from a ship, airplane, or helicopter with a mechanism that causes it to self-deploy when it comes into contact with seawater. This single system – one that does not require stores of compressed gas or an electrolyzer to generate hydrogen – would also take up far less space than other balloons and the associated equipment required to get them aloft. Balloons full of hydrogen gas could also act as giant batteries as the hydrogen can also be used to power communications equipment or sensors.

So far, the US Coast Guard has been leading the way with arctic communications. The service has highlighted improving communications in the arctic as part of their first line of effort in the 2019 Arctic Strategic Outlook and as a key initiative in their 2015 Arctic Strategy Implementation Plan. Along with the Marine Corps, the service has also been experimenting with Lockheed’s Mobile User Objective System (MUOS), a next-generation satellite communication constellation intended to replace the constellation that the Pentagon relies on today. But even the system’s creators are clear that in extreme polar regions, MUOS may only offer eight hours of coverage per day. Constellations of small and cheap cube satellites might also be a partial fix for the communications dead zones, but hundreds or thousands would be required to cover a region as large as the arctic. The Army and the Air Force are also interested and intend to invest $50 million each toward arctic communications. The Army has previously experimented with using high-altitude balloons to support multi-domain operations and might be a key partner in developing an arctic communications capability, and the Air Force is looking at using commercial broadband satellites to meet service and joint communications needs in the arctic.

Communications issues are a consequence of the polar operating environment and an obstacle for the military services operating there. But just because the environment is difficult does not mean that US forces have to go without persistent and reliable communications. High-altitude balloons could plug the communications gap not just for maritime forces but also for the Army and special operations units operating in these extreme latitudes. Developing and deploying high-altitude communications balloons, lifted by hydrogen gas generated by the MIT process, offers near-term capability for US forces operating in polar regions with underdeveloped communications infrastructure.

Walker D. Mills is a U.S. Marine Corps officer serving as an exchange officer in Cartagena, Colombia, the 2021 Military Fellow with Young Professionals in Foreign Policy, a non-resident WSD-Handa Fellow at Pacific Forum, and a Non-Resident Fellow with the Brute Krulak Center for Innovation and Future War.

The views expressed are his alone and do not represent the United States government, the Colombian government, the United States military, or the United States Marine Corps.

Feature Image: A NASA long duration balloon is prepared for launch on Antarctica’s Ross Ice Shelf near McMurdo Station in 2004. (NASA photo)

Center for International Maritime Security

Fostering the Discussion on Securing the Seas.