This section summarizes the challenges that MNOs may face when deploying and developing the O-RAN architecture, along with the future research directions that respond to those challenges. We divide this section into two parts. First, we discuss issues regarding the O-RAN architecture building blocks, covering the design and implementation challenges and future research directions for each of them. After that, we discuss issues concerning the Open RAN field as a whole.
6.1. O-RAN Alliance Architecture Issues
In this subsection, we discuss issues regarding the O-RAN Alliance architecture and its implementation in OSC. A problem common to all O-RAN building blocks is that the O-RAN Alliance has not yet finished writing the standard specifications. We have mentioned several nodes whose specifications are still incomplete in
Section 4, which are SMO’s Non-RT RIC ML, O-FHI M-Plane, O1 interface, and O2 interface. OSC’s code development for the currently available specifications is also not finished yet, as mentioned in
Section 4. Some of them, like the O-Cloud, have not started or have only just started. Some of them are postponed so that OSC can focus on developing other nodes, such as the O-CU and the O-DU MAC SCH. Others are waiting for the official specification release, such as the M-Plane.
SMO has an issue regarding copyright with 3GPP. Since some of the O-RAN specifications also reference 3GPP specifications, the occurrence of this issue is inevitable. Currently, this copyright issue focuses on the OAM part. However, O-RAN Alliance WG1 also plans to resolve this issue on the overall O-RAN architecture level. Besides these implementation problems, SMO also faces some design challenges and future research directions that we will explain in the following paragraphs.
Network densification makes the management and optimization of neighboring cell relationships more complex. O-RAN-based proactive automatic neighbor relations (ANR) optimization is a well-known application of SON that is used to manage neighbor cell relationships, which are kept in the neighbor cell relation table (NCRT) [
69]. An ANR technique that identifies the trends before handover failure, marks the cells for handover prohibition, and identifies time-based trends to remove cells from the NCRT and add them back is proposed by [
69]. The ML-embedded ANR model is made as an rApp. The rApp has tasks to give several data updates to its neighbor cell. These data will also be used by rApp to improve default cell removal and prohibit any possible handover policies by generating a new policy. The rApp will send the new policy to xApp in Near-RT RIC. After that, the xApp executes the policy on Open RAN network nodes. This design will minimize the handover failure, reduce the operational cost, and thus increase the network’s performance and QoS.
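The rApp-side decision described above can be illustrated with a minimal sketch. It assumes a simple failure-rate threshold in place of the ML model, and every name in it (`NeighborStats`, `Ncrt`, `build_policy`, the thresholds, and the policy keys) is hypothetical rather than taken from the cited work or from any O-RAN specification.

```python
from dataclasses import dataclass, field


@dataclass
class NeighborStats:
    """Handover counters for one neighbor cell (hypothetical structure)."""
    attempts: int = 0
    failures: int = 0

    @property
    def failure_rate(self) -> float:
        return self.failures / self.attempts if self.attempts else 0.0


@dataclass
class Ncrt:
    """Toy neighbor cell relation table for a single serving cell."""
    neighbors: dict = field(default_factory=dict)

    def record(self, cell_id: str, failed: bool) -> None:
        stats = self.neighbors.setdefault(cell_id, NeighborStats())
        stats.attempts += 1
        stats.failures += int(failed)


def build_policy(ncrt: Ncrt, max_failure_rate: float = 0.3,
                 min_attempts: int = 5) -> dict:
    """Return a policy listing which neighbors to prohibit for handover.

    A fixed threshold stands in for the ML model; neighbors with too few
    samples are left untouched.
    """
    prohibited = sorted(
        cell_id
        for cell_id, stats in ncrt.neighbors.items()
        if stats.attempts >= min_attempts
        and stats.failure_rate > max_failure_rate
    )
    allowed = sorted(set(ncrt.neighbors) - set(prohibited))
    return {"prohibit_handover": prohibited, "allow_handover": allowed}


ncrt = Ncrt()
for failed in [True, True, False, True, True, False]:
    ncrt.record("cell-B", failed)
for failed in [False, False, False, False, True, False]:
    ncrt.record("cell-C", failed)
ncrt.record("cell-D", True)  # too few samples to judge

policy = build_policy(ncrt)
print(policy)
```

In this sketch the resulting dictionary plays the role of the new policy that the rApp would hand to an xApp; how such a policy is actually encoded and transported is outside the scope of the example.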
One of the problems in network optimization/automation is that even though the objective is clear, a practical way is still needed to select which intelligence model should be deployed, where to deploy it, and which RAN parameters to control. An rApp called OrchestRAN is proposed [
70]. The OrchestRAN framework aims to solve this. MNOs can directly specify high-level control/inference objectives, and OrchestRAN automatically computes the optimal set of algorithms and where to execute them. The OrchestRAN prototype was tested on Colosseum. Experiment results show that OrchestRAN can instantiate data-driven services on demand at different network nodes and time scales, with minimal control overhead and latency.
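To make the selection problem concrete, the following toy sketch picks one model and one execution node per requested task by exhaustive search. It is not the optimization used by OrchestRAN; the model catalog, node list, latency figures, and the `place` function are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical catalog: task served and CPU cores needed by each model.
MODELS = {
    "traffic_forecast_small": {"task": "forecast", "cores": 1},
    "traffic_forecast_large": {"task": "forecast", "cores": 4},
    "slice_control_drl": {"task": "slicing", "cores": 2},
}

# Hypothetical nodes: control-loop latency they can offer (ms), free cores.
NODES = {
    "near_rt_ric": {"latency_ms": 10, "cores": 4},
    "non_rt_ric": {"latency_ms": 1000, "cores": 16},
}


def place(requests):
    """Pick one (model, node) pair per request within node capacity.

    requests: list of (task, max_latency_ms) tuples.
    Returns the feasible assignment with the lowest total core usage,
    or None if no feasible assignment exists.
    """
    options = []
    for task, max_latency in requests:
        options.append([
            (m, n)
            for m, n in product(MODELS, NODES)
            if MODELS[m]["task"] == task
            and NODES[n]["latency_ms"] <= max_latency
        ])
    best, best_cost = None, None
    for combo in product(*options):
        used = {}
        for m, n in combo:
            used[n] = used.get(n, 0) + MODELS[m]["cores"]
        if any(used[n] > NODES[n]["cores"] for n in used):
            continue  # some node would be over-committed
        cost = sum(used.values())
        if best_cost is None or cost < best_cost:
            best, best_cost = combo, cost
    return best


plan = place([("slicing", 10), ("forecast", 2000)])
print(plan)
```

Exhaustive search is only workable for a handful of models and nodes; it is used here because it keeps the feasibility checks (latency and capacity) easy to read.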
Another method proposed by researchers for SMO is to improve the RRM. The proposed intelligent RRM scheme is called cell splitting. Cell splitting is applied by [
71], where it is integrated into the Open RAN architecture to prevent cell congestion. A long short-term memory (LSTM) recurrent neural network (RNN) is used to learn and predict the traffic patterns that could arise in a real-world cellular network in a densely populated area. From these traffic patterns, the LSTM detects any cell that may become congested. The LSTM model is trained at the Non-RT RIC using long-term data gathered from the RAN. This cell-splitting architecture has several open problems that need to be solved in the future, such as the dynamic capability and flexibility of the model, the complex inference model, security risks, the requirement of extreme data rate, and the interoperability of multi-vendor elements.
Near-RT RIC’s current implementation of the TS xApp relies on the currently available AD and QP xApps. TS is critical in the RAN. Better ML models with higher accuracy should be applied in the AD and QP xApps, which will improve the overall TS xApp performance. Another problem arises from the design of the present O-RAN RIC. The RIC cannot understand the connections between the same subscriber or other UE. Instead, the RIC has only high-level insight into a connection. This increases the risk of false-positive alarms/actions, especially without IMSI/IMEI correlation across connections. Some more suggestions are also raised by [
72] to improve xApp and RIC platform. These suggestions include a smaller xApp package, support for time-series handling, support for the message queue, scaling out method, more AI technologies (online training, reinforcement learning (RL), and federated learning), and hardware acceleration.
O-RAN’s RIC consumes a non-negligible amount of resources. Additionally, O-RAN’s RIC has forward compatibility limitations because it is coupled to specific implementations, such as Redis or Prometheus databases. As a result, applications must poll these databases frequently to find new agents. The work in [
73] introduces FlexRIC to answer these problems. FlexRIC is a software development kit developed by OAI to build service-oriented controllers. It is made up of a server library and an agent library. FlexRIC has two innovative service models with proof-of-concept prototypes for RAN slicing and traffic control. Results show that FlexRIC uses 83%.
Several existing studies relate to Near-RT RIC xApps. A method that enables fault-tolerant xApps in the RIC platform, known as RIC fault-tolerant (RFT), is proposed by [
74]. The RFT is deployed so that the Near-RT RIC is more tolerant towards faults, thus preserving high scalability. Results show that the RFT can meet latency and throughput requirements as replicas increase [
74]. Another xApp called NexRAN is proposed by [
75]. NexRAN is an open-source xApp that can perform closed-loop controlled resource slicing in the O-RAN ecosystem. NexRAN is executed on O-RAN Alliance’s RIC to control the RAN slicing in srsRAN. NexRAN testing is performed in the POWDER research platform. Testing results show that NexRAN can successfully perform the RAN slicing use case. An RRM xApp based on RL is proposed by [
76]. The xApp dynamically adapts the per-flow resource allocation, modulation, and coding scheme to meet the traffic flow KPI requirements based on the network status. The xApp testing is performed on OAI LTE RAN in a small laboratory setup. It is necessary to conduct future testing to determine the performance in large-scale, heterogeneous, and non-stationary scenarios.
The O-DU, or the DU in general, is one of the most discussed topics among wireless mobile communications enthusiasts. In
Section 5, we have mentioned that O-DU is responsible for L2 and some L1 processing. It is also noted that accelerators are included in the O-DU specifications. Back to
Section 2, we also have pointed out that through NFV, the operator can use a virtual DU (vDU). However, there is currently a problem with the DU implementation. Since vDU is deployed in a COTS server, most vDUs rely on x86 architecture. This is because most COTS servers still use the x86 architecture. The problem is that x86 architecture cannot meet the O-DU requirements under special cases like meeting the numerology 4’s 1-microsecond latency requirement and handling heavy UE traffic load. In these cases, the x86-based processing can get extra support from accelerators.
Regarding accelerators, O-RAN Alliance has already defined some specifications for the use of accelerators in the new documentation. In [
62], the terms lookaside and inline accelerators are explained. However, these specifications still lack the details needed for standardization. The method to access or utilize those accelerators, the accelerator’s API design and usage, and the definition of whether the accelerator is “dedicated or not” are not defined yet by O-RAN. Since accelerators are used in O-DU low, and Intel is the company responsible for developing OSC’s O-DU, the only accelerator currently on the market that is compatible with the O-RAN code reference is the field-programmable gate array (FPGA) by Intel. Nevertheless, some vendors implement the DU independent of Intel, such as using Nvidia graphics processing unit (GPU) and ** the Open RAN framework compared to the other Open RAN projects. O-RAN Alliance has also provided detailed reference design specifications, complete with very clear documentation for each reference design. Even though O-RAN has provided such clear documentation, the alliance lacks robust, deployable, and well-documented software. Most of these O-RAN frameworks cannot be used in actual, real-life networks [
4]. There are several reasons for this: the open source components proposed by O-RAN Alliance are incomplete, require additional integration, require further development for actual deployment, or contain components that lack robustness.
The difficulty of keeping pace with recent architecture standards is also a challenge. The cellular network community feels pressured because it has to keep up with specifications and technologies that have only recently been developed in new communication, networking, and programming standards [
4]. The most widely known examples are the NR and the mmWave architecture, which were introduced as 5G enablers by 3GPP. Both of these architectures are currently being deployed in closed-source commercial networks. The current problem with these developments is that the RAN software libraries (OAI-RAN and srsRAN) still need to be fully developed to support NR because these developments require further coding and compatibility testing. The same thing goes for mmWave, where testing the mmWave architecture for 5G technology is still impossible because of complexity. This complexity is caused by the lack of accessible open hardware for the mmWave software to run, and it needs several beam management tests for the software to run properly.
With the problems above, MNOs face dilemmas if they migrate their current RAN assets to Open RAN deployments. These decision points are summarized by [
88]. The first question is whether MNOs should use the current Open RAN standard specifications, which still need to be completed, as we have mentioned, to open up the interfaces in vRAN or wait for the specifications to be more mature. The next question is whether MNOs should introduce Open RAN first in smaller networks before targeting macro RAN. While there is no simple answer and each operator will take different approaches, AT&T recommends that Open RAN implementation be introduced in incremental modules [
44]. MNOs can introduce Open RAN modules into their architecture bit by bit before changing the whole RAN architecture to Open RAN. A survey shows that small cells will be the initial focus of the vRAN deployment by MNOs [
88]. Testing out in a localized network first before going to the main macro RAN would be a wise resolution [
44,
88].
The other question related to keeping pace with the Open RAN standard is whether MNOs should migrate their RAN assets to disaggregated and virtualized options simultaneously [
88]. The counterargument is that it is less risky to successfully disaggregate the BBU first before considering cloud deployment. While the benefits of using both disaggregation and virtualization concurrently are attractive, the survey shows that 25% of MNOs will implement them separately at first [
88].
Through openness, Open RAN has unlocked the possibility of interoperability amongst different vendors. Although economically beneficial, this opportunity poses new challenges. Different vendors will use a wide range of components. With such heterogeneous components, it would be difficult to locate the bottlenecks that reduce the overall performance. In addition to the multiple vendor challenge, the requirement that Open RAN should also be able to interact with legacy 4G equipment makes the challenge even greater.
Some questions are also raised regarding the multivendor environment in Open RAN. When the RAN experiences a problem, how do MNOs decide which vendor should solve it? What is needed and how the operator decides in a multivendor environment remain open. There is also a concern about whether MNOs should directly deploy multivendor Open RAN or start working with a single vendor before reorganizing the supply chain to a multivendor environment at a future stage. A survey shows that about 40% of MNOs expect to work with only one to two established vendors at the beginning [
88].
The integration challenge is the next issue relating to Open RAN implementation. Ericsson has expressed concern about Open RAN’s hardware integration [
16]. With the complex multivendor ecosystem Open RAN offers, MNOs ranked integration issues as the second major obstacle with a 55% vote [
89]. We know that openness has become a fundamental pillar of Open RAN from the very start [
15], where in an Open RAN architecture, hardware and software come from many different vendors, thus making it a multivendor architecture. Because the architecture involves components from different vendors, every service provider and mobile operator has to ensure that their networks continue to operate smoothly with all these different parties in their network [
16].
There are four different Open RAN system integration models to tackle this challenge [
16]. In the first model, the MNO performs the integration, just like Rakuten and Vodafone did. However, this approach requires the MNO to possess great in-house expertise. In the second model, the integration is carried out by the service provider. This model would perform well if the service provider has extensive experience working with multiple resources. The third model assigns the integration role to the hardware or software vendor. Fujitsu shows an example of this, providing support in RU to DU integration for Dish [
16]. Even so, it would be a challenge to ask vendors to integrate on behalf of their competitors [
16].
The use of a service from a system integrator, a new actor, is the last model. The system integrator is a third party not associated with a certain hardware or software vendor, yet that works closely with the vendors to ensure their heterogeneous products work together [
16]. Telefonica has provided an example by using services from Everis, a Spanish system integrator [
16]. The appearance of this new role opens fresh business opportunities for companies who want to join the ecosystem. NEC, currently the most engaged system integrator, and others have sensibly taken this opportunity [
90].
The RAN needs a new approach to guarantee that all MNOs can operate their networks smoothly in an Open RAN. This approach should prioritize a software-driven and open-minded ecosystem from hardware vendors, software vendors, system integrators, tower companies, real estate owners, regulators, industry bodies, and MNOs. The integration of Open RAN needs to be built for a software-centric world. In this software-centric world, each software component talks to all physical components to deliver scalability and innovation and change how open networks are integrated. This approach will be mostly focused on Open RAN software, where the software will make those components smarter and interoperable. This approach will also help those components be integrated and maintained remotely by using a software upgrade, so MNOs and vendors no longer need to climb network towers. To integrate Open RAN, two levels of integration need to be carried out: Open RAN ecosystem integration and Open RAN software system integration. Ecosystem integration means the system integrator will be responsible for integrating the entire architecture, including open radios and BBU software. Meanwhile, software system integration is a process that mainly focuses on COTS hardware. In this integration, Open RAN can achieve its automation by using the same DevOps tools and the same CI/CD principles, thus simplifying the integration.
Integration and interoperability are critical challenges in Open RAN. Significant effort and collaboration are needed to test the Open RAN system. Meta mentioned that a standard development environment, standard optimization metrics, and standard test and validation methodologies are required to unlock the full potential of Open RAN [
44]. This is where TIP, O-RAN Alliance, and other Open RAN standard organizations play an important role in helping mitigate this challenge. Testing methodologies, testing centers, WGs, and plugfests have been created, as discussed in
Section 4. Future research should be undertaken in this direction. OSC offers standardization labs called OSC Community Labs [
44]. Further information on the OSC Community Lab will be explained in
Section 7.
A pipeline for designing, training, testing, and experimental evaluation of DRL-based control loops in Open RAN was proposed by [
91]. The proposed pipeline is called ColO-RAN. ColO-RAN addresses the difficulty of developing and adopting DRL in real networks, which is caused by the unavailability of large-scale datasets and experimental testing infrastructure. ColO-RAN enables ML research using O-RAN components. The capabilities of ColO-RAN were showcased on the Colosseum and Arena testbeds. Performance evaluation of ColO-RAN was performed by developing three xApps to control the RAN slicing, scheduling, and online training of ML models. ColO-RAN and its related dataset will be publicly available to research communities worldwide.
The last issue relating to Open RAN implementation is to cross-examine Open RAN’s credibility in the promise of a multivendor environment and lower entry barriers for new vendors. Most leading Open RAN software companies have based their products on FlexRAN, which is x86 dependent. Fortunately, the Open RAN ecosystem is expected to become more diverse. Nvidia’s BlueField-3 Arm-based data processing unit will compete against x86 processors. We have also mentioned several other competitors in the previous subsection. Amazon has already used ARM processors in some of its COTS, and other companies will presumably follow soon.
As with any other technology, the cellular network market and industry will try every possible solution that gives them better performance and lower prices. ARM claims that its server platform would provide up to 60% lower upfront infrastructure costs, up to 35% lower ongoing infrastructure costs, and up to 80% cloud infrastructure cost savings compared to traditional server [
78]. Given this difference between ARM and traditional servers, which we infer to be x86-based, ARM could become more popular. Furthermore, Arm has officially joined the 5G Open RAN policy coalition. Eventually, ARM vendors such as NXP and Qualcomm would emerge. Nevertheless, ARM products are presently more custom-built and are not general processors. In addition, Arm still lacks flexible industrial-grade virtualization capability that rivals that of x86.
Improving the performance and cost is the common theme of the evolution process of the RAN, as we have explained in
Section 1. This improvement process continues endlessly, including the current Open RAN era. One of the problems related to improving O-RAN and Open RAN performance is the resource scheduling process. The resource scheduling process is heavily related to network slicing. We have learned that network functionality is abstracted from its hardware and software components [
4]. This abstraction leads to the formation of slices, and each slice is given a set of functions, services, and system resources, making it an independent, dynamically created logical entity that can support UEs with multiple needs [
92]. However, each slice has been forced to carry out dynamic management due to increased demands. The management process has become critical and more challenging, especially regarding resource scheduling. Because of this reason, the slice needs new mechanisms that need to be developed, which requires a more optimized way of allocating resources.
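As a simple illustration of what allocating a shared resource among slices can look like, the sketch below applies max-min fairness, a textbook allocation rule. It is a stand-in chosen for clarity, not a mechanism proposed in the works cited here, and the slice names, demands, and the `max_min_fair` function are hypothetical.

```python
def max_min_fair(capacity, demands):
    """Max-min fair split of `capacity` among slices with given demands.

    Slices that need less than an equal share are fully served first;
    whatever they leave unused is shared equally among the rest.
    """
    alloc = {s: 0.0 for s in demands}
    remaining = float(capacity)
    active = {s for s, d in demands.items() if d > 0}
    while active and remaining > 1e-9:
        share = remaining / len(active)
        satisfied = {s for s in active if demands[s] - alloc[s] <= share}
        if not satisfied:
            # Nobody can be fully served: split what is left equally.
            for s in active:
                alloc[s] += share
            remaining = 0.0
            break
        for s in satisfied:
            remaining -= demands[s] - alloc[s]
            alloc[s] = float(demands[s])
        active -= satisfied
    return alloc


# Hypothetical demands, e.g. in physical resource blocks.
demands = {"embb": 70, "urllc": 20, "mmtc": 30}
allocation = max_min_fair(100, demands)
print(allocation)
```

With 100 units available, the two smaller slices receive their full demand and the largest slice receives the remainder. Rerunning the function whenever demands change gives a crude form of the dynamic management discussed above.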
There has been a lot of research that discusses improving resource scheduling and network slicing processes in Open RAN. Recent research by [
92] provided a new algorithm for 5G slices called the dynamic scaling multi-slice-in-slice-connected user equipment services for system resource optimized scheduling (DMUSO) algorithm. As its name suggests, the DMUSO algorithm uses a concept where, in a UE, slices are connected, so this concept is called multi-slice-in-slice connected UEs. This concept connects slices to services and resources to slices, not among slices in a UE. The DMUSO algorithm can improve the 5G system performance by learning the service demands, data rates, resources, bandwidths, efficiency, transmission rates, and channel-related SLAs for a user equipment’s specific level in network slicing. Results showed that DMUSO achieves efficient and optimized system resource scheduling, with significant performance gains from 4.4 times to 7.7 times compared to other methods and algorithms. Further research related to DMUSO is still open, and [
92] still has to analyze the effect of varying user equipment mobility on resource scheduling across slices in the future.
The potential significance of AI/ML in O-RAN 5G is believed to lie in its capacity to enhance system performance and improve the overall user experience. One paper discusses the AI/ML workflow by providing details and illustrating the workflow of AI/ML in an O-RAN-based 5G system architecture. The authors present a specific use case focusing on cell load prediction, in which a model is trained using the LSTM algorithm and integrated into xApps on the Near-RT RIC. The study shows enhanced performance in predicting the periodic traffic variations [
93]. However, the workflow implementation remains incomplete, with training data still requiring offline processing. A comprehensive implementation of the entire workflow is essential to demonstrate the effectiveness of AI/ML methodologies in closed-loop control for optimizing network performance within the O-RAN 5G framework.
There is also a resource allocation improvement issue in Open RAN. A resource allocation optimization model that can minimize the cost of updating a RAN infrastructure is proposed by [
13]. The model allows the architecture to have a hybrid combination of different hardware and software generations by considering costs involving links, cell sites, and the capacity limit of RAN resources. Another method proposed to solve this resource allocation problem is hierarchical orchestration. Hierarchical orchestration is proposed to address over-simplified resource allocation and limited support for different network segments for the current one-size-fits-all orchestrators in E2E networks [
94]. The E2E network is divided into three segments: the RAN segment, the transport network segment, and the core network segment. Every segment has its own distributed orchestrator. These distributed specialized orchestrators enable independent management of each segment.
Hierarchical orchestration also introduces a hyperstrator, a higher-level orchestrator to coordinate the distributed orchestrators and deploy the slicing process across multiple network segments. The hyperstrator is a central point, interacting with the whole E2E network. Therefore, hyperstrator has many tasks related to its orchestrators; one of them is to ensure cohesive performance across segments and slices to guarantee consistent QoS of the network slicing. From this proposed architecture, experiment results show that the hierarchical orchestration approach can leverage the capabilities of existing orchestrators and their communities to achieve a remarkable resource allocation in E2E networks. However, this research still needs further studies. Future research should be directed towards identifying requirements, classes, and relations of ontology for hierarchical orchestration protocols. Also, future research should define a systematic translation, make a model, and evaluate and assess the new hierarchical orchestration model.
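The coordination role of the hyperstrator can be sketched as follows. This is a minimal illustration that assumes each segment orchestrator only tracks bandwidth; the class names, method names, and capacities are hypothetical and do not come from the cited work.

```python
class SegmentOrchestrator:
    """Manages capacity for one segment (RAN, transport, or core)."""

    def __init__(self, name: str, capacity_mbps: float):
        self.name = name
        self.free_mbps = capacity_mbps
        self.slices = {}

    def reserve(self, slice_id: str, mbps: float) -> bool:
        if mbps > self.free_mbps:
            return False
        self.free_mbps -= mbps
        self.slices[slice_id] = mbps
        return True

    def release(self, slice_id: str) -> None:
        self.free_mbps += self.slices.pop(slice_id, 0.0)


class Hyperstrator:
    """Coordinates segment orchestrators to deploy an E2E slice."""

    def __init__(self, orchestrators):
        self.orchestrators = list(orchestrators)

    def deploy_slice(self, slice_id: str, mbps: float) -> bool:
        reserved = []
        for orch in self.orchestrators:
            if orch.reserve(slice_id, mbps):
                reserved.append(orch)
            else:
                # Roll back so no segment holds a partial slice.
                for done in reserved:
                    done.release(slice_id)
                return False
        return True


ran = SegmentOrchestrator("ran", 100)
transport = SegmentOrchestrator("transport", 80)
core = SegmentOrchestrator("core", 200)
hyper = Hyperstrator([ran, transport, core])

first = hyper.deploy_slice("embb-1", 60)   # fits in every segment
second = hyper.deploy_slice("embb-2", 30)  # transport cannot fit it
print(first, second, ran.free_mbps, transport.free_mbps, core.free_mbps)
```

The rollback in `deploy_slice` is one possible way to keep segments consistent when a slice is rejected partway through; each orchestrator still manages its own segment independently, as in the architecture described above.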
A low-complexity, closed-loop control system for Open-RAN architectures to support drone-sourced video streaming applications is proposed by [
95]. Flying drones have a higher likelihood of line-of-sight propagation than UEs, which can lead to performance degradation in high data rate transmission. The control system jointly optimizes the drone’s location in space and its transmission directionality to minimize its uplink interference impact on the network. The proposed system was prototyped and tested in a dedicated outdoor multi-cell RAN testbed. Numerical simulations are also used to evaluate the system. Results show that the control scheme achieves an average 19% network capacity gain compared to traditional BS-constrained control solutions.
Building a 5G network that is energy efficient is also an important issue. Increased network capacity, geographical coverage, and increased traffic demands require network densification and will lead to more energy consumption. The negative impact of exhaustive energy consumption will not only degrade business profits but also impact the environment. State-of-the-art applications of ML techniques used in the 5G RAN to enable energy efficiency are reviewed by [
96]. Recent research focuses on the functional split technique for green Open RAN [
97]. An RL-based dynamic function splitting (RLDFS) technique that decides on the dynamic function splits among DUs and the CU in an Open RAN system to make the best use of renewable energy source (RES) supply and minimize operator costs was proposed. Performance evaluation was performed using a real data set of solar irradiation and traffic rate variations. Results showed that the proposed RLDFS method effectively uses renewable energy and is cost-efficient.
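The idea of learning which function split to use under varying renewable supply can be illustrated with a very small tabular learner. This sketch is not the RLDFS method itself: the two states, the two actions, the cost table, and the one-step update without bootstrapping are all simplifying assumptions made for illustration.

```python
import random

STATES = ["solar_high", "solar_low"]
ACTIONS = ["process_at_du", "process_at_cu"]

# Hypothetical operator cost for each (state, action) pair.
COST = {
    ("solar_high", "process_at_du"): 1.0,
    ("solar_high", "process_at_cu"): 3.0,
    ("solar_low", "process_at_du"): 4.0,
    ("solar_low", "process_at_cu"): 1.5,
}


def train(steps: int = 400, alpha: float = 0.5,
          epsilon: float = 0.1, seed: int = 0):
    """Learn action values with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for step in range(steps):
        state = STATES[step % len(STATES)]  # alternate between states
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        reward = -COST[(state, action)]
        # One-step update without bootstrapping: in this toy setting the
        # next state does not depend on the chosen split.
        q[(state, action)] += alpha * (reward - q[(state, action)])
    return q


q_table = train()
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in STATES}
print(policy)
```

Under the assumed cost table, the learner ends up preferring DU-side processing when solar supply is high and CU-side processing when it is low. A realistic agent would use measured solar irradiation and traffic traces as input, as in the evaluation described above.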
Other researchers have also proposed F-RAN, or fog-computing-based RAN, to improve the RAN’s performance and cost. This is another type of RAN whose architecture is based on fog computing. Fog computing is an alternative to cloud computing that puts a substantial amount of storage, communication, control, configuration, measurement, and management at the edge of a network [
98]. This fog computing can be applied to C-RAN to alleviate the constraints of FH and high computing capabilities in BBU pool [
10]. F-RAN has several advantages, such as rapid and affordable scaling. This makes the F-RAN architecture much more adaptive to dynamic traffic and radio environments. Even though F-RAN is considered an excellent solution for covering C-RAN’s weaknesses, it still has some challenges and open problems.
Another method researchers propose to improve RAN’s performance and cost is Air-Ground Integrated Mobile Edge Networks (AGMEN). AGMEN is proposed by [
99] to integrate UAV-assisted network densification. In AGMEN, multiple drone cells are deployed flexibly to provide agile RAN coverage for the temporally and spatially changing users and data traffic. There are also several critical components in AGMEN, such as multi-access RAN with drone cells and UAV-assisted edge caching. However, AGMEN still has many challenges related to the heterogeneity and dynamic nature of architectural devices. Further research for AGMEN relates to mobile routing, multi-dimension channels, and UAV scheduling.
The use of digital twin (DT) technology has the potential to enhance the principles of intelligence, autonomy, and openness within an O-RAN-based system. As detailed in a study by [
100], this application of DT technology is used to explore possible scenarios and train AI/ML models that can address all potential corner cases in a real-world setting. It will play a crucial role in achieving the predefined 6G KPIs.
The other method proposed by [
101] to improve RAN’s performance and cost is PlaceRAN. PlaceRAN focuses on minimizing computing resources and maximizing the aggregation of radio functions to optimize the placement of radio functions in virtual NG-RAN planning. PlaceRAN uses a disaggregated RAN combination (DRC) concept and multi-stage problem formulation, thus enabling the management of units and protocols as a set of radio functions. The result achieved by [
101] showed that PlaceRAN can reduce the number of DRCs in the network by taking into account functional split options, one-way tolerated latency, cross-haul bandwidth, and a bandwidth derived from an analysis conducted in a project mentioned in the study.
Secure Open RAN is the next widely discussed challenge. Every network should be deployed as securely as possible, and Open RAN and the 5G network are no exception. As users of network connectivity, we do not want to experience any fraud or data theft that threatens our safety. Therefore, a secure Open RAN should be seen as a big challenge that deserves further research.
Open RAN promotes the advantages of its disaggregation and interoperable pillars. On the other hand, these pillars also introduce new challenges regarding security. Disaggregation and virtualization are the themes of Open RAN. However, decoupling the software from the hardware expands the security threat surface [
102]. Major virtualization technologies are vulnerable to security attacks, including MEC, SDN, NFV, network slicing, and cloud [
103]. O-RAN Alliance’s SFG has noticed that decoupling, containerization, and virtualization vulnerabilities can be exploited by threat actors [
104]. Other works have also mentioned some of the vulnerabilities relating to disaggregation, such as insufficient identity, credential, and access management; insecure interfaces and APIs; system vulnerability; and shared technologies vulnerabilities [
4,
103]. The interoperable pillar of Open RAN also introduces new security challenges. One of them is that different vendors might use different degrees of security in their products [
7]. Although vendors should follow security best practices, not all will implement adequately secure management interfaces.
The same applies to the standards developed by the O-RAN Alliance. The O-RAN Alliance has two fundamental pillars: openness and intelligence. Each pillar introduces new security challenges. Additional interfaces and functions in the O-RAN architecture enlarge the threat surface [
6,
7,
104]. Incorporating AI/ML in the O-RAN architecture, known as the intelligence pillar, also adds a new threat surface targeting AI/ML-related functions in O-RAN. Other works have also discussed the new security challenges of using AI/ML in mobile networks [
105,
106]. Another additional threat surface comes from using open-source code. Open-source software gives both advantages and disadvantages regarding security [
4,
107]. However, extra security measures should still be taken because adversaries could easily access the same open-source code used in the O-RAN system and exploit its vulnerabilities [
7,
107].
As explained in
Section 3, O-RAN Alliance is working carefully to address the security challenges with its SFG. SFG specifies the security requirements in the O-RAN system in [
108,
109]. Hopefully, the O-RAN Alliance can develop a secure architecture, framework, and guidelines for its Open RAN standards. Other references have also discussed some defense mechanisms to increase software security in addition to these requirements and protocols, such as authentication and access control, cryptography, secure virtualization, anonymity and obfuscation, resilience assurance, lightweight security based on the physical layer, and intrusion detection mechanisms [
103,
106,
110,
111]. SFG has also modeled and analyzed the security threats in the O-RAN architecture in [
104]. SFG summarizes the threat surfaces, agents, potential vulnerabilities, and threats. Some parts of this analysis are also discussed by [
112]. SFG specifies a total of 49 threats, divided into seven categories: threats to the O-RAN system, O-Cloud, open-source code, physical, 5G radio networks, and ML system threats.
Focusing on the O-RAN system, there are 35 threats [
104]. We will mention some of the main threats for each O-RAN component. A common threat to all the O-RAN components is the exploitation of insecure design, weak authentication, and weak access control in O-RAN components or network boundaries to compromise the components' integrity and availability. In the FH, an attacker could penetrate the O-DU and beyond by accessing the O-FHI or targeting one or more planes in the O-FHI. In the O-RU, an attacker could set up a rogue O-RU. In the Near-RT RIC, attackers could exploit xApp vulnerabilities or create malicious xApps. In the Non-RT RIC, attackers could exploit rApp vulnerabilities or penetrate the Non-RT RIC to cause denial of service (DoS). Overload DoS attacks could also target the SMO.
The O-RAN TIFG has also provided an E2E Security Test Specification in [
113]. From an E2E perspective, the O-RAN system is just another gNB. Therefore, any O-RAN standard adopter must follow the security requirements, threats, and test cases outlined in 3GPP TS 33.511 for the gNB. Beyond that, TIFG also specifies optional test cases for the S-Plane, C-Plane, Near-RT RIC A1 interface, and O-Cloud. These test cases cover DoS, fuzzing, blind exploitation, and resource exhaustion attacks. Other references have identified some of these threats and attacks, such as data breaches, hijacking attacks, malicious insiders, data loss, DoS and distributed DoS (DDoS), abuse and nefarious use of services or resources, and caching attacks [
103,
105,
110,
114].
Another method proposed to improve network security is deploying a zero trust architecture (ZTA) [
102,
115,
116]. Integrating ZTA into 5G and 6G technologies has become critical in tactical and commercial applications [
116]. ZTA is a solution for addressing security requirements in a network with unreliable infrastructure. Its key elements are dynamic risk assessment and trust evaluation. Intelligent ZTA (i-ZTA), an AI-embedded ZTA, is proposed by [
116] to provide secured information in unreliable network infrastructure. ZTA can provide the necessary computational resources and seamless, reliable, and robust connectivity.
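The two key elements named above, dynamic risk assessment and trust evaluation, can be illustrated with a minimal sketch. It is not the i-ZTA design of the cited work: the signal names, the scoring formula, and the thresholds are all hypothetical and chosen only to show that every request is re-evaluated instead of being trusted by network location.

```python
from dataclasses import dataclass


@dataclass
class AccessRequest:
    """Hypothetical per-request signals; none of these fields come from a standard."""
    authenticated: bool          # did the caller present valid credentials?
    device_posture: float        # 0.0 (unknown) .. 1.0 (fully attested)
    behavior_anomaly: float      # 0.0 (normal) .. 1.0 (highly anomalous)
    resource_sensitivity: float  # 0.0 (public) .. 1.0 (critical)


def trust_score(req: AccessRequest) -> float:
    """Trust evaluation: an assumed formula combining posture and behavior."""
    if not req.authenticated:
        return 0.0
    return req.device_posture * (1.0 - req.behavior_anomaly)


def decide(req: AccessRequest) -> str:
    """Risk assessment: the required trust grows with resource sensitivity."""
    required = 0.3 + 0.6 * req.resource_sensitivity  # assumed policy line
    return "allow" if trust_score(req) >= required else "deny"
```

Because the decision is recomputed per request, the same caller can be allowed at one moment and denied at the next once its observed behavior becomes anomalous.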
Besides ZTA, another method proposed for deploying a secure Open RAN is blockchain (BC). BC is a network system that can establish transactional trust among peer entities on decentralized peer-to-peer (P2P) platforms while overcoming the vulnerability of a centralized ledger host [
117]. BC has also emerged as a potential tool in designing a self-managed and scalable decentralized network [
110,
117]. The BC technology has been discussed by many researchers in Open RAN along with its various potential application scenarios, including IoT, MEC, smart grid, vehicular network, and smart cities. BC can offer pseudonymity, one of the defense mechanisms to keep its users’ privacy [
118]. Integrating BC into Open RAN and its applications offers several main benefits; one of them is a reduction in the communication cost and delay required to establish agreements, which facilitates truly dynamic RAN sharing [
110,
117,
118]. However, incorporating BC into RAN still needs to be further studied. Future research should mainly focus on improving the latency, stability, and scalability of BC, developing a green mining mechanism for power-limited node devices, and finding ways to lower the cost of incorporating BC into RAN technology [
110,
117,
118].
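To make the ledger idea concrete, the following minimal sketch shows only the hash-linking that makes recorded RAN-sharing agreements tamper-evident. It deliberately omits consensus, mining, and P2P distribution, and the payload fields (`lessor`, `lessee`, `prbs`) are hypothetical names introduced for illustration.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps(
        {"index": index, "prev": prev_hash, "payload": payload}, sort_keys=True
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append an agreement, linking it to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append(
        {
            "index": index,
            "prev": prev_hash,
            "payload": payload,
            "hash": block_hash(index, prev_hash, payload),
        }
    )


def is_valid(chain):
    """Recompute every hash; any edit to an earlier block breaks the chain."""
    prev_hash = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True
```

Altering the payload of any stored agreement makes `is_valid` return `False`, which is the property that lets peers detect tampering without a central ledger host.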
The following paragraphs discuss other open issues relating to the Open RAN field. Many challenges and future research directions for Open RAN and O-RAN have been covered in the previous paragraphs. However, some issues related to Open RAN do not belong to the previous categories. This last part discusses those challenges and future directions.
As noted in the earlier paragraphs on implementation and standardization problems, one challenge that needs deeper research is the lack of accessible open hardware on which the software can run. The limited contributions to RAN open-source software intensify this issue. More MNOs and vendors appear to be developing open-source CN and MANO frameworks, such as OMEC [
119], Magma [
120], Free5GC [
121], and Open5GS [
122]. At the same time, they are not developing RAN-related project frameworks, such as OAI-RAN or srsRAN. RAN development is mainly carried out by researchers and small companies with limited personnel and other resources [
4].
This limitation should be addressed because limited contributions result in limited feedback for the Open RAN standard developing organizations. It poses a severe problem since the organizations cannot verify the viability of the standard that has been developed. The major vendors and MNOs should be more involved in developing RAN-related projects because the RAN stack's lower layers are becoming increasingly important. These lower layers have become sources of telecommunication businesses' intellectual property and product-bearing revenues.
Related to this contribution problem are the different licensing models for 3GPP technologies. Different licensing models exist for 4G/5G software, as shown in
Figure 19. As mentioned, most RAN software, such as Amarisoft, Nokia, and Ericsson software, is closed-source, commercial, or copyrighted. Open-source licenses can be divided into permissive and copyleft licenses. Both allow developers to copy, modify, and redistribute code freely in open-source or commercial products. However, a copyleft license obligates developers to release their altered source code under an open-source license so that it is publicly available. Examples are the GNU General Public License (GPL), GNU Lesser General Public License (LGPL), GNU Affero General Public License (AGPL), and Mozilla Public License (MPL). This requirement is undesirable for companies using the source code because they must share their trade secrets. A permissive license does not require developers who use permissively licensed open-source software to release their source code publicly. However, some permissive licenses do not retain the intellectual property rights. This creates potential future problems for companies using open-source software in their commercial products. Examples of this type of license are the Berkeley Software Distribution (BSD), Massachusetts Institute of Technology (MIT), and Apache v1.1 licenses. Apache License v2.0 is an example of another type of permissive license that retains intellectual property rights.
These different licensing models show that companies can suffer losses when they choose the wrong open-source software. To tackle this obstacle, OAI has its own license model: OAI-RAN is licensed under the OAI public license, which lets users adopt the open-source software easily. The OAI public license allows parties to contribute to the source code while retaining their intellectual property rights. It effectively incentivizes the community to use OAI as a reference implementation in their research and development or productization.
Responding to this problem, the O-RAN Alliance has also taken its initial step to deploy, open, and softwarize RAN using Apache License v2.0. Following this initial step, other wireless communities, MNOs, and vendors can follow O-RAN's path by increasing their support for developing complete and truly Open RAN and open-source RAN-related projects. Fortunately, more organizations are now expected to contribute to Open RAN. The IEEE Standards Association has initiated an Open RAN Industry Connections Activity to help the existing Open RAN industry efforts [
123]. This program will accelerate collaboration between organizations and individuals in the Open RAN field.
The next open issue related to RAN is the latency issue. Rakuten’s Open RAN deployment in Japan is a real-life example of this challenge. Even though the overall performance of Rakuten’s Open RAN mobile network is rated “very good”, the latency score shows room for improvement relative to major MNOs in assessed cities [
21]. The latency problem is more severe in URLLC network slices, which require low latency. Using a vBBU could result in higher latency [
71]. Future research should explore backup strategies for several use case scenarios.
A gradient-based scheme is proposed by [
124] to overcome the latency problem. This scheme solves the minimum-delay function placement and resource allocation problem in Open RAN. The RAN has many layers, and Open RAN allows these layers to be split, deployed as virtual functions, and to communicate openly with one another for service provisioning. An E2E mobile operator employing Open RAN is modeled by [
124], with a hierarchical mobile network architecture consisting of local, regional, and core data center layers. These three hierarchical layers add flexibility in resource allocation and increase reliability while taking advantage of Open RAN. In addition, the case where the traffic of a service function request (SFR) traverses multiple chains via a single path through containerized network functions (CNFs) is also modeled. For this formulation, a gradient-based solution is proposed to reach the optimal solution efficiently. It is applied through the gradient-based minimum delay (GBMD) algorithm, which can decrease the E2E network delay by up to 90%. Another method proposed by [
125] can also be used to reduce latency for cross-service interactions in the network. The proposed method focuses on network slicing and is called optimized edge slice orchestration (MESON). It combines optimized cross-slice communication (CSC) in network-level slicing, MANO, and edge computing. MESON fosters cross-slice or tenant interactions and provides the necessary means for establishing cross-service interactions, thus exploiting opportunities for providing co-location service. The research has shown that the main operations in the MESON layer can lower the delay of the service response time, keeping the delay below 40 ms with at least 100 points of presence (PoPs), thus reducing network latency.
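The general idea behind a gradient-based delay minimization can be sketched as follows. This is not the GBMD algorithm of the cited work: it is a generic projected gradient descent on a toy model that we assume here, in which each network function in a chain behaves as an M/M/1 queue with delay 1/(mu_i - lam_i) and a fixed total capacity is shared among the functions. All names and the delay model are illustrative assumptions.

```python
def total_delay(mu, lam):
    """Sum of assumed M/M/1 delays 1 / (mu_i - lam_i) along the chain."""
    return sum(1.0 / (m - l) for m, l in zip(mu, lam))


def allocate_capacity(lam, capacity, iterations=200):
    """Split `capacity` across functions to minimize total_delay (toy model)."""
    n = len(lam)
    load = sum(lam)
    if capacity <= load or min(lam) <= 0:
        raise ValueError("need positive loads and capacity above the total load")
    # Feasible starting point: capacity proportional to each function's load.
    mu = [l * capacity / load for l in lam]
    for _ in range(iterations):
        grad = [-1.0 / (m - l) ** 2 for m, l in zip(mu, lam)]
        mean_grad = sum(grad) / n
        # Removing the mean keeps sum(mu) == capacity after the update.
        direction = [g - mean_grad for g in grad]
        current = total_delay(mu, lam)
        step = 1.0
        while step > 1e-12:
            trial = [m - step * d for m, d in zip(mu, direction)]
            stable = all(t > l for t, l in zip(trial, lam))
            if stable and total_delay(trial, lam) < current:
                mu = trial
                break
            step /= 2.0  # backtrack until the step is feasible and improving
        else:
            break  # no improving step found: treat as converged
    return mu
```

For this toy model the optimum gives every function the same spare capacity, so with loads `[2.0, 5.0, 3.0]` and a total capacity of 16 the allocation approaches `[4.0, 7.0, 5.0]`.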
Another issue that still exists in Open RAN is scalability. Open RAN is more scalable than its predecessors thanks to NFV, SDN, and automation, which make the RAN more scalable and flexible in its management and orchestration [
4]. Open RAN’s framework relies heavily on AI-embedded software to maintain its scalability and flexibility [
126]. RIC is also responsible for making Open RAN scalable. Both Non-RT and Near-RT RIC have made the RAN more scalable and adaptable. The Near-RT RIC provides a secure and scalable platform, thus making it possible to control its xApps flexibly [
39]. All of these are undertaken to make Open RAN more scalable, leading to a more effective RAN, especially in terms of its performance.
However, the scalability in Open RAN still needs to be improved. There are several concerns regarding the Open RAN scalability [
22,
89]:
Operators wanting to adopt cloud-native Open RAN should expect cloud scaling challenges [
22]. Scaling the DU virtualization across servers will be a real-time problem. Additional scaling constraints will emerge if MNOs consider the use of accelerators. Furthermore, containerized orchestration alone cannot solve the high network availability and reliability operational goals. The applications will need additional built-in state synchronization and data integrity consideration. Moreover, specific failover and availability mechanisms will be required at the protocol level.
Most Open RAN enthusiasts assume that the architecture will bring significant economic savings while ignoring the cost of operating the complex architecture [
89]. We have explained that a variety of new business roles are involved in the Open RAN ecosystem, such as the system integrator. The question is whether the service expenses for these roles and the OpEx of a large-scale multivendor architecture will exceed the promised cost savings. The TCO of large-scale operations is still undetermined [
22].
The gap between scalable automation and AI capability is another challenge to be considered [
22]. The implementation of AI/ML technologies is relatively new in the telecommunication context. Using AI/ML in telecommunication-grade operations will require significant resources. While interface standardization is currently available, data access, pipelines, and validation cannot be fully scaled due to a lack of standardized network configuration and performance data exchange. Furthermore, large-scale AI/ML asset deployment and operations by MNOs in live networks are still rare and complicated. To successfully scale and operate AI/ML use cases, harmonization of expertise from the telecommunication, data science, and cloud/big-data fields is required [
22].
OAI provides the Trirematics operator as a solution to answer the scalability challenge. Trirematics is an intelligent software operator for RAN and CN deployment scenarios [
127]. Trirematics' orchestration and management framework is cloud-native and extensible. Its features include intelligence, agility, automation, abstraction, maintenance, and observability. Trirematics makes intelligent and agile decisions in deploying and operating the E2E network. It automates the lifecycle of the network entities and abstracts complexities in a 5G environment while providing enough diagnosis and control for its users. It also has extensive observability features, such as log processing, alerting, metric processing, and health monitoring.
Rakuten Mobile’s commercial deployment in Japan proves that Open RAN is ready for prime time. Dell’Oro Group’s research report states that Open RAN is expected to acquire more than 10% of the RAN market share by 2025 [
128]. While the majority respond enthusiastically, [
5] views this result as a reason to doubt the market's adoption of Open RAN. Whether these data and predictions are viewed positively or negatively is in the hands of each impacted party. Huawei clearly states that it does not support Open RAN or vRAN. Open RAN poses many challenges, such as the standardization, OpEx, and security issues that we have explained above. In addition, there might be political reasons behind the Open RAN movement. This notion was researched, and Open RAN was considered a geopolitical hijacking case [
129]. By considering Open RAN as a social imaginary of openness and trustworthiness, Open RAN can be weaponized to outcompete rivals by parties ranging from industry to government. Whether this is true or false is still debatable; hopefully, the truth will shine as time progresses.
Whatever the case, the implementation of AI/ML will play an important role in the future of the cellular network. The 5G technology challenges show that more AI/ML is required in future network technology. The O-RAN Alliance is applying this by building ML directly into the network architecture through the RIC. We have also mentioned how researchers generally use ML to solve problems in the O-RAN architecture and Open RAN. Several additional benefits of implementing AI/ML in 5G communication systems are mentioned in [
130]. First, AI/ML can help with problems related to the complicated and dynamic wireless communications channel. AI can help in channel measurement data processing, channel modeling, and channel estimation. AI can also help in physical layer research of the 5G network. Massive MIMO arrays create enormous volumes of data well suited to be analyzed by ML systems. ML can also improve the performance in signal processing. Data-driven localization using ML algorithms in 5G systems can also be a valuable application. AI can also improve network management and optimization in 5G systems. Model-based optimization can be replaced with data-driven optimization in ultra-dense networks. AI/ML technologies also enable a paradigm shift of wireless networking from reactive to proactive design.
However, Ref. [
130] also mentions some limitations of the current AI/ML technologies that prevent them from being applied directly to the current 5G system. For example, ML algorithms are primarily developed for systems and applications that do not require high-frequency performance. On the other hand, a 5G network requires high data processing rates for URLLC. Existing ML algorithms also rely on powerful processing hardware. In contrast, communication systems are full of devices constrained in storage capability, computational power, and energy resources. Adjustments and innovations in AI technology should be carried out before it can be fully compatible with the needs of the 5G network. One interesting research direction is to develop distributed ML algorithms for 5G networks. Previously, we have mentioned how MEC and fog computing could be solutions to improve the performance of Open RAN and 5G networks. The main idea is to move from centralized to distributed systems. Parallel to this trend, ML algorithms in communications applications should also move from centralized to distributed. For example, a lightweight deep learning model can support cloud, fog, and edge computing networks. Furthermore, parallel decentralized and centralized algorithms could be used while balancing complexity, latency, and reliability.
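One widely used way to move learning from a central server to resource-constrained devices is federated averaging, which we use here only as an illustration; the cited work does not prescribe it. In this minimal sketch, each device refines a one-parameter model y = w * x on its own data, and only the weights are averaged centrally. The model, learning rate, and function names are assumptions made for the example.

```python
def local_update(w, data, lr=0.05, epochs=5):
    """A few gradient-descent steps on one device's (x, y) pairs."""
    for _ in range(epochs):
        # Gradient of the mean squared error of y = w * x with respect to w.
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global, device_datasets):
    """Average the locally updated weights, weighted by local sample count."""
    total = sum(len(d) for d in device_datasets)
    return sum(
        len(d) * local_update(w_global, d) for d in device_datasets
    ) / total


def train(device_datasets, rounds=50):
    """Repeat federated rounds; raw samples never leave the devices."""
    w = 0.0
    for _ in range(rounds):
        w = federated_round(w, device_datasets)
    return w
```

Each device performs only a handful of cheap local steps per round, which matches the storage, compute, and energy constraints described above, at the cost of extra communication rounds.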
The 5G technology system is required to meet users’ demands worldwide by delivering connectivity anytime and anywhere [
131]. Additionally, the number of smart devices and advanced technologies has been increasing rapidly, which has also increased the need for 5G connectivity. The 5G network is also required to fulfill multiple requirements. The development of optical wireless communications has been accelerating because it is a potential solution for supporting high data rates [
132].
Moreover, 5G technology is required to provide services not only in big cities and metropolitan areas but also in remote areas that are geographically uncovered or under-served. A Non-Terrestrial Network (NTN) can be an effective solution for 5G to provide omnipresent services by achieving global network coverage [
131]. An NTN is a network, or a segment of a network, that uses an airborne or space-borne vehicle for transmission [
133]. 3GPP has also carried out several studies and activities to support NTN as part of 5G technology, especially 5G NR architecture. NTN was explicitly introduced by 3GPP in Rel-17 [
56]. NTN support in 5G NR is still developing, with a future guarantee from 3GPP that the 5G network can be operated at frequencies above 52 GHz. More about this guarantee will be known in Rel-18.
3GPP has already completed the first 5G NR standard in Rel-15 [
56,
131]. They have also started to work on several solutions to support NTN in 5G NR systems [
131]. RRM has become one of the major issues in 5G NR, and this management strategy is relevant to offering tight cooperation between satellite and terrestrial networks. Through RRM, researchers are seeking efficient resource allocation methods that decrease interference and provide real-time video services by implementing effective link adaptation procedures. Meanwhile, mobility management is also important for maintaining the continuity of network services by achieving seamless handover over heterogeneous wireless access networks. The novelty of a 5G architecture embedded with NTN takes the form of SDN- and NFV-based NTN, the internet of space things (IoST), cognitive NTN, and non-orthogonal multiple access (NOMA)-based NTN [
131].