Browsing by Author "Nikolic, Borislav"
Now showing 1 - 10 of 17
Results Per Page
Sort Options
- Application mapping in NoC-Based many-coresPublication . Nikolic, Borislav; Petters, Stefan M.Many-core platforms based on Network-on-Chip (NoC [Benini and De Micheli 2002]) present an emerging technology in the real-time embedded domain. Although the idea to group the applications previously executed on separated single-core devices, and accommodate them on an individual many-core chip offers various options for power savings, cost reductions and contributes to the overall system flexibility, its implementation is a non-trivial task. In this paper we address the issue of application mapping onto a NoCbased many-core platform when considering fundamentals and trends of current many-core operating systems, specifically, we elaborate on a limited migrative application model encompassing a message-passing paradigm as a communication primitive. As the main contribution, we formulate the problem of real-time application mapping, and propose a three-stage process to efficiently solve it. Through analysis it is assured that derived solutions guarantee the fulfilment of posed time constraints regarding worst-case communication latencies, and at the same time provide an environment to perform load balancing for e.g. thermal, energy, fault tolerance or performance reasons.We also propose several constraints regarding the topological structure of the application mapping, as well as the inter- and intra-application communication patterns, which efficiently solve the issues of pessimism and/or intractability when performing the analysis.
- Buffer-aware bounds to multi-point progressive blocking in priority-preemptive NoCsPublication . Indrusiak, Leandro; Burns, Alan; Nikolic, BorislavThis paper aims to reduce the pessimism of the analysis of the multi-point progressive blocking (MPB) problem in real-time priority-preemptive wormhole networks-on-chip. It shows that the amount of buffering on each network node can influence the worst-case interference that packets can suffer along their routes, and it proposes a novel analytical model that can quantify such interference as a function of the buffer size. It shows that, perhaps counter-intuitively, smaller buffers can result in lower upper-bounds on interference and thus improved schedulability. Didactic examples and large-scale experiments provide evidence of the strength of the proposed approach.
- Comparing the schedulers and power saving strategies with SPARTSPublication . Awan, Muhammad Ali; Nikolic, Borislav; Petters, Stefan M.We have developed SPARTS, a simulator of a generic embedded real-time device. It is designed to be extensible to accommodate different task properties, scheduling algorithms and/or hardware models for the wide variety of applications. SPARTS was developed to help the community investigate the behaviour of the real-time embedded systems and to quantify the associated constraints/overheads.
- Contention-Free Execution of Automotive Applications on a Clustered Many-Core PlatformPublication . Becker, Matthias; Dasari, Dakshina; Nikolic, Borislav; Åkesson, Benny; Nelis, Vincent; Nolte, ThomasNext generations of compute-intensive real-time applications in automotive systems will require more powerful computing platforms. One promising power-efficient solution for such applications is to use clustered many-core architectures. However, ensuring that real-time requirements are satisfied in the presence of contention in shared resources, such as memories, remains an open issue. This work presents a novel contention-free execution framework to execute automotive applications on such platforms. Privatization of memory banks together with defined access phases to shared memory resources is the backbone of the framework. An Integer Linear Programming (ILP) formulation is presented to find the optimal time-triggered schedule for the on-core execution as well as for the access to shared memory. Additionally a heuristic solution is presented that generates the schedule in a fraction of the time required by the ILP. Extensive evaluations show that the proposed heuristic performs only 0.5% away from the optimal solution while it outperforms a baseline heuristic by 67%. The applicability of the approach to industrially sized problems is demonstrated in a case study of a software for Engine Management Systems.
- Extensive Analysis of a Real-Time Dense Wired Sensor Network Based on Traffic ShapingPublication . Loureiro, João; Rangarajan, Raghuraman; Nikolic, Borislav; Soares Indrusiak, Leandro; Tovar, EduardoXDense is a novel wired 2D mesh grid sensor network system for application scenarios that benefit from densely deployed sensing (e.g., thousands of sensors per square meter). It was conceived for cyber-physical systems that require real-time sensing and actuation, like active flow control on aircraft wing surfaces. XDense communication and distributed processing capabilities are designed to enable complex feature extraction within bounded time and in a responsive manner. In this article, we tackle the issue of deterministic behavior of XDense. We present a methodology that uses traffic-shaping heuristics to guarantee bounded communication delays and the fulfillment of memory requirements. We evaluate the model for varied network configurations and workload, and present a comparative performance analysis in terms of link utilization, queue size, and execution time. With the proposed traffic-shaping heuristics, we endow XDense with the capabilities required for real-time applications
- Hard real-time multiprocessor scheduling resilient to core failuresPublication . Nikolic, Borislav; Bletsas, Konstantinos; M. Petters, StefanMost multiprocessor scheduling theory overlooks the possibility of hardware failures that entirely nullify the computation carried out by a task instance, and potentially also make the respective processor henceforth unusable. Yet, such failures may occur, causing the system to fail. Motivated by this reality, we introduce a new concept of hard real-time schedulability guarantees for critical multiprocessor systems and analysis for their derivation. Namely, all deadlines must be met, even in the event of a core failure. A scheduling approach, based on global fixed priorities, and accompanying analysis, for achieving such guarantees are then formulated.
- NoC contention analysis using a branch-and-prune algorithmPublication . Dasari, Dakshina; Nikolic, Borislav; Nelis, Vincent; Petters, Stefan M.“Many-core” systems based on a Network-on-Chip (NoC) architecture offer various opportunities in terms of performance and computing capabilities, but at the same time they pose many challenges for the deployment of real-time systems, which must fulfill specific timing requirements at runtime. It is therefore essential to identify, at design time, the parameters that have an impact on the execution time of the tasks deployed on these systems and the upper bounds on the other key parameters. The focus of this work is to determine an upper bound on the traversal time of a packet when it is transmitted over the NoC infrastructure. Towards this aim, we first identify and explore some limitations in the existing recursive-calculus-based approaches to compute the Worst-Case Traversal Time (WCTT) of a packet. Then, we extend the existing model by integrating the characteristics of the tasks that generate the packets. For this extended model, we propose an algorithm called “Branch and Prune” (BP). Our proposed method provides tighter and safe estimates than the existing recursive-calculus-based approaches. Finally, we introduce a more general approach, namely “Branch, Prune and Collapse” (BPC) which offers a configurable parameter that provides a flexible trade-off between the computational complexity and the tightness of the computed estimate. The recursive-calculus methods and BP present two special cases of BPC when a trade-off parameter is 1 or ∞, respectively. Through simulations, we analyze this trade-off, reason about the implications of certain choices, and also provide some case studies to observe the impact of task parameters on the WCTT estimates.
- On Routing Flexibility of Wormhole-Switched Priority-Preemptive NoCsPublication . Nikolic, Borislav; Pinho, Luís Miguel; Indrusiak, LeandroFlit-level preemptions via virtual channels have been proposed as one viable method to implement prioritypreemptive arbitration policies in NoC routers, and integrate NoCs in the hard real-time domain. In recent years, researchers have explored several aspects of priority-preemptive NoCs, such as different arbitration techniques, different priority assignment methods (where applicable) and different workload mapping approaches, all with the common objective to use interconnect mediums more efficiently. Yet, the impact of different routing techniques on such a model is still an unexplored topic. Motivated by this reality, in this work we study the effects of routing flexibility on wormhole-switched priority-preemptive NoCs.
- Optimal Minimal Routing and Priority Assignment for Priority-Preemptive Real-Time NoCsPublication . Nikolic, Borislav; Pinho, Luís MiguelThe Network-on-Chip (NoC) architecture is an interconnect network with a good performance and scalability potential. Thus, it comes as no surprise that NoCs are among the most popular interconnect mediums in nowadays available many-core platforms. Over the years, the real-time community has been attempting to make NoCs amenable to the real-time analysis. One such approach advocates to employ virtual channels. Virtual channels are hardware resources that can be used as an infrastructure to facilitate flit-level preemptions between communication traffic flows. This gives the possibility to implement priority-preemptive arbitration policies in routers, which is a promising step towards deriving real-time guarantees for NoC traffic. So far, various aspects of priority-preemptive NoCs were studied, such as arbitration, priority assignment, routing, and workload mapping. Due to a potentially large solution space, the majority of available techniques are heuristic-centric, that is, either pure heuristics, or heuristic-based search strategies are used. Such approaches may lead to an inefficient use of hardware resources, and may cause a resource over-provisioning as well as unnecessarily high design-cost expenses. Motivated by this reality, we take a different approach, and propose an integer linear program to solve the problems of priority assignment and routing of NoC traffic. The proposed method finds optimal routes and priorities, but also allows to reduce the search space (and the computation time) by fixing either priorities or routes, and derive optimal values for remaining parameters. This framework is used to experimentally evaluate both the scalability of the proposed method, as well as the efficiency of existing priority assignment and routing techniques.
- Partitioning and Analysis of the Network-on-Chip on a COTS Many-Core PlatformPublication . Becker, Matthias; Nikolic, Borislav; Dasari, Dakshina; Åkesson, Benny; Nélis, Vincent; Behnam, Moris; Nolte, ThomasMany-core processors can provide the computational power required by future complex embedded systems. However, their adoption is not trivial, since several sources of interference on COTS many-core platforms have adverse effects on the resulting performance. One main source of performance degradation is the contention on the Network-on-Chip, which is used for communication among the compute cores via the offchip memory. Available analysis techniques for the traversal time of messages on the NoC do not consider many of the architectural features found on COTS platforms. In this work, we target a state-of-the-art many-core processor, the Kalray MPPA. A novel partitioning strategy for reducing the contention on the NoC is proposed. Further, we present an analysis technique dedicated to the proposed partitioning strategy, which considers all architectural features of the COTS NoC. Additionally, it is shown how to configure the parameters for flow-regulation on the NoC, such that the Worst-Case Traversal Time (WCTT) is minimal and buffers never overflow. The benefits of our approach are evaluated based on extensive experiments that show that contention is significantly reduced compared to the unconstrained case, while the proposed analysis outperforms a state-of-the-art analysis for the same platform. An industrial case study shows the tightness of the proposed analysis.