ISSN: 2574-1241

Impact Factor : 0.548


Mini Review Open Access

Japanese Strategic Supercomputer “Arrow of Time”

Volume 45, Issue 2

Andrey Molyakov*

  • Institute of information technologies and cybersecurity, Russian State University for the Humanities, Russia

Received: July 07, 2022;   Published: July 25, 2022

*Corresponding author: Andrey Molyakov, Institute of information technologies and cybersecurity, Russian State University for the Humanities, Russia

DOI: 10.26717/BJSTR.2022.45.007183


Abstract

This article describes a new Japanese strategic supercomputer. Japanese specialists are developing a military supercomputer, “Arrow of Time” (which is also the codename of the development work), for a supercomputer center under construction in the northern part of the island of Honshu, in the mountains south of the city of Hirasaki. The main contractor is the Technical Research & Development Institute, which is part of the Japan Self-Defense Forces. The development involves Fujitsu and Hamamatsu Photonics, as well as civilian collaborators: RIKEN (which previously developed the K-computer) and the Tokyo Institute of Technology (which previously developed Tsubame 2.0).

Keywords: RIKEN; Codename “Arrow of Time”; Specialized Hybrid Microprocessors; Tsubame

Introduction

On the path to the “Arrow of Time” supercomputer, the Tsubame and K-computer projects are regarded as, respectively, the initial stage (“morning swallow”, the beginning of flight) and the stage of the flourishing of civilization (the result of a millennium of development). The “Arrow of Time” project is figuratively regarded as a breakthrough “for millions of years”, that is, the beginning of the introduction of breakthrough technologies of the future [1,2]. The development of this supercomputer, and more generally of supercomputers for commercial use and for the special services, including airborne ones, is being carried out by the National Defense Academy and the Technical Research & Development Institute. These two organizations are comparable to China’s NUDT (National University of Defense Technology) and are controlled by the General Staff of the Japanese Self-Defense Forces. The Japanese counterpart of DARPA [3] is located within the same structures.

Principal Architectural and Technological Features of the Project “Arrow of Time”

The principal features of the computational node of the supercomputer “Arrow of Time” are as follows:

At the center of the computing board is a hybrid microprocessor that contains a network processor for the network of computing nodes, a network processor for the input-output subsystem, and a massively multithreaded microprocessor with medium-complexity cores and asynchronous multithreading. This microprocessor is connected to the DRAM main memory of the computing node and to NVRAM (non-volatile memory, which can be regarded as a super-fast disk). This microprocessor, referred to below as the network processor built into the memory, performs the following functions in addition to the network operations proper: it translates virtual addresses; after address translation, it separates accesses to the local memory of the node from accesses to remote nodes; it aggregates accesses to remote nodes and unpacks incoming messages from remote nodes; it performs functions localized in the immediate vicinity of memory, such as atomic operations, transactional transformations of data structures, and complex synchronization functions such as the actions of special synchronization or data-transformation nodes (actors); and it extracts data and writes it back within models that separate data access from processing. This is the most innovative part of the computing node and has attracted most of the developers' attention and effort. The presence of such a block is not particularly surprising [4,5].
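The split between local and remote accesses and the aggregation of remote accesses can be illustrated with a minimal Python sketch. It is not the project's design: the class `MemoryNetworkProcessor`, the page-table layout, the 4096-byte page size and the fixed `batch_size` flush rule are all assumptions introduced here for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes (not given in the article)


@dataclass(frozen=True)
class PhysicalAddress:
    node: int    # node that owns the page
    offset: int  # byte offset within that node's memory


class MemoryNetworkProcessor:
    """Toy model of the network processor built into the memory of one node."""

    def __init__(self, node_id, page_table, batch_size=4):
        self.node_id = node_id
        self.page_table = page_table      # virtual page -> (node, physical page)
        self.batch_size = batch_size      # hypothetical aggregation threshold
        self.local_queue = []             # accesses served by local memory
        self.pending = defaultdict(list)  # remote node -> buffered accesses
        self.sent = []                    # aggregated messages handed to the network

    def translate(self, vaddr):
        """Translate a virtual address into (owning node, offset)."""
        page, off = divmod(vaddr, PAGE_SIZE)
        node, ppage = self.page_table[page]
        return PhysicalAddress(node, ppage * PAGE_SIZE + off)

    def access(self, vaddr):
        """Route one access: local memory, or a buffer for the remote node."""
        pa = self.translate(vaddr)
        if pa.node == self.node_id:
            self.local_queue.append(pa.offset)
            return
        bucket = self.pending[pa.node]
        bucket.append(pa.offset)
        if len(bucket) >= self.batch_size:
            self.flush(pa.node)

    def flush(self, node):
        """Aggregate all buffered accesses for `node` into one long message."""
        batch = self.pending.pop(node, [])
        if batch:
            self.sent.append((node, tuple(batch)))


def unpack(message):
    """Receiver side: split an aggregated message back into single accesses."""
    node, offsets = message
    return [PhysicalAddress(node, off) for off in offsets]


# Usage: node 0 owns virtual page 0, node 1 owns virtual pages 1 and 2.
page_table = {0: (0, 0), 1: (1, 0), 2: (1, 1)}
np0 = MemoryNetworkProcessor(node_id=0, page_table=page_table, batch_size=2)
for vaddr in (16, PAGE_SIZE + 8, 2 * PAGE_SIZE + 24):
    np0.access(vaddr)
print(np0.local_queue)  # one local access
print(np0.sent)         # one aggregated message for node 1
```

In this sketch the two accesses that resolve to node 1 leave the node as a single message, which is the point of aggregation: short requests are amortized over one network transfer. A real design would also flush on a timeout, which is omitted here.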

The computing node also includes processing microprocessors. Two types were mentioned: one for computational tasks, in which the share of computational operations is high (the locality and intensity of data accesses may vary), and one for non-computational tasks, which perform logical processing and are typically characterized by poor spatial and temporal locality of memory accesses and a high intensity of memory operations. These microprocessors are connected in a 3D assembly with DRAM and possibly also NVRAM. This can be thought of as super-fast memory above the L3 cache. The microprocessor for computational tasks contains the types of processor cores traditionally used in this field: superscalar cores; GPU cores (asynchronous warps, with synchronous threads within each warp); and lightweight cores of the type used in the BG/Q supercomputer, with powerful units for processing short SIMD vectors 512 bits wide. It is essential that all these different types of cores work on a common memory space, an approach similar to the Fusion APU that AMD has begun to produce. However, the Japanese developers use a greater variety of cores. Hardware virtualization tools will apparently be developed for this microprocessor, since its cores are of different types and it may not be possible to use every type of core in one task.

Figure 1: 3D assembly structural module.


The non-computational processor may be similar to the Threadstorm microprocessor of the Cray XMT supercomputer, but multi-core, with more memory controllers and a powerful network interface. Thus, non-computational tasks will use two-level multithreading: one level in the processing processor and the other directly in the memory modules. The component and packaging technologies deserve mention. According to the information available so far, the main feature is the use of nanotubes that exhibit high-temperature superconductivity, reportedly at a temperature of 20 degrees Celsius. Japanese developers have achieved significant success here [6,7]. While conventional technology transmits signals over interconnects at an energy cost of 100-150 fJ/bit/mm, nanotubes can transmit a signal at 0.6-1 fJ/bit/mm. Another feature is the planned use of the information transmission technology recently developed by IBM, conventionally called Holey, which uses low-power lasers, photodetectors, optical channels in silicon and arrays of microlenses. The same technology is planned for use in a supercomputer for the US NSA. Japanese specialists are also known to have made progress in terabit networks that multiplex transmission over optical fiber at different wavelengths; the simultaneous use of 40 wavelengths in one channel has reportedly been reached. The structural module of the new supercomputer is shown in Figure 1. One structural module contains several computational nodes. This was an expected decision, similar to that made in the Chinese supercomputer СT-2 / СT-3. A translation of the inscriptions shown in the diagram (originally in Japanese) is given in the comments on the drawing.
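To put the quoted interconnect energy figures in perspective, the short Python calculation below compares the two ranges. Only the 100-150 and 0.6-1 fJ/bit/mm values come from the text; the 10 mm link length and the 1 Tbit/s data rate are assumed purely for the sake of a worked example.

```python
# Energy cost of signal transmission, in fJ per bit per mm (ranges from the text).
CONVENTIONAL_FJ_PER_BIT_MM = (100.0, 150.0)
NANOTUBE_FJ_PER_BIT_MM = (0.6, 1.0)


def saving_range(conventional, improved):
    """Return (smallest, largest) reduction factor between two (low, high) ranges."""
    smallest = conventional[0] / improved[1]  # best conventional vs worst nanotube
    largest = conventional[1] / improved[0]   # worst conventional vs best nanotube
    return smallest, largest


def link_power_watts(fj_per_bit_mm, length_mm, bits_per_second):
    """Power drawn by one link: (fJ/bit/mm) * mm * (bit/s), converted to watts."""
    return fj_per_bit_mm * 1e-15 * length_mm * bits_per_second


smallest, largest = saving_range(CONVENTIONAL_FJ_PER_BIT_MM, NANOTUBE_FJ_PER_BIT_MM)
print(f"reduction factor: {smallest:.0f}x to {largest:.0f}x")

# Assumed example link: 10 mm long, carrying 1 Tbit/s.
LENGTH_MM = 10.0
RATE_BPS = 1e12
print(f"conventional: {link_power_watts(100.0, LENGTH_MM, RATE_BPS):.2f} W")
print(f"nanotube:     {link_power_watts(1.0, LENGTH_MM, RATE_BPS):.2f} W")
```

On the quoted ranges the reduction is between roughly 100 and 250 times. For the assumed 10 mm, 1 Tbit/s link this is the difference between about 1 W and about 0.01 W per link.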

Comments on the Drawing of the Assembly Structure

1. Platform for installing the die of a multi-core microprocessor (for the types of cores, see comment 2). These pads are located on the “microboard”; the figure shows a microboard with four such pads. A 3D assembly is installed on the microboard at each pad. The assembly may include DRAM modules. In general, a microboard is analogous to what is now called a “socket” on modern server boards. Each 3D assembly has an individual cooling system;

2. The processor cores of the microprocessors can differ in the type of processes (threads) whose instructions a core executes: ultra-light, light, medium-heavy, or heavy. Examples are ultra-light graphics processing unit (GPU) cores, mid-range MIPS cores of RISC microprocessors, and “heavy” cores of superscalar microprocessors with the x86 architecture;

3. Each 3D assembly has a master microprocessor and microprocessors subordinate to it; a “master-slave” mode of operation is provided;

4. Not only 3D assemblies but also 3D VLSI can be used. The system uses tools for working with very large data (large amounts of data from the system’s multi-level memory), as well as dedicated highways for working with data transmission networks (sensors through which data streams arrive). Data processing can be organized using synchronous and asynchronous threads, as well as vectors of different lengths;

5. This backbone can transmit data if there are no active messages (parcels) for managing and performing computations on remote nodes in order to localize computations on the data located in these nodes (see also comment 18);

6. In the assembly structure shown in the figure, external connections are contactless, pinless optical connections (IBM Holey chip technology with low-power lasers and microlens arrays, see comment 16). These connections are used by the Network Processors (see comment 10) and the I/O Processors (see comment 11). The random-access memory in the assembly is implemented by multiple 3D DRAM modules, which are connected to the massively multithreaded memory processors (see comment 12) above them in the corresponding 3D assembly. Message transmission over the networks uses a large number of message formats; methods for aggregating short messages into long ones and for the reverse conversion; simultaneous transmission of messages over an optical channel at different wavelengths (WDM technology, see comment 15); and active messages such as RPC (remote procedure call);

7. Three “pass-through” buffers for temporal isolation of asynchronously operating data and message lines;

8. Non-volatile flash memory;

9. Processor for processing messages and memory data “on the fly”;

10. A network processor, through which a connection to an external communication network for transferring messages between assembly structures is realized;

11. I/O processor, through which the connection to the external I/O network is realized;

12. Massively multithreaded memory processor, which provides the connection to the 3D DRAM memory module (located at this position in the assembly structure) as well as to the Network Processor and the I/O Processor. This massively multithreaded processor is a key element in making each subassembly tolerant of memory, communication, and I/O latency;

13. Routers of the assembly structure's internal networks, which have a two-dimensional lattice topology and transmit data and control messages (parcels, green buses). If there are few parcels, their subnet is automatically used to transmit ordinary messages;

14. Electrical connection to the I/O subsystem;

15. WDM transceiver, used to transmit messages through fiber simultaneously at different wavelengths;

16. Each yellow square is an array of low-power lasers connected through a microlens array to a fiber bundle (IBM Holey chip technology);

17. Yellow lines are intended for data transmission in the given assembly structure;

18. Green trunks are intended for transferring parcels (active messages such as remote procedure calls, etc.). If a task has few such parcels, these trunks are used to transfer data, which is indicated by the green coloring of the trunk changing to yellow.
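The sharing of the green trunks between parcels and ordinary data (comments 5, 13 and 18) can be sketched in Python as a simple per-cycle arbitration. This is an illustrative model only: the `Trunk` and `AssemblyRouter` classes, the one-item-per-cycle rule and the "no parcel waiting" fallback condition are assumptions, whereas the article only states that the trunks carry data when there are few parcels.

```python
from collections import deque


class Trunk:
    """One trunk of the assembly; records what it carried in each cycle."""

    def __init__(self, name):
        self.name = name
        self.log = []  # list of (cycle, kind, payload)

    def send(self, cycle, kind, payload):
        self.log.append((cycle, kind, payload))


class AssemblyRouter:
    """Toy arbiter: yellow trunk for data, green trunk for parcels with fallback."""

    def __init__(self):
        self.yellow = Trunk("yellow")
        self.green = Trunk("green")
        self.parcels = deque()  # waiting active messages (for example RPCs)
        self.data = deque()     # waiting ordinary data messages
        self.cycle = 0

    def step(self):
        """Advance one cycle; each trunk carries at most one item."""
        # The yellow trunk always carries ordinary data.
        if self.data:
            self.yellow.send(self.cycle, "data", self.data.popleft())
        # The green trunk prefers parcels and carries data only when none wait
        # (the green-to-yellow transition described in comment 18).
        if self.parcels:
            self.green.send(self.cycle, "parcel", self.parcels.popleft())
        elif self.data:
            self.green.send(self.cycle, "data", self.data.popleft())
        self.cycle += 1


# Usage: one parcel and four data messages, run for two cycles.
router = AssemblyRouter()
router.parcels.append("rpc:sum@node3")  # hypothetical parcel payload
router.data.extend(["d0", "d1", "d2", "d3"])
router.step()
router.step()
print(router.green.log)
print(router.yellow.log)
```

In the first cycle the green trunk carries the parcel; in the second, with no parcel waiting, it carries a data message alongside the yellow trunk, so the spare parcel bandwidth is not wasted.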

Conclusion

The time that has passed since the first version of the technical report was released has confirmed the view that Japanese scientists and engineers are comparatively closed. For example, they do not share with foreign specialists the nano-joining technologies that allowed them to achieve 0.6-1 fJ/bit/mm. Information on long (hundreds of kilometers) optical communication lines with quantum sewing and a bandwidth of up to 1 Tbit/s is also closed. In addition, Japanese specialists (which the author cannot say of Chinese specialists) demonstrate considerable creativity, moving forward successfully in new directions of component-base development that require fundamental research and discoveries, such as rapid single-flux-quantum (RSFQ) superconducting logic, quantum cellular automata (QCA), quantum computing, and quantum algorithms.

References

  1. A Molyakov (2019) China Net: Military and Special Supercomputer Centers. Journal of Electrical and Electronic Engineering 7(4): 95-100.
  2. T Sterling (2011) In Search of Exascale Roadmap.
  3. T Sterling (2007) Architecture Paths to Exaflops Computing Is Multicore the next Moore’s Law? What about the Memory?
  4. A Molyakov (2019) Age of Great Chinese Dragon: Supercomputer Centers and High-Performance Computing. Journal of Electrical and Electronic Engineering 7(4): 87-94.
  5. A Molyakov (2019) Analysis of World Experience in Creating Parallel Computing Systems Designed to Effectively Solve DIS-tasks. Journal of Electrical and Electronic Engineering 7(4): 101-106.
  6. XJ Yang, Y Dou, QF Hu (2006) Progress and Challenges in High Performance Computer Technology. Journal of Computer Science & Technology 21: 674-681.
  7. PM Kogge (2007) Computer systems with lightweight multi-threaded architectures. United States Patent Application Publication.