Suyash Sambhare

Posted on Jul 30, 2024

SCSI Logic

#devops #scsi #storage #iscsi

The SCSI communications architecture is a logical framework for exchanging commands and responses between systems and storage, which includes addressing, naming, and error-correction methods. The basic purpose of the SCSI architecture is to create a dependable abstraction layer between computers and storage devices. Without a storage abstraction layer, each application would need to know the operational details of every storage device it used. This arrangement was clearly unacceptable, so the computer and storage sectors needed to create standard interfaces for both systems and storage devices. SCSI was created as a common storage abstraction for open-systems computers.

Background

SCSI was developed in 1981 by Shugart Associates and NCR Corporation, who were both looking for ways to connect disk drives to systems. SCSI was initially known as SASI, or Shugart Associates System Interface. In December 1981, the ANSI standards organization established the X3T9.2 technical committee to continue the development of this work, which was called SCSI. The initial SCSI standard was adopted and published in 1986. At the time, the protocol and connectivity were tightly integrated into a single combined technology.

Since then, SCSI has seen two major extensions. SCSI-2 extended the width of the SCSI bus and its clock speed. SCSI-3 delineated the architectural structure of SCSI communications, separated the various technology aspects, and established multiple, independent groups to work on these elements concurrently.

The T10 SCSI Standards Committees: The ongoing standards work in SCSI is performed by the T10 Technical Committee of the International Committee for Information Technology Standards (INCITS): http://www.t10.org/

SCSI-3 Connection Independence: The most notable change between SCSI-2 and SCSI-3 was the abstraction of logical storage functions from the underlying connection technology. This separation allows SCSI processes to be transmitted as an application over any kind of network. In general, you can assume the use of SCSI-3 logic, processes, and protocols in any type of network storage implementation. The SCSI-3 standards documents make it clear that SCSI protocols are intended to be implemented independently of the connecting technology.

SCSI Architecture Model

The SCSI Architecture Model (SAM) defines one of the essential parts of the SCSI protocol: the communications architecture for exchanging storage commands and data. The aspects of the model covered in this post are:

Initiators and targets
Initiator and target ports
SCSI RPC structure
Overlapped I/O
Asymmetrical communications in SCSI
Dual-mode controllers
No guarantee of ordered delivery
SCSI ports, IDs, and names
SCSI LUNs
Tasks, task sets, and tagged tasks
SCSI nexus and connection relationships
Tagged command queuing

Initiators and Targets

Distributed communication between initiators and targets is the foundation of the SCSI protocol. In general, initiators are implemented in HBAs and systems, and targets are implemented in devices and subsystems, but there is no reason to limit one's understanding of storage I/O to that arrangement. Once their respective functions are understood, initiators and targets can be used in a variety of ways.

Once the initiator controller issues a command, the target controller receives the request and acts on it. The standard also refers to initiators and targets as clients and servers. Even though the client/server communication model applies, SCSI components do not resemble what most people think of as clients and servers, which is especially confusing in an environment where network attached storage (NAS) clients and servers are also present.

Initiator and Target Ports

In the SCSI architecture model, the network ports on initiator and target controllers are treated as part of the connecting network rather than as part of the initiator or target function. This may seem counterintuitive, but it is the appropriate functional distinction. Network ports, and the low-level drivers that handle network activity and the formation and recognition of protocol data units (PDUs), have no effect on SCSI storage processes. They have a connecting role, not a storage role. Even when the ports sit on the same HBA or disk drive as the controller, they belong to the network connection rather than to the SCSI logical process.

SCSI RPC Structure

According to the SCSI architecture model, two controllers exchange information in the form of commands and responses. Initiator-target communication follows a remote procedure call (RPC) model: the initiator sends a command along with any data to be delivered and any related execution parameters. The command is addressed to a specific target controller ID. After processing the command, the target returns any requested data, plus details of the command's completion status, including any faults or failures.

Once the command is issued, the initiator and the target disengage to carry on with their own work, so communication between them is asynchronous. When the target has something to report, it notifies the initiator, and the two re-establish contact to manage the data transfer.

SCSI's architecture assumes that the host system multitasks. Likewise, the SCSI communications model assumes that targets may be busy with other work or have several jobs queued, so they cannot always react to new commands immediately.

A command is considered pending once the initiator has finished sending it to its network port. It is complete when the target responds to the initiator, after any accompanying data transfer (for READ or WRITE commands) has been processed. The response is a status message: either a confirmation that the command completed or an indication of an error or failure.
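
The pending/completed lifecycle can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SCSI wire format: the class names, fields, and status strings ("GOOD", "ERROR") are invented for the example, and the "media" is just a dictionary of blocks.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    PENDING = "pending"      # initiator has handed the command to its port
    COMPLETED = "completed"  # target has returned a response


@dataclass
class Command:
    target_id: int           # hypothetical target controller ID
    opcode: str              # "READ" in this toy example
    lba: int                 # starting logical block address
    blocks: int              # number of blocks to transfer
    state: State = State.PENDING


@dataclass
class Response:
    status: str              # "GOOD" or "ERROR" (illustrative values)
    data: bytes = b""        # returned data for READ commands


def target_execute(cmd: Command, media: dict) -> Response:
    """Toy target: serve READs from a dict keyed by block address."""
    if cmd.opcode != "READ":
        return Response(status="ERROR")
    try:
        data = b"".join(media[cmd.lba + i] for i in range(cmd.blocks))
    except KeyError:
        return Response(status="ERROR")
    return Response(status="GOOD", data=data)


def initiator_complete(cmd: Command, resp: Response) -> Response:
    """The command leaves the pending state once the response arrives."""
    cmd.state = State.COMPLETED
    return resp


media = {0: b"AA", 1: b"BB"}
cmd = Command(target_id=3, opcode="READ", lba=0, blocks=2)
resp = initiator_complete(cmd, target_execute(cmd, media))
```

Here the response carries both the requested data and the completion status, which mirrors the RPC-style exchange described above.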

After delivering the command, the initiator disengages from the target so that both are free for other work. As a result, the HBA's device driver handles target responses as interrupts. Given the volume of interrupts that storage I/O generates, and the need to handle them correctly and promptly, it is easy to see why the HBA's device driver must work closely with the operating system kernel. For this reason, it is crucial to verify that the OS is supported by the storage HBA and device driver you intend to use. Unless an OS level is specifically included in the device driver's support list, assume that the HBA and driver will not work with it.

One difficulty is predicting the rate at which target responses generate host I/O interrupts. It depends on the application mix and on target-side variables such as cache size, RAID level, device capabilities, and file system fragmentation. It is exceedingly difficult, if not impossible, to reproduce real-world conditions in a development or test environment, which is one cause of occasional mishaps.

Overlapped I/O

The SCSI communications design allows I/Os to overlap across a connected bus or network. A host initiator can send another command to the same target, or to other targets, before receiving a response to the first command. Responses to the various I/Os arrive in the order in which the targets finish processing them, which is not necessarily the order in which the commands were issued.
For example, an initiator can start a command on a tape drive that takes several minutes to complete and, in the meantime, complete thousands of commands to disk drives that each finish in a fraction of a second. Overlapped I/Os allow for exceptionally efficient SCSI communications and a high degree of parallelism for I/O communications.
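
A small asyncio simulation shows the effect of overlapping. It is a sketch only: the command names and service times are made up, and `asyncio.sleep` stands in for the time a target spends on a command.

```python
import asyncio


async def issue(name: str, service_time: float, completed: list) -> None:
    """Simulate a target taking service_time seconds to finish a command."""
    await asyncio.sleep(service_time)
    completed.append(name)


async def run_overlapped() -> list:
    completed = []
    # Issue every command before waiting for any response.
    tasks = [
        asyncio.create_task(issue("tape-rewind", 0.2, completed)),
        asyncio.create_task(issue("disk-read-1", 0.01, completed)),
        asyncio.create_task(issue("disk-read-2", 0.02, completed)),
    ]
    await asyncio.gather(*tasks)
    return completed


completion_order = asyncio.run(run_overlapped())
```

Although the slow tape command is issued first, it completes last; the disk reads are not held up behind it.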

Asymmetrical Communications in SCSI

Unlike most other data networks, SCSI uses an asymmetrical communications model. The two sides perform different tasks and serve quite different parties. Initiators work on behalf of applications: they issue commands and then wait for responses from targets. Targets work on behalf of storage media: they wait for commands from initiators and then read and write data on the media.

On the target side, the media is the last item in the line. Because media is at one end of the SCSI communication channel and a processor running applications is at the other, the communication channel is asymmetrical. Therefore, even if intelligent processors can be installed in storage subsystems and devices, the SCSI protocol was created to control block storage addresses on unintelligent media. All of this means that an "intelligent" storage controller on the receiving end of the exchange is unable to interact with data objects and is only able to manipulate addresses and commands.

Typical host HBA processes differ from the target processes in storage devices or subsystems. The HBA has to deal with operating system details, while the target must manage its internal storage targets and the details of communicating with several external initiators.

Dual-Mode Controllers

It is possible to build controllers that perform both initiator and target functions. Dual-mode controllers can be used to implement the SCSI EXTENDED COPY command, and virtualization appliances and SAN routers may also use them.
The easiest way to picture a dual-mode controller is a circuit board with one physical port acting as an initiator and another acting as a target. This kind of design might be used in a storage subsystem. Another approach is a single network port that serves as both an initiator and a target, in which case the controller must keep the two roles distinct.

No Guarantee of Ordered Delivery

It is often assumed that, to preserve data integrity, the SCSI protocol guarantees in-order delivery. It does not. Because the parallel SCSI bus always delivered data in order, the SCSI protocol layer never needed to provide ordering itself. In SCSI-3, the SCSI protocol relies on the underlying connection technology to achieve correct ordering. Put differently, because the SCSI protocol has no built-in reordering mechanism, the network is responsible for rearranging transmission frames that arrive out of order. This is why TCP was deemed necessary for iSCSI, which carries SCSI commands and data over IP networking equipment: TCP provides ordered delivery, whereas UDP does not.
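
To make the division of labour concrete, here is a minimal sketch of the kind of reordering a transport layer performs before handing data up to SCSI. It is not TCP or any real protocol: the frame layout (a sequence number starting at 0 plus a payload) is an assumption made for the example.

```python
def deliver_in_order(frames):
    """Yield payloads in sequence order from frames that may arrive shuffled.

    Each frame is a (sequence_number, payload) tuple; numbering starts at 0.
    """
    buffered = {}
    next_seq = 0
    for seq, payload in frames:
        buffered[seq] = payload
        # Release every frame that is now contiguous with what was delivered.
        while next_seq in buffered:
            yield buffered.pop(next_seq)
            next_seq += 1


arrived = [(1, b"BB"), (0, b"AA"), (3, b"DD"), (2, b"CC")]
stream = b"".join(deliver_in_order(arrived))
```

The SCSI layer above only ever sees `stream`, already in order; the buffering and sequencing happen entirely below it.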

SCSI Ports, IDs, and Names

SCSI controllers can communicate over several networks at once through several ports. They can also communicate through several ports attached to a single network. With this much flexibility built into the architecture, there has to be a way of telling ports apart to keep operations safe and reliable.
The SCSI standard provides a way to identify controllers and their ports. Every controller port must have a unique identity, or port ID, on each network to which it is connected. For instance, a port ID on a parallel SCSI bus is an integer between 0 and 15, while in an FC fabric the port ID is a 24-bit network address.

In addition to its port ID, every port is uniquely identified by a port name. In FC, this port name is called the worldwide name (WWN): a 64-bit value, usually written in hexadecimal, that is assigned to each port when the controller is manufactured. Initiator and target controllers in a SAN can be identified and addressed by these worldwide names. Naming in storage networking can get complicated. Besides the port name there is an optional node name, the worldwide node name (WWNN), so the abbreviation WWPN (worldwide port name) is sometimes needed to distinguish the port name from the node name.
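
As a quick illustration of the two identifier sizes, the helpers below render a 64-bit name and a 24-bit FC port ID as hex strings. The function names and the example values are made up, and the colon-separated byte layout is just a commonly seen display convention, not something the text above prescribes.

```python
def format_wwn(value: int) -> str:
    """Render a 64-bit worldwide name as eight colon-separated hex bytes."""
    if not 0 <= value < 2**64:
        raise ValueError("a WWN must fit in 64 bits")
    return ":".join(f"{b:02x}" for b in value.to_bytes(8, "big"))


def format_fc_port_id(value: int) -> str:
    """Render a 24-bit Fibre Channel port ID as six hex digits."""
    if not 0 <= value < 2**24:
        raise ValueError("an FC port ID must fit in 24 bits")
    return f"{value:06x}"


wwpn = format_wwn(0x10000000C9123456)   # made-up example value
port_id = format_fc_port_id(0x010200)   # made-up example value
```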

Using a node name to uniquely identify a system or subsystem raises a few awkward problems. For instance, although the purpose of the WWNN is to identify a system, it is supplied by the HBA rather than by the system itself. A system may have several HBAs for different purposes, and those HBAs may carry different WWNNs. In some scenarios it makes sense for all the HBAs in a system to share a single WWNN, while in others distinct WWNNs are preferable.

Before SANs, there was less need for controller naming because the address space of a parallel SCSI bus could accommodate no more than 16 controllers. The parallel SCSI method of physically placing jumpers or setting numerical values clearly could not work in a SAN with millions of addresses.
The urgency of recreating storage configurations quickly and accurately becomes evident when one considers recovering from a disaster such as a widespread power outage. Without a way to uniquely identify storage resources, restoring the logical structure of the SAN quickly and accurately could be very difficult. A persistent way to find and use storage resources is critical. The combination of WWNs and the name services in SAN switches provides this mechanism.

SCSI LUNs

Logical units on SCSI targets provide the processing context that SCSI commands need. A logical unit is a kind of virtual machine that handles SCSI communication on behalf of a target's storage devices, whether those devices are physical or virtual. When a target receives a command, a task router in the target controller routes it to the relevant logical unit.

A logical unit's work is divided between two roles: the device server and the task manager. The device server executes commands from initiators and handles error detection and reporting. The task manager is the logical unit's work scheduler: it decides the order in which queued commands are executed and responds to initiators' requests concerning commands that are still pending.

A target's logical unit can be uniquely identified by its logical unit number (LUN). While we often associate the term "LUN" with physical or virtual storage, a LUN is an access point that allows initiators and targets to communicate commands and status data. Logical units can be thought of as "black box" processors, and the LUN is just a means of identifying SCSI black boxes. Logical units can be accessed by any port on the target using a LUN and are architecturally independent of the target ports. A target may optionally support more LUNs in addition to the minimum requirement of LUN 0. A subsystem may enable the definition of hundreds of LUNs, while a disk drive may only use one.

In a SAN storage subsystem, provisioning storage consists of defining a LUN on a particular target port and then assigning that port/LUN pair to a logical unit. A single logical unit can be represented by multiple LUNs on different ports. For example, a logical unit might be accessed as LUN 1 on port 0 and as LUN 12 on port 1 of the same target. Given how the term "provisioning" has been used in data networking, this storage process has always looked far more like network protocol "binding" than network provisioning. Storage provisioning can therefore be described as the act of binding a virtual storage machine to a specific set of storage resources and making them available through one or more LUN IDs across one or more target ports.
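
The binding idea can be sketched as a lookup table keyed by target port and LUN. Everything here is hypothetical (the `lun_map` table, the `bind` and `route` helpers, and the logical unit names); real subsystems expose this through their own management interfaces.

```python
# Hypothetical mapping table: (target port, LUN) -> logical unit name.
lun_map = {}


def bind(port: int, lun: int, logical_unit: str) -> None:
    """Expose a logical unit as a given LUN on a given target port."""
    if (port, lun) in lun_map:
        raise ValueError(f"LUN {lun} is already defined on port {port}")
    lun_map[(port, lun)] = logical_unit


def route(port: int, lun: int) -> str:
    """Task-router style lookup: find the logical unit for an incoming command."""
    return lun_map[(port, lun)]


bind(port=0, lun=1, logical_unit="lu-database")
bind(port=1, lun=12, logical_unit="lu-database")  # same unit, second path
bind(port=0, lun=0, logical_unit="lu-control")
```

Both `route(0, 1)` and `route(1, 12)` resolve to the same logical unit, which is the multiple-LUNs-per-logical-unit case described above.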

Tasks, Task Sets, and Tagged Tasks

Logical units manage commands as tasks. A task set is a collection of one or more queues containing SCSI tasks. The device server executes tasks, and the task manager manages the tasks inside task sets. By using multiple queues, SCSI can optimize storage I/O while also allowing for prioritization. SCSI queue management thus offers a multitasking environment for storage I/O processes, which is what multiprocessing servers demand.

SCSI's queuing capabilities are far more flexible than simple first-in, first-out (FIFO) or last-in, first-out (LIFO) queues, where low-priority applications can create I/O bottlenecks. Instead, SCSI queuing can serve the I/O needs of many applications at once. This is one way the SCSI architecture supports high-throughput, multiprocessing computer environments.

Tasks are also classified as tagged or untagged. With tagging, an initiator can send a set of commands to a logical unit, giving each command its own identifier (the tag). Tasks for tagged commands are grouped together in a task set, and the task manager can change the order in which they are executed. As each task finishes, the logical unit responds to the initiator and uses the tag to identify which task has completed. Tagging is used by database applications and other multitasking applications that issue many independent I/Os.
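
The role of the tag can be shown with a toy initiator that tracks outstanding commands by tag. The class and its methods are invented for this sketch, and the simple counter used to generate tags is an assumption; all that matters is that each outstanding command has a distinct tag.

```python
import itertools


class TaggedInitiator:
    """Track outstanding commands by tag so completions can arrive in any order."""

    def __init__(self):
        self._tags = itertools.count(1)
        self.outstanding = {}

    def send(self, description: str) -> int:
        tag = next(self._tags)
        self.outstanding[tag] = description
        return tag

    def complete(self, tag: int) -> str:
        # The tag in the response identifies which task has finished.
        return self.outstanding.pop(tag)


initiator = TaggedInitiator()
t1 = initiator.send("READ block 900")
t2 = initiator.send("READ block 10")
t3 = initiator.send("WRITE block 11")

# Suppose the task manager finishes the tasks in a different order.
finished = [initiator.complete(tag) for tag in (t2, t3, t1)]
```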

SCSI Nexus and Connection Relationships

The nexus object describes the initiator/storage communication relationship.

There are three nexus objects in SCSI:

Initiator/target (I_T nexus)
Initiator/target/LUN (I_T_L nexus)
Initiator/target/LUN/tag (I_T_L_Q nexus)

The number of commands that can be pending at once depends on the type of nexus in use. An I_T nexus allows only a single pending command between an initiator and a particular target. An I_T_L nexus allows a single pending command between an initiator and a particular logical unit. An I_T_L_Q nexus, which requires tagged commands, allows many commands to be pending at once.
The SCSI nexus also specifies the SCSI path elements used for storage I/O processes, including multipathing. This definition of a path may be unfamiliar to networking professionals who are used to thinking about paths through TCP/IP networks. The underlying connecting bus or network is transparent to SCSI logical processes, so it is not part of the SCSI path. The SCSI nexus entities therefore define the entire SCSI path.
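
One way to picture the three nexus types is as progressively longer tuples, with at most one pending command per distinct nexus. This is a simplified model for illustration only: the `Nexus` type, the `submit` helper, and the initiator and target names are assumptions, and real targets apply more detailed rules.

```python
from typing import NamedTuple, Optional


class Nexus(NamedTuple):
    initiator: str
    target: str
    lun: Optional[int] = None
    tag: Optional[int] = None

    @property
    def kind(self) -> str:
        if self.lun is None:
            return "I_T"
        if self.tag is None:
            return "I_T_L"
        return "I_T_L_Q"


pending = set()


def submit(nexus: Nexus) -> bool:
    """Accept a command unless one is already pending on the same nexus."""
    if nexus in pending:
        return False
    pending.add(nexus)
    return True


first = submit(Nexus("hba0", "array1", lun=4))
second = submit(Nexus("hba0", "array1", lun=4))   # same I_T_L nexus: refused
tagged = [submit(Nexus("hba0", "array1", lun=4, tag=t)) for t in (1, 2, 3)]
```

Adding a tag makes each nexus distinct, which is why tagged commands allow many I/Os to be pending against the same logical unit.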

Tagged Command Queuing

The most essential aspect of tagging in SCSI is tagged command queuing (TCQ), which allows the logical unit's task manager to reschedule jobs to improve the performance of a storage device or subsystem.
Tagged command queuing was created to improve the performance of mechanical components in disk drives, specifically disk arms and actuators. The main idea is to reorganize a set of commands to lower the overall latency of finding tracks on disk platters.

Assume that a task set contains many tagged tasks, each reading or writing data on tracks scattered randomly across the disk media. Without the ability to reorder tasks, the seek latency would match the drive's average seek time. With command queuing, the tasks can be ordered so that the actuator moves as little as possible, going from one task's track to the track of its nearest neighbour.
Tagged command queuing thus reduces the seek time latency for disk I/O operations. The degree of improvement is dependent on the queue depth. In general, higher queue depths result in shorter average seek times and better storage performance.
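
The nearest-neighbour idea can be tried out with a short sketch that compares actuator travel for the same queue in arrival order and in reordered form. The track numbers and starting position are made up, and this greedy ordering is only one possible policy; real drives also weigh factors such as rotational position that are ignored here.

```python
def head_travel(start: int, tracks: list) -> int:
    """Total number of tracks the actuator crosses when visiting tracks in order."""
    travel, position = 0, start
    for track in tracks:
        travel += abs(track - position)
        position = track
    return travel


def nearest_first(start: int, tracks: list) -> list:
    """Greedy reordering: always service the closest remaining track next."""
    remaining, position, order = list(tracks), start, []
    while remaining:
        nxt = min(remaining, key=lambda t: abs(t - position))
        remaining.remove(nxt)
        order.append(nxt)
        position = nxt
    return order


queue = [98, 183, 37, 122, 14, 124, 65, 67]   # made-up track numbers
fifo_cost = head_travel(53, queue)                       # 640 tracks
tcq_cost = head_travel(53, nearest_first(53, queue))     # 236 tracks
```

With only eight queued tasks the reordered schedule already cuts actuator travel by more than half, and a deeper queue gives the scheduler more candidates to choose from at each step.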

Ref: https://flylib.com/books/en/2.393.1.45/1/

Debug School