Inter-Integrated Circuit

The I2C is a multi-leader, multi-follower, serial communication protocol between digital devices. In this section, we will cover the working principles of the I2C in terms of the data format, connection diagram, and transmission and reception operations.

Data Format

In the I2C, every follower has a unique address, and data transfer starts with this address. Once the addressed follower wakes up and acknowledges the leader, the transfer continues either with a pointer/address followed by data or directly with data, depending on the protocol. The address of a follower is usually seven bits long, although ten-bit addresses are also in use. Addresses quoted as eight bits are normally the seven-bit address with the read/write bit appended. Independent of the address, pointer, and data size, the transfer is performed in eight-bit packages, each followed by a one-bit acknowledge value. The first package carries the seven-bit address together with the read/write bit; the following packages carry the pointer and data. The receiver merges the packages to extract the data.
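The framing described above can be sketched in a few lines of Python. This is only an illustration: the function names and the example address 0x48 are hypothetical, and sending the most significant package first is an assumption, since the package order of multi-byte values depends on the device.

```python
def address_byte(address, read):
    """Pack a 7-bit follower address and the read/write bit into one package."""
    if not 0 <= address <= 0x7F:
        raise ValueError("a 7-bit address must lie in 0..0x7F")
    # The address occupies the upper seven bits; bit 0 is 1 for read, 0 for write.
    return (address << 1) | (1 if read else 0)


def split_into_packages(value, n_packages):
    """Split an integer into eight-bit packages (assumed order: MSB first)."""
    return [(value >> (8 * i)) & 0xFF for i in reversed(range(n_packages))]


def merge_packages(packages):
    """Merge received eight-bit packages back into one integer."""
    result = 0
    for package in packages:
        result = (result << 8) | package
    return result


# Hypothetical follower at address 0x48, leader writes a 16-bit value.
first_package = address_byte(0x48, read=False)
payload = split_into_packages(0x1234, 2)
```

The acknowledge bit is not part of any package here; it is the extra ninth bit that the receiver places on the bus after each package.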

Connection Diagram

The I2C data bus has two wires called the serial data line (SDA) and the serial clock line (SCL). In addition, all connected devices need a common ground and power line, so the I2C needs four wires in total. The connection diagram of a generic I2C bus is presented in the accompanying figure. The SDA and SCL are bidirectional lines. Both lines are connected to VDD through pull-up resistors, which means they are at logic level 1 when idle. Unlike the SPI, every follower has a unique address in the I2C. Therefore, the leader can select a follower over the serial data line without a separate select signal. Thus, other than the power and ground signals, the I2C bus has only two wires connected to all devices, which saves pins compared to the SPI.
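Two consequences of this wiring can be modeled with a small sketch. The function names are hypothetical. The SPI count assumes the conventional layout of three shared lines plus one select line per follower, which is an assumption and not something stated above.

```python
def bus_level(pulling_low):
    """Level of a pulled-up line: 0 if any device pulls it low, otherwise 1.

    `pulling_low` holds one boolean per connected device.
    """
    return 0 if any(pulling_low) else 1


def i2c_signal_wires(n_followers):
    """SDA and SCL are shared by all devices, whatever their number."""
    return 2


def spi_signal_wires(n_followers):
    """Assumed SPI layout: clock, two data lines, one select per follower."""
    return 3 + n_followers


idle = bus_level([False, False, False])   # nobody drives the line
driven = bus_level([False, True, False])  # one device pulls the line low
```

Because a device can only pull a line low or release it, several devices can share SDA and SCL without ever driving opposite levels against each other.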

Transmission and Reception Operations

As mentioned in the Data Format section, data in I2C communication is carried in eight-bit packages. The leader starts the transmission by sending the follower address and the read/write decision bit. The follower with this address on the network wakes up and acknowledges to the leader that it is alive and ready to talk. Then, depending on the decision bit, the leader writes data to or reads data from the follower. The leader ends the talk by sending a stop signal.

The timing of the complete I2C communication is as follows. The leader starts the transmission with a logic level 1 to 0 transition on SDA while SCL stays at logic level 1. We call this the start signal. The transmission ends with a logic level 0 to 1 transition on SDA while SCL is at logic level 1. We call this the stop signal. After the start signal, the leader first transmits the address of the follower. Then the R/W signal is sent, which tells the follower whether the leader is going to read data from or write data to the follower. Next, the leader starts sending or receiving data (with the MSB first), with each package followed by an acknowledge signal. There is no restriction on the number of packages transmitted in succession. The communication continues until the leader sends the stop signal. Note that during the acknowledge signal the transmitter releases the SDA line and the receiver pulls the line to logic level 0 while SCL is at logic level 1.
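The start signal, the stop signal, and the acknowledge bit can be made concrete with a small simulation. The sketch below is not a driver. It builds a list of (SCL, SDA) samples for a write transfer and decodes it again. The names encode_write and decode are hypothetical, and the receiver is assumed to acknowledge every package.

```python
def encode_write(address, data):
    """Return (scl, sda) samples for a write to a 7-bit address."""
    samples = [(1, 1)]   # idle: both lines pulled up
    samples.append((1, 0))   # start: SDA falls while SCL is high

    def clock_bit(bit):
        samples.append((0, bit))   # SDA changes only while SCL is low
        samples.append((1, bit))   # the bit is sampled while SCL is high

    for byte in [(address << 1) | 0] + list(data):   # R/W bit 0 means write
        for i in range(7, -1, -1):   # MSB first
            clock_bit((byte >> i) & 1)
        clock_bit(0)   # ninth clock: receiver pulls SDA low (acknowledge)

    samples.append((0, 0))   # hold SDA low while SCL is low
    samples.append((1, 0))   # SCL goes high first
    samples.append((1, 1))   # stop: SDA rises while SCL is high
    return samples


def decode(samples):
    """Recover the start signal, packages with acknowledge, and stop signal."""
    events, bits = [], []
    prev_scl, prev_sda = samples[0]
    for scl, sda in samples[1:]:
        if prev_scl == 1 and scl == 1 and prev_sda == 1 and sda == 0:
            events.append("START")
            bits = []
        elif prev_scl == 1 and scl == 1 and prev_sda == 0 and sda == 1:
            events.append("STOP")
            bits = []   # discard the partial bit clocked before the stop
        elif prev_scl == 0 and scl == 1:   # rising SCL edge: sample SDA
            bits.append(sda)
            if len(bits) == 9:
                byte = int("".join(str(b) for b in bits[:8]), 2)
                events.append((byte, "ACK" if bits[8] == 0 else "NACK"))
                bits = []
        prev_scl, prev_sda = scl, sda
    return events


events = decode(encode_write(0x48, [0xA5]))
```

For the hypothetical address 0x48 and data package 0xA5, the decoded list holds the start signal, the address package 0x90 with its acknowledge, the data package with its acknowledge, and the stop signal.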



How does a GPU work?

A graphics processing unit (GPU) is a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mobile phones, personal computers, workstations, and game consoles.

A really comprehensive lecture from Prof. Mutlu:

GPUs and GPGPU Programming (pdf)

What is a shader?

In computer graphics, a shader is a type of computer program originally used for shading in 3D scenes (the production of appropriate levels of light, darkness, and color in a rendered image). Shaders now perform a variety of specialized functions in various fields within the category of computer graphics special effects, do video post-processing unrelated to shading, or even perform functions unrelated to graphics at all.
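To give a feel for the original shading use, here is a minimal sketch of what a shader computes for one pixel, written as an ordinary Python function. It uses simple diffuse (Lambert) lighting as an illustrative choice. Real shaders are written in dedicated shading languages and run on the GPU, and the function name and inputs here are hypothetical.

```python
def lambert_shader(normal, light_dir, base_color):
    """Scale a base color by how directly the surface faces the light.

    `normal` and `light_dir` are assumed to be unit-length 3D vectors.
    """
    n_dot_l = sum(n * l for n, l in zip(normal, light_dir))
    intensity = max(0.0, n_dot_l)   # surfaces facing away receive no light
    return tuple(c * intensity for c in base_color)


lit = lambert_shader((0, 0, 1), (0, 0, 1), (1.0, 0.5, 0.0))    # facing the light
dark = lambert_shader((0, 0, 1), (0, 0, -1), (1.0, 0.5, 0.0))  # facing away
```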

What is Per-Tile Rasterization (Renderer)?

Rasterization: The process of determining which pixels a given primitive touches. Rasterization and pixel coloring are performed on a per-tile basis with the following steps:

  1. When a tile operation begins, the corresponding tile list is retrieved from the Parameter Buffer (PB) to identify the screen-space primitive data that needs to be fetched.
  2. The Image Synthesis Processor (ISP) fetches the primitive data and performs Hidden Surface Removal (HSR), along with depth and stencil tests. The ISP only fetches screen-space position data for the geometry within the tile.
  3. The Tag Buffer contains information about which triangle is on the top for each pixel.
  4. The Texture and Shading Processor (TSP) then applies coloring operations, like fragment shaders, to the visible pixels.
  5. Alpha testing and, subsequently, alpha blending are then carried out.
  6. Once the tile’s render is complete, the color data is written to the frame buffer in system memory.

This process is repeated until all tiles have been processed and the frame buffer is complete.
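The per-tile steps above can be sketched as follows. This is a toy model, not the hardware algorithm: primitives are axis-aligned rectangles with a constant depth instead of triangles, only opaque geometry is handled (no alpha test or blending), and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prim:
    x0: int       # covered pixels satisfy x0 <= x < x1
    y0: int       # and y0 <= y < y1
    x1: int
    y1: int
    depth: float  # smaller means closer to the viewer
    color: str


def render_tile(tile_x, tile_y, tile_size, tile_list, prims):
    """Render one tile and return a {(x, y): color} mapping."""
    depth_buffer = {}
    tag_buffer = {}   # index of the closest primitive for each pixel
    # Hidden surface removal: only positions and depths are needed here.
    for idx in tile_list:
        p = prims[idx]
        for y in range(max(p.y0, tile_y), min(p.y1, tile_y + tile_size)):
            for x in range(max(p.x0, tile_x), min(p.x1, tile_x + tile_size)):
                if p.depth < depth_buffer.get((x, y), float("inf")):
                    depth_buffer[(x, y)] = p.depth
                    tag_buffer[(x, y)] = idx
    # Shading: exactly one coloring operation per visible pixel.
    return {pixel: prims[idx].color for pixel, idx in tag_buffer.items()}


prims = [Prim(0, 0, 4, 4, 0.9, "car"), Prim(0, 0, 4, 4, 0.1, "building")]
tile = render_tile(0, 0, 4, [0, 1], prims)
```

In this example the car lies behind the building everywhere, so its color is never computed for any pixel of the tile.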

What is Vertex Processing (Tiler)?

Every frame, the hardware processes submitted geometry data with the following steps:

  1. The execution of application-defined transformations, such as vertex shaders (Vertex Processing).
  2. The resulting data is then converted to screen-space (Clip, Project, and Cull).
  3. The Tile Accelerator (TA) then determines which tiles contain each transformed primitive (Tiling).
  4. Per-tile lists are then updated to track the primitives which fall within the bounds of each tile.

Each tile in the list contains primitive lists which contain pointers to the transformed vertex data. The tile list and the transformed vertex data are both stored in an intermediate store called the Parameter Buffer (PB). This store resides in system memory, and is mostly managed by the hardware. It contains all information needed to render the tiles.
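The tiling step can be sketched as a binning pass over screen-space triangles. The sketch assumes a conservative bounding-box test, which may list a triangle in a tile it does not actually touch. The text above does not say how the hardware performs this test, and the names are hypothetical.

```python
import math


def bin_primitives(triangles, tile_size, width, height):
    """Return {(tile_x, tile_y): [primitive indices]} for screen-space triangles.

    Each triangle is a list of three (x, y) vertices in pixel coordinates.
    """
    tiles_x = (width + tile_size - 1) // tile_size
    tiles_y = (height + tile_size - 1) // tile_size
    tile_lists = {}
    for idx, tri in enumerate(triangles):
        xs = [v[0] for v in tri]
        ys = [v[1] for v in tri]
        # Range of tiles overlapped by the bounding box, clamped to the screen.
        tx0 = max(0, math.floor(min(xs)) // tile_size)
        tx1 = min(tiles_x - 1, math.floor(max(xs)) // tile_size)
        ty0 = max(0, math.floor(min(ys)) // tile_size)
        ty1 = min(tiles_y - 1, math.floor(max(ys)) // tile_size)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                tile_lists.setdefault((tx, ty), []).append(idx)
    return tile_lists


triangles = [
    [(2, 2), (10, 2), (2, 10)],       # fits inside one tile
    [(30, 30), (40, 30), (30, 40)],   # straddles four tiles
]
tile_lists = bin_primitives(triangles, tile_size=32, width=64, height=64)
```

The per-tile lists store only indices, mirroring how the tile lists in the Parameter Buffer point to the transformed vertex data instead of copying it.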

What is Tile-Based Deferred Rendering (TBDR)?

The usual rendering technique on most GPUs is called Immediate Mode Rendering (IMR), in which geometry is sent to the GPU and drawn straight away. This simple architecture is relatively inefficient, resulting in wasted processing power and memory bandwidth. Pixels are often still rendered despite never being visible on the screen, such as when a car is completely obscured by a closer building.

The Tile-Based Deferred Rendering architecture works in a much more intelligent way. It captures the whole scene before starting to render, so occluded pixels can be identified and rejected before they are processed. The hardware splits the geometry data into small rectangular regions called “tiles”, each of which is processed as one image. Every tile is rasterized and processed separately, and because each tile is so small, all of its data can be kept in very fast on-chip memory.

Deferred rendering means that the architecture will defer all texturing and shading operations until all objects have been tested for visibility. This significantly reduces system memory bandwidth requirements, which in turn increases performance and reduces power requirements. This is a critical advantage for phones, tablets, and other devices where battery life makes all the difference.
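The saving from deferring shading can be illustrated by counting shading operations for a single pixel. This is a toy model under stated assumptions: all fragments are opaque, the immediate-mode renderer performs a depth test before shading each fragment in submission order, and the function names are hypothetical.

```python
def imr_shaded(fragment_depths):
    """Count fragments shaded when each one is drawn as soon as it arrives."""
    shaded = 0
    nearest = float("inf")
    for depth in fragment_depths:
        if depth < nearest:   # passes the depth test against earlier draws
            nearest = depth
            shaded += 1       # shaded now, even if it is covered later
    return shaded


def tbdr_shaded(fragment_depths):
    """Count fragments shaded when visibility is resolved before shading."""
    return 1 if fragment_depths else 0


# A far car (depth 0.9) is submitted before the closer building (depth 0.1).
car_then_building = [0.9, 0.1]
imr_count = imr_shaded(car_then_building)
tbdr_count = tbdr_shaded(car_then_building)
```

With this submission order the immediate-mode model shades the car fragment and then the building fragment, while the deferred model shades only the building. If the building were submitted first, both models would shade a single fragment, so the benefit depends on draw order and on how much geometry overlaps.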