Journal of Signal Processing Systems, Volume 88, Issue 1, pp 67–81

Reconfigurable Parser Architecture Design with Microprogrammed Controller for Multiple Purposes

  • Gwo Giun (Chris) Lee
  • Chun-Fu (Richard) Chen
  • Ching-Jui Hsiao

Abstract

This paper exploits the flexibility of a microprogrammed controller with reloadable microcode to develop a reconfigurable parser for multiple purposes. Because parsing inherently involves feedback control, we take the microprogrammed controller into account in the control-dataflow and propose a reconfigurable parser that extracts control commonalities to form shared microinstructions, so that the architecture alleviates the cost of switching control signals between distinct purposes. We use reconfigurable video coding as a case study to show the advantages of a reconfigurable parser with a microprogrammed controller over a finite state machine (FSM) based controller. Using TSMC 0.18 μm CMOS technology at a 108 MHz operating frequency, the design reduces gate count by 8.93 % and doubles the throughput rate in comparison with individually implemented FSM-based controllers. We demonstrate that the microprogrammed controller is a promising trend in flexible architecture design for multiple purposes. Owing to the high proportion of shared microinstructions, a higher saving ratio can be expected when more purposes, e.g., more video coding standards, are supported by the proposed reconfigurable parser.

Keywords

Parser Reconfigurable architecture Microprogrammed controller 

1 Introduction

Reconfigurability plays an important role in multi-purpose architecture design since it aims to flexibly support multiple functionalities without redesigning common modules. Figure 1 shows the tradeoff between flexibility and performance per area among typical architectures. The most flexible architecture, the General Purpose CPU (GPCPU), can be reprogrammed easily, but its performance per area is usually worse than that of other architectures; on the other hand, an Application-Specific Integrated Circuit (ASIC) reaches the best performance per area, but designers can hardly modify its functionality after fabrication. Owing to the high demand for interoperable multi-purpose architectures, reconfigurable architecture is an emerging design trend that flexibly supports multiple purposes with relatively high performance per area.
Figure 1

The characteristics of different platforms.

Reconfigurable architectures usually focus on datapath design [1, 2, 3]; those works accomplish multi-functionality by reconfiguring the datapath to reduce architectural cost while retaining the same throughput as individual implementations. Nonetheless, if we extend reconfigurability to the controller, such architectures become more comprehensive, since they not only change the datapath dynamically at run time but also reconfigure the control signals. Control units are usually divided into combinational and sequential control units, and sequential control units are further categorized into Finite State Machine (FSM) based and microprogrammed controllers [4]. An FSM changes state and generates control signals according to the current state and input. A Moore machine changes state according to a next-state function, which is usually implemented in either combinational logic or a sequencer (or microsequencer), and outputs control signals according to the current state. The microprogrammed controller, in contrast, executes a set of predefined microinstructions stored in microcode memory, and each microinstruction contains the control signals, including those for generating the address of the next microinstruction. Hence, the microprogrammed controller can be separated into two parts: the microsequencer, which generates the address of the next microinstruction, and the microcode memory, which stores the microinstructions. We can reload different microcodes into the memory to reconfigure the control unit, as shown in Fig. 2. Therefore, as long as the microinstructions are well defined, the microprogrammed controller can be reconfigured for any supported application without modifying the hardware. The Moore machine, on the other hand, outputs control signals through hardcoded combinational logic, which offers almost no flexibility. Moreover, the resource cost and timing requirement of a microprogrammed controller are more predictable than those of an FSM-based design, since the microcode is usually stored in memory, whose timing is determined by the memory technology; hence, designers can more easily balance the other modules against the microprogrammed controller.
Figure 2

Reloading microcodes to configure the control signals of the microsequencer.
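
The split between a fixed microsequencer and a reloadable microcode memory can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, the two-entry toy microcodes `MODE_A` and `MODE_B`, and the representation of the dispatch rule as a Python function are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class MicroInstruction:
    """One word of microcode memory (fields are illustrative assumptions)."""
    control_signals: int                 # bits driven onto the datapath
    next_address: Callable[[int], int]   # dispatch rule: condition -> next address


class MicroprogrammedController:
    """A fixed microsequencer plus a reloadable microcode memory."""

    def __init__(self) -> None:
        self.microcode: List[MicroInstruction] = []
        self.address = 0

    def load_microcode(self, microcode: List[MicroInstruction]) -> None:
        # Reconfiguration only replaces memory contents; the sequencer is unchanged.
        self.microcode = list(microcode)
        self.address = 0

    def step(self, condition: int) -> int:
        instruction = self.microcode[self.address]
        self.address = instruction.next_address(condition)
        return instruction.control_signals


# Two hypothetical microcodes standing in for two operating modes.
MODE_A = [
    MicroInstruction(0b01, lambda c: 1),
    MicroInstruction(0b10, lambda c: 0 if c else 1),
]
MODE_B = [
    MicroInstruction(0b11, lambda c: 1),
    MicroInstruction(0b00, lambda c: 0),
]

controller = MicroprogrammedController()
controller.load_microcode(MODE_A)
trace_a = [controller.step(c) for c in (0, 0, 1, 0)]
controller.load_microcode(MODE_B)
trace_b = [controller.step(c) for c in (0, 0, 0)]
```

The same `MicroprogrammedController` object produces different control-signal sequences after `load_microcode` is called, which is the property the paragraph above attributes to reloadable microcode.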

To justify the capability of the reconfigurable control unit, we choose video coding standards as our case study, since the Moving Picture Experts Group (MPEG), established by the International Organization for Standardization (ISO), has developed several video coding standards, such as MPEG-1, MPEG-2 [5], MPEG-4 Advanced Video Coding (AVC)/H.264 [6], and High Efficiency Video Coding (HEVC) [7], and these international standards are widely used in different applications. For example, MPEG-1 is applied to digital storage media, such as CD-ROM, and to computer/telecommunication networks. DVD, digital TV, and HDTV, which are widespread in daily life, rely on MPEG-2; AVC/H.264 covers digital video broadcasting, IPTV, etc.; HEVC provides higher compression capability than AVC/H.264, but it is still less popular than AVC/H.264 in commercial products. On the other hand, Reconfigurable Video Coding (RVC) [8, 9] creates a flexible framework for existing and emerging video coding standards. RVC aims to resolve the interoperability problem among multiple video coding standards and lowers the granularity to extract the commonalities among supported standards, thereby alleviating implementation cost.

The flexibility of the microprogrammed controller has already been exploited in other research areas, such as memory built-in self-test [10] and FPGA synthesis [11]. Kannangara et al. [12] proposed a decoder description syntax in the bitstream to explicitly configure function units at the decoder. That work still focuses on reconfiguring the datapath rather than the control part; moreover, the approach adds non-standard syntax to the bitstream, which restricts the usability of the encoded bitstream since other decoders cannot decode it. Lucarz et al. [13] systematically synthesized a new codec from existing tool sets together with the corresponding parser for the decoder description; however, that work did not address interoperability between different standards or support more than one standard simultaneously. Other papers discuss run-time reconfiguration, and current RVC activities are also moving toward parser reconfigurability [14, 15, 16]; however, a microprogrammed controller has not yet been deployed for parsing control signals. The novelty of this paper lies in reconfiguring different standards inside one reconfigurable parser. A parser that supports multiple coding standards involves complicated control signals for the different standards and is therefore a good design example of reconfigurable control architecture; we thus select the reconfigurable parser as a case study to realize this idea. The proposed parser supports 1920 × 1088@30fps for MPEG-2 and 1920 × 1088@60fps for AVC/H.264 in Context-Adaptive Variable Length Coding (CAVLC) mode.
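
To give a feel for what these specifications imply, the following back-of-the-envelope Python sketch converts them into a cycle budget per macroblock. It assumes 16 × 16 macroblocks and the 108 MHz operating frequency reported in the abstract, and it assumes the whole clock budget is available to the parser; the function name is hypothetical and the numbers are not results from the paper.

```python
CLOCK_HZ = 108_000_000   # operating frequency reported in the abstract
MB_SIZE = 16             # macroblock width and height in pixels


def cycles_per_macroblock(width: int, height: int, fps: int,
                          clock_hz: int = CLOCK_HZ) -> float:
    """Average number of clock cycles available per macroblock."""
    mbs_per_frame = (width // MB_SIZE) * (height // MB_SIZE)
    mbs_per_second = mbs_per_frame * fps
    return clock_hz / mbs_per_second


mbs_per_frame = (1920 // MB_SIZE) * (1088 // MB_SIZE)    # 120 * 68 macroblocks
mpeg2_budget = cycles_per_macroblock(1920, 1088, 30)     # MPEG-2 mode
avc_budget = cycles_per_macroblock(1920, 1088, 60)       # AVC/H.264 mode
```

Under these assumptions the AVC/H.264 mode has half the per-macroblock cycle budget of the MPEG-2 mode, which is consistent with the later observation that MPEG-2 mode can tolerate bubbles in its schedule.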

This paper is organized as follows. Section 2 illustrates the parsing flows in MPEG-2 and AVC/H.264 to clarify the supported functionalities, and Section 3 presents the design of the reconfigurable parser with microprogrammed controller. Section 4 compares the proposed reconfigurable parser with microprogrammed controller against FSM-based controllers. Lastly, Section 5 concludes our work.

2 Introduction to Parsing Flows in Supported Functionalities of Reconfigurable Parser

Based on the supported functionalities, we extract the commonalities in both datapath and control. The proposed reconfigurable parser not only parses the syntax elements from the bitstream but also decodes the semantics of the syntax elements at Macroblock (MB) level according to the specified syntax structure. Whether a syntax element exists in the bitstream depends on the condition imposed on it. The condition is expressed in terms of previously parsed syntax elements or decoded semantics of syntax elements at or above MB level. Since this paper focuses on the reconfigurable parser, we rely on our previous work [17, 18] on the Variable Length Coding (VLC) decoder in MPEG-2 and the Context-Adaptive Variable Length Coding (CAVLC) decoder in AVC/H.264 to cooperate with the reconfigurable parser. Furthermore, since the VLC and CAVLC decoders perform regular computations, i.e., they belong to the datapath rather than the controller of the reconfigurable parser, they are isolated as individual modules. We do not discuss the context-adaptive binary arithmetic coding decoder in this paper.

The parsing flow of the reconfigurable parser in MPEG-2 mode is shown in Fig. 3a. The reconfigurable parser parses the existing syntax elements for each MB within the current slice until the end of the slice. If the condition imposed on the current syntax element is true, the parsing process associated with this syntax element is activated. The parsing process is either fixed length code parsing or variable length code parsing. If the condition is false, the reconfigurable parser judges whether the transform coefficient decoding, which belongs to the VLC decoder, should be activated. If it is activated, the VLC decoder decodes the transform coefficients, after which the reconfigurable parser stops parsing. Otherwise, the reconfigurable parser jumps directly to the condition for the existence of the next syntax element.
Figure 3

Parsing flow of reconfigurable parser in (a) MPEG-2 mode and (b) AVC/H.264 mode.

The parsing flow of the reconfigurable parser in AVC/H.264 mode is shown in Fig. 3b. The parsing flows in AVC/H.264 and MPEG-2 modes are similar, but the syntax structure and the conditions imposed on the corresponding syntax elements are different. In addition, the parsing processes in AVC/H.264 are Exp-Golomb code parsing and fixed length code parsing. The transform coefficient decoding belongs to the CAVLC decoder, which is an individual module outside the reconfigurable parser. The CAVLC decoder parses the syntax elements at block level and decodes the semantics of those syntax elements that are required for the decoding process of transform coefficients. Unlike the VLC decoder, which parses both syntax elements and transform coefficients, the CAVLC decoder only parses the transform coefficients. Consequently, based on the parsing flows in both MPEG-2 and AVC/H.264, we study the control-dataflow at a lower level to extract the commonalities and similarities needed to realize the reconfigurable parser.
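
The common shape of the two parsing flows can be sketched in Python as a loop over a syntax structure, where each element carries its existence condition and its parsing routine. This is only an illustrative reading of Fig. 3 as described in the text: the type `SyntaxElement`, the callbacks `coeff_decode_due` and `decode_coefficients`, and the toy element names and values are all hypothetical.

```python
from typing import Callable, Dict, List, NamedTuple

Parsed = Dict[str, int]


class SyntaxElement(NamedTuple):
    name: str
    exists: Callable[[Parsed], bool]   # condition imposed on the syntax element
    parse: Callable[[], int]           # fixed-length or variable-length parsing


def parse_macroblock(
    elements: List[SyntaxElement],
    coeff_decode_due: Callable[[str, Parsed], bool],
    decode_coefficients: Callable[[Parsed], None],
) -> Parsed:
    """Walk the syntax structure of one macroblock."""
    parsed: Parsed = {}
    for element in elements:
        if element.exists(parsed):
            parsed[element.name] = element.parse()
        elif coeff_decode_due(element.name, parsed):
            # Hand over to the entropy decoder, then stop parsing this MB.
            decode_coefficients(parsed)
            break
        # Otherwise fall through to the condition of the next syntax element.
    return parsed


# Toy syntax structure with made-up element names and values.
values = iter([0])
coefficient_calls: List[Parsed] = []
toy_elements = [
    SyntaxElement("type", lambda p: True, lambda: next(values)),
    SyntaxElement("extra", lambda p: p["type"] == 1, lambda: next(values)),
    SyntaxElement("coeff_gate", lambda p: False, lambda: next(values)),
]
result = parse_macroblock(
    toy_elements,
    coeff_decode_due=lambda name, p: name == "coeff_gate",
    decode_coefficients=lambda p: coefficient_calls.append(dict(p)),
)
```

Switching standards in this sketch would amount to supplying a different `elements` list, while the loop itself stays the same; this mirrors the statement that the flows are similar and only the syntax structure and conditions differ.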

3 Proposed Reconfigurable Parser with Microprogrammed Controller

We use low level dataflow to explore the design space of the reconfigurable parser and thereby determine a suitable data granularity for the reconfigurable architecture. When integrating multiple purposes into one architecture, not only the control unit but also the datapath should be reconfigured where possible. Therefore, after exploring the low level dataflow, we extract the commonalities of certain modules to increase flexibility. Lastly, we develop the reconfigurable parser with microprogrammed controller based on the low level dataflow and the commonalities extracted in the datapath.

3.1 Low Level Dataflow of Reconfigurable Parser

A flowchart is one kind of algorithmic representation and helps designers understand algorithmic functionality or behavior at a high level; however, a flowchart does not reveal the data granularity and conveys less architectural information than a dataflow. Hence, we use dataflow, which reveals algorithm and architecture concurrently, to assist in mapping from algorithm to architecture [19]. Designers can extract abundant architectural information from dataflows at various data granularities. We use dataflows to explore the reconfigurability; in addition, since the reconfigurable parser operates at bit level, we move directly to low level dataflow.

Usually, algorithms without a feedback loop to the control unit are modeled in dataflow. In that case, we schedule the flow of data without considering control signals, and the dataflow subsequently determines the control signals needed to realize it. In the reconfigurable parser, however, the procedure of parsing syntax elements creates a feedback loop, since the condition imposed on a syntax element is associated with previously parsed syntax elements. That is, the design of the control unit also affects the dataflow. Therefore, to design the reconfigurable parser, we should take control flow into consideration while scheduling dataflows.

Figure 4 presents the block diagram of the reconfigurable parser with microprogrammed controller; this block diagram is designed by exploring low level dataflow and is based on our previous reconfigurable VLC decoder [17]. The left part, consisting of the upper and lower registers, is a typical constant output rate architecture that ensures the bits in the upper and lower registers contain at least one symbol for the VLC decoder; hence, we guarantee the output rate of the Exp-Golomb code decoder. This part is also shared by the CAVLC and VLC decoders [17] to reduce the cycles needed for communication between the reconfigurable parser and the CAVLC or VLC decoder. After a syntax element is decoded from the codeword, it flows directly to the syntax element registers. The semantics decoder takes syntax elements from both the registered and non-registered outputs of the shared syntax element registers. The not-exist syntax element assignment handles the case in which a syntax element does not exist in the bitstream and assigns a specific value to that syntax element. The Entropy-SysCtrl FIFO eases communication between the entropy decoder (CAVLC and VLC decoder) and other controllers. According to the operating mode, either the CAVLC decoder or the VLC decoder decodes the block level syntax elements and the transform coefficients for the following procedures, which are inverse quantization and inverse transform.
Figure 4

Architecture of reconfigurable parser with microprogrammed controller.

Because of the feedback control signals in the reconfigurable parser, an efficient pipeline architecture is unavailable. If we insert registers into the datapath, bubbles appear in the dataflow; otherwise, the parsing result would be wrong. Although an efficient pipeline architecture is not available, we can still cut the datapath with registers to shorten the critical path; however, to keep the same throughput, the clock rate must be increased. This is one way to deal with the long critical path in the reconfigurable parser, but not the best one.

To find a more proficient architecture, we add the control signals of the microprogrammed controller into the dataflow, which we call control-dataflow, and use it to search for the optimal architecture. We take advantage of the flexibility of the microprogrammed controller to develop our reconfigurable parser in both the datapath and the control unit. Taking one intra 16 × 16 MB as an example, the control-dataflow corresponding to Fig. 4 in AVC/H.264 and MPEG-2 mode is shown in Tables 1 and 2, respectively. Note that the input of the microprogrammed controller can be either registered or non-registered, since it must accommodate different decoding scenarios. The input selection depends on whether the required syntax element has already been parsed. Consider the control-dataflow of AVC/H.264 in Table 1. At the beginning, the microprogrammed controller determines the control that follows mb_type, CTRL_intra_chroma_pred_mode, which requires the semantics of mb_type as its input while mb_type is still being parsed. The input of the microprogrammed controller is therefore switched to the non-registered one, with the control signal selecting mb_type directly. Next, when the microprogrammed controller determines the control that follows intra_chroma_pred_mode, CTRL_mb_qp_delta, which again requires the semantics of mb_type as its input, the semantics of mb_type have already been decoded and stored in the register, so the input can be switched to the registered one. Combining these two cases, such scheduling is valid and efficient, since no bubble that would decrease the throughput appears in the schedule. In MPEG-2 mode, on the other hand, the lookup table of the VLC decoder is implemented in ROM, so its output becomes available one cycle after the request. Hence, bubbles appear in the control-dataflow (Table 2); however, since the throughput demand in MPEG-2 mode is lower than in AVC/H.264 mode, such scheduling is still acceptable for our specification. Finally, by exploring the control-dataflow, we develop the architecture with both registered and non-registered inputs to the microprogrammed controller.
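
The choice between the registered and non-registered controller input can be modeled with a few lines of Python. This is a behavioral sketch only: the function names, the schedule tuples, and the semantic values 1 and 2 are hypothetical, and real hardware would implement the selection as a multiplexer driven by a microinstruction field.

```python
from typing import Dict, List, Tuple


def controller_input(needed: str, parsed_now: Dict[str, int],
                     registers: Dict[str, int]) -> Tuple[int, str]:
    """Pick the semantics the controller needs and report where it came from."""
    if needed in parsed_now:
        # Value is being produced in this cycle: bypass the register.
        return parsed_now[needed], "non-registered"
    return registers[needed], "registered"


def run(schedule: List[Tuple[str, int, str]]) -> List[str]:
    """Each entry: (element parsed this cycle, its value, semantics needed)."""
    registers: Dict[str, int] = {}
    sources = []
    for parsed_name, parsed_value, needed in schedule:
        parsed_now = {parsed_name: parsed_value}
        _, source = controller_input(needed, parsed_now, registers)
        sources.append(source)
        registers.update(parsed_now)   # registers capture the value at the clock edge
    return sources


# Cycles 0 and 1 of the AVC/H.264 example: both next-control decisions need mb_type.
sources = run([
    ("mb_type", 1, "mb_type"),
    ("intra_chroma_pred_mode", 2, "mb_type"),
])
```

In this model no cycle is lost waiting for a register update, which corresponds to the bubble-free AVC/H.264 schedule described above.
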
Table 1

Control-dataflow of reconfigurable parser in AVC/H.264 mode.

Cycle

Reconfigurable parser

Entropy-SysCtrl FIFO

CAVLD

Bitstream FIFO

Lower register

Upper register

Barrel shifter

Reconfigurable Exp-Golomb code parser & Fixed length code parser & Semantics decoder

Not exist syntax element assignment

Microprogrammed controller

Shared Syntax Element & Semantics Registers

0

Bits_2

Bits_1

Bits_0

Codeword_0

mb_type

coded_block_pattern

transform_size_8x8_flag

CTRL_mb_type

   

1

Bits_2

Bits_1

Bits_0

Codeword_1

intra_chroma_pred_mode

 

CTRL_intra_chroma_pred_mode

mb_type

coded_block_pattern

transform_size_8x8_flag

  

2

Bits_2

Bits_1

Bits_0

Codeword_2

mb_qp_delta

 

CTRL_mb_qp_delta

intra_chroma_pred_mode

mb_type

transform_size_8x8_flag

mb_type

coded_block_pattern

3

Bits_2

Bits_1

Bits_0

Codeword_3

   

mb_qp_delta

intra_chroma_pred_mode

 

4

Bits_2

Bits_1

Bits_0

Codeword_4

    

mb_qp_delta

 

5

Bits_2

Bits_1

Bits_0

Codeword_5

      

 

Table 2

Control-dataflow of reconfigurable parser in MPEG-2 mode.

Cycle

Reconfigurable parser

VLD

Reconfigurable parser

Entropy-SysCtrl FIFO

Bitstream FIFO

Lower register

Upper register

Barrel shifter

Reconfigurable Exp-Golomb code parser & Fixed length code parser & Semantics decoder

Not exist syntax element assignment

Microprogrammed Controller

0

Bits_2

Bits_1

Bits_0

Codeword_0

macroblock_escape

 

CTRL_macroblock_escape

   

1

Bits_2

Bits_1

Bits_0

Codeword_1

 

frame_motion_type

CTRL_macroblock_address_increment

   

2

Bits_2

Bits_1

Bits_0

Codeword_2

Bubble

  

macroblock_address_increment

frame_motion_type

 

3

Bits_2

Bits_1

Bits_0

Codeword_3

  

CTRL_macroblock_type

  

frame_motion_type

4

Bits_2

Bits_1

Bits_0

Codeword_4

Bubble

  

macroblock_type

  

5

Bits_2

Bits_1

Bits_0

Codeword_5

dct_type

 

CTRL_dct_type

 

macroblock_type

macroblock_type

6

Bits_2

Bits_1

Bits_0

Codeword_6

quantiser_scale_code

 

CTRL_quantiser_scale_code

 

dct_type

macroblock_type

7

Bits_2

Bits_1

Bits_0

Codeword_7

  

CTRL_motion_code[0][0][0]

 

quantiser_scale_code

dct_type

8

Bits_2

Bits_1

Bits_0

Codeword_8

Bubble

  

motion_code[0][0][0]

 

quantiser_scale_code

9

Bits_2

Bits_1

Bits_0

Codeword_9

motion_residual[0][0][0]

 

CTRL_motion_residual[0][0][0]

   

10

Bits_2

Bits_1

Bits_0

Codeword_10

  

CTRL_motion_code[0][0][1]

 

motion_residual[0][0][0]

 

11

Bits_2

Bits_1

Bits_0

Codeword_11

Bubble

  

motion_code[0][0][1]

 

motion_code[0][0][0]

motion_residual[0][0][0]

12

Bits_2

Bits_1

Bits_0

Codeword_12

motion_residual[0][0][1]

 

CTRL_motion_residual[0][0][1]

   

13

Bits_2

Bits_1

Bits_0

Codeword_13

marker_bit

 

CTRL_marker_bit

 

motion_residual[0][0][1]

 

14

Bits_2

Bits_1

Bits_0

Codeword_14

  

CTRL_coded_block_pattern

 

coded_block_pattern

motion_code[0][0][1]

motion_residual[0][0][1]

15

Bits_2

Bits_1

Bits_0

Codeword_15

Bubble

  

coded_block_pattern

  

3.2 Commonality Extraction for Reconfigurable Parser

We further design common modules to increase the flexibility and reconfigurability of the proposed architecture, e.g., the reconfigurable Exp-Golomb code decoder, the shared syntax element registers, and the microprogrammed controller. Since the data granularity of the reconfigurable parser is at bit level, which is quite low, we extract the commonalities from the corresponding microarchitectures and discuss each module in the following subsections.

3.2.1 Reconfigurable Exp-Golomb Code Decoder

Because both standards decode syntax elements through Exp-Golomb code, we can merge the two modules into one reconfigurable module to increase flexibility and save architectural cost. According to the behavioral descriptions of Exp-Golomb code in AVC/H.264 and MPEG-2, the microarchitectures for both standards are shown in Fig. 5a–e, where ue represents unsigned Exp-Golomb code, te represents truncated Exp-Golomb code, me represents mapped Exp-Golomb code, and se represents signed Exp-Golomb code. We extract the commonalities by identifying the common parts, colored in gray, in Fig. 5a–e; we then present the reconfigurable Exp-Golomb code decoder, shown in Fig. 5f, which covers all Exp-Golomb code decoding. In the reconfigurable Exp-Golomb code decoder, we add control signals, drawn as dashed lines in Fig. 5f, to flexibly configure the proposed architecture; Table 3 lists the control signals for each operating mode.
Figure 5

Exp-Golomb code decoder for (a)–(d) AVC/H.264, (e) MPEG-2, and (f) the reconfigurable Exp-Golomb code decoder.

Table 3

Control signals for the reconfigurable Exp-Golomb code decoder in different operating modes.

| Operating mode | ExpGolomb_sel0 | ExpGolomb_sel1 | Ref_idx_motion_residual_update_flag |
| --- | --- | --- | --- |
| ue, te or me | 0 | 1 | Don’t care |
| se | 1 | 0 | Don’t care |
| motion_residual | Don’t care | Don’t care | 1 |
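
For readers unfamiliar with the code families named above, the following Python sketch decodes ue, se, and te codewords as they are defined for AVC/H.264. It is a software illustration, not a model of the hardware in Fig. 5: the `BitReader` helper is hypothetical, the mapped variant (me) is omitted because it requires the standard's mapping tables, and no attempt is made to reproduce the ExpGolomb_sel0/ExpGolomb_sel1 control signals.

```python
class BitReader:
    """Reads bits from a string of '0'/'1' characters (illustrative helper)."""

    def __init__(self, bits: str) -> None:
        self.bits = bits
        self.pos = 0

    def read(self, n: int) -> int:
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) != n:
            raise ValueError("bitstream exhausted")
        self.pos += n
        return int(chunk, 2) if n else 0


def decode_ue(reader: BitReader) -> int:
    """Unsigned Exp-Golomb: count leading zeros, then read that many info bits."""
    leading_zeros = 0
    while reader.read(1) == 0:
        leading_zeros += 1
    return (1 << leading_zeros) - 1 + reader.read(leading_zeros)


def decode_se(reader: BitReader) -> int:
    """Signed Exp-Golomb: code numbers 1, 2, 3, 4 map to +1, -1, +2, -2."""
    code_num = decode_ue(reader)
    magnitude = (code_num + 1) // 2
    return magnitude if code_num % 2 == 1 else -magnitude


def decode_te(reader: BitReader, max_value: int) -> int:
    """Truncated Exp-Golomb: a single inverted bit when the range is 0..1."""
    if max_value > 1:
        return decode_ue(reader)
    return 1 - reader.read(1)
```

The three functions share the leading-zero count and the info-bit read in `decode_ue`, which is the kind of commonality a merged decoder can exploit.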

3.2.2 Shared Syntax Element Registers

When a syntax element is parsed from the bitstream, the corresponding syntax element register must be updated with the newly parsed value. That is, control signals control the update of the syntax element registers. We propose shared syntax element registers to reduce the storage size by sharing registers between similar syntax elements in MPEG-2 and AVC/H.264 modes; for example, macroblock_type in MPEG-2 and mb_type in AVC/H.264 both identify the type of MB. Applying the same strategy to other similar syntax elements increases flexibility and reduces the storage requirement. The shared syntax element pairs in MPEG-2 and AVC/H.264 are listed in Table 4.
Table 4

Shared syntax element pairs for AVC/H.264 and MPEG-2.

| AVC/H.264 | MPEG-2 |
| --- | --- |
| mb_skip_run | macroblock_address_increment |
| mb_type | macroblock_type |
| sub_mb_type | dmvector |
| mb_qp_delta | quantiser_scale_code |
| coded_block_pattern | coded_block_pattern_420 |
| transform_size_8x8_flag | dct_type |
3.3 Microsequencer with Microprogrammed Controller

The microsequencer sequences the microinstructions serially according to the microcode; however, to design an efficient microsequencer, we must understand the causality between microinstructions. If no branch exists between the current microinstruction and the next one, the microsequencer outputs microinstructions sequentially; in contrast, if a branch occurs between two consecutive microinstructions, the microsequencer performs a dispatch to derive the next microinstruction. This behavior is equivalent to generating the next state in an FSM. The microsequencers for AVC/H.264 and MPEG-2 are similar except for the dispatching procedure.

By exploring the relationship between the microinstructions of each standard, we find that the variables associated with the branch conditions of a specific microinstruction are the inputs of its dispatch. (The relationship among microinstructions can be found in the appendix.) Because the arrangement of microinstructions determines the output of the dispatch multiplexer, we investigate several arrangements of microinstructions in both standards. We regard an arrangement that contains more serial branch addresses as better, since each branch address can then be formulated as the base address plus an offset. Based on the relationship among the microinstructions in AVC/H.264, we adopt the arrangement with the most serial branch addresses, which is shown in Table 5. With the same approach, we determine the arrangement of microinstructions for MPEG-2, also displayed in Table 5. As a consequence, we design a reconfigurable microsequencer with microprogrammed controller that dynamically reloads microcodes for both standards, as shown in Fig. 6, where AVC_flag determines the mode of the reconfigurable parser with microprogrammed controller.
Table 5

Arrangement of microinstructions in AVC/H.264 and MPEG-2.

| Address | AVC/H.264 microinstruction | AVC/H.264 branch addresses (offset) | MPEG-2 microinstruction | MPEG-2 branch addresses (offset) |
| --- | --- | --- | --- | --- |
| 0 | MB_initialization | 0 (+0) / 1 (+1) / 2 (+2) / 3 (+3) | MB_initialization | 0 (+0) / 1 (+1) / 2 (+2) |
| 1 | mb_skip_run | 2 (+1) / 3 (+2) / 4 (+3) | macroblock_escape | 2 (+1) / 3 (+2) |
| 2 | mb_field_decoding_flag | 3 (+1) | macroblock_address_increment | 2 (+0) / 3 (+1) / 4 (+2) |
| 3 | mb_type | 6 (+3) / 7 (+4) / 8 (+5) / 9 (+6) / 10 (+7) / 11 (+8) / 12 (+9) / 14 (+11) | skip_MB | 2 (−1) / 3 (+0) / 4 (+1) |
| 4 | skip_MB | 0 (−4) / 2 (−2) / 3 (−1) / 4 (+0) | macroblock_type | 4 (+0) / 5 (+1) / 6 (+2) / 7 (+3) / 8 (+4) / 9 (+5) |
| 5 | intra_8x8_pred_mode | 5 (+0) / 7 (+2) | frame_motion_type/field_motion_type | 6 (+1) / 7 (+2) / 8 (+3) / 9 (+4) |
| 6 | intra_4x4_pred_mode | 6 (+0) / 7 (+1) | dct_type | 7 (+1) / 8 (+2) / 9 (+3) |
| 7 | intra_chroma_pred_mode | 11 (+4) / 13 (+6) | quantiser_scale_code | 8 (+1) / 9 (+2) |
| 8 | sub_mb_type | 8 (+0) / 9 (+1) / 10 (+2) / 11 (+3) | motion_vertical_field_select | 9 (+1) |
| 9 | ref_idx | 9 (+0) / 10 (+1) / 11 (+2) | motion_code | 8 (−1) / 9 (+0) / 10 (+1) / 11 (+2) / 12 (+3) / 13 (+4) / 14 (+5) |
| 10 | mvd | 10 (+0) / 11 (+1) | coded_block_pattern | 10 (+0) / 14 (+4) |
| 11 | coded_block_pattern | 12 (+1) / 13 (+2) / 0 (+5) | motion_residual | 8 (−3) / 9 (−2) / 10 (−1) / 12 (+1) / 13 (+2) / 14 (+3) |
| 12 | transform_size_8x8_flag | 5 (−7) / 6 (−6) / 13 (+1) | dmvector | 9 (−3) / 10 (−2) / 12 (+0) / 14 (+2) |
| 13 | mb_qp_delta | 14 (+1) | marker_bit | 14 (+1) |
| 14 | Transform Coefficient Decode | 14 (+0) / 0 (+2) | Transform Coefficient Decode | 14 (+0) / 0 (+2) |

Figure 6

Microsequencer with microprogrammed controller.
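
The "base address plus offset" formulation can be checked against the AVC/H.264 column of Table 5 with a short Python sketch. The offsets below are copied from Table 5; the 4-bit wraparound (so that, e.g., 11 + 5 and 14 + 2 both land on address 0) is an inference from the table entries rather than something the text states, and the `select` index that chooses among the offsets is a hypothetical stand-in for the dispatch conditions.

```python
ADDRESS_BITS = 4  # assumption: addresses 0..14 held in a 4-bit register that wraps

# Branch offsets per address for AVC/H.264, copied from Table 5.
AVC_BRANCH_OFFSETS = {
    0: [0, 1, 2, 3],
    1: [1, 2, 3],
    2: [1],
    3: [3, 4, 5, 6, 7, 8, 9, 11],
    4: [-4, -2, -1, 0],
    5: [0, 2],
    6: [0, 1],
    7: [4, 6],
    8: [0, 1, 2, 3],
    9: [0, 1, 2],
    10: [0, 1],
    11: [1, 2, 5],
    12: [-7, -6, 1],
    13: [1],
    14: [0, 2],
}


def next_address(address: int, select: int, offsets=AVC_BRANCH_OFFSETS) -> int:
    """Dispatch: add the selected offset to the current (base) address."""
    offset = offsets[address][select]
    return (address + offset) % (1 << ADDRESS_BITS)


# Branch targets reachable from mb_type (address 3) and coded_block_pattern (11).
mb_type_targets = [next_address(3, s) for s in range(len(AVC_BRANCH_OFFSETS[3]))]
cbp_targets = [next_address(11, s) for s in range(len(AVC_BRANCH_OFFSETS[11]))]
```

A run of consecutive offsets, as at address 3, means the dispatch multiplexer only has to supply a small increment, which is why arrangements with more serial branch addresses are preferred.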

3.4 Microcode for Microprogrammed Controller

According to the control-dataflow, the control signals in the datapath define the microinstruction, e.g., ExpGolomb_sel0 and ExpGolomb_sel1 in Fig. 5f. Treating each microinstruction as a state, we place each microinstruction at a specific address of the microcode memory, as listed in Table 5 for both standards. After all microinstructions are specified, we design the microcodes to accommodate the control signals. By combining control signals, we specify each field of the microinstruction for AVC/H.264 and MPEG-2 based on the determined arrangement of microinstructions.

We first design the microinstruction as horizontal microcode; the bit allocation of the control signals and the corresponding microinstruction fields are shown in Table 6. Horizontal microcode assigns a separate bit to each control signal. The control signals colored in light gray denote signals shared by both standards. The more control signals are shared, the higher the reusability of the reconfigurable architecture. For instance, the codeword length control signal, bits[27:25] in Table 6, is necessary whenever there is more than one kind of parsing, and hence both standards share the same field. As another example, consider the Entropy-SysCtrl FIFO control, Entropy_SysCtrl_FIFO_CEN: unless the decoding procedure requires no pre-processed blocks or MBs, a memory space that stores the parsed syntax elements is necessary. For the syntax element register control, bits[13:2], however, only half of the signals are shared. Nevertheless, if we adopt the vertical microcode discussed below, the update flags that control the syntax element registers are also shared, which yields a higher proportion of shared microinstruction fields.
Table 6

Bit allocation in horizontal microcode of control signals in both standards.

| Bit allocation | AVC/H.264 | MPEG-2 |
| --- | --- | --- |
| [27:25] | CL_sel | CL_sel |
| [24] | Entropy_SysCtrl_FIFO_CEN | Entropy_SysCtrl_FIFO_CEN |
| [23:21] | AddrOffset_Entropy_SysCtrl_FIFO_update_sel | AddrOffset_Entropy_SysCtrl_FIFO_update_sel |
| [20:18] | InData_Entropy_SysCtrl_FIFO_sel | InData_Entropy_SysCtrl_FIFO_sel |
| [17:14] | Microprogrammed_Contorl_Signal | Microprogrammed_Contorl_Signal |
| [13] | mb_skip_update_flag | MAI_update_flag |
| [12] | mb_type_update_flag | macroblock_type_update_flag |
| [11] | cbp_update_flag | cbp_update_flag |
| [10] | mb_field_update_flag | motion_code_update_flag |
| [9] | sub_mb_type_update_flag | dmvector_update_flag |
| [8] | CAVLC_decoder_start | VLC_decoder_TC_start |
| [7] | MB_initial_flag | MB_initial_flag |
| [6] | intra_NxN_update_flag | motion_type_update_flag |
| [5] | ref_idx_update_flag | motion_residual_update_flag |
| [4] | mvd_intra_chroma_update_flag | motion_vertical_update_flag |
| [3] | t8x8_update_flag | dct_type_update_flag |
| [2] | qp_delta_update_flag | q_scale_update_flag |
| [1] | ExpGolomb_sel0 | VLC_decoder_start |
| [0] | ExpGolomb_sel1 | x |

Since each microinstruction updates only one specific register in the horizontal microcode arrangement, we adopt vertical microcode, which groups control signals of the same kind into an encoded field to reduce the number of bits; e.g., bits[13:2] in the horizontal microcode are replaced by bits[5:2] in the vertical microcode, Encoded_update_flags. Table 7 displays the bit allocation of control signals in the vertical microcode. With vertical microcode, the number of bits per microinstruction is reduced from 28 to 20, and the resulting microinstructions are listed in Table 8. This bit reduction preserves the flexibility and extensibility of the microprogrammed controller. With the defined microcodes, we realize the reconfigurable parser by loading the corresponding microcode into the memory in Fig. 6; hence, the proposed parser can dynamically configure its architectural functionality according to the operating mode.
Table 7

Bit allocation in vertical microcode of control signals in both standards.

| Bit allocation | AVC/H.264 | MPEG-2 |
| --- | --- | --- |
| [19:17] | CL_sel | CL_sel |
| [16] | Entropy_SysCtrl_FIFO_CEN | Entropy_SysCtrl_FIFO_CEN |
| [15:13] | AddrOffset_Entropy_SysCtrl_FIFO_update_sel | AddrOffset_Entropy_SysCtrl_FIFO_update_sel |
| [12:10] | InData_Entropy_SysCtrl_FIFO_sel | InData_Entropy_SysCtrl_FIFO_sel |
| [9:6] | Microprogrammed_Contorl_Signal | Microprogrammed_Contorl_Signal |
| [5:2] | Encoded_update_flags | Encoded_update_flags |
| [1] | ExpGolomb_sel0 | VLC_decoder_start |
| [0] | ExpGolomb_sel1 | x |
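
The step from horizontal to vertical microcode replaces twelve individual bits by one 4-bit encoded field. The Python sketch below shows one way such an encoding could work; the actual code assignment used in the paper is not given, so the convention here (0 means no flag, codes 1 to 12 select one of the twelve bits) is purely an assumption.

```python
HORIZONTAL_WIDTH = 28   # bits per microinstruction in Table 6
NUM_FLAGS = 12          # horizontal bits [13:2]
ENCODED_WIDTH = 4       # vertical bits [5:2]


def encode_update_flags(one_hot: int) -> int:
    """Map at most one asserted flag (12-bit one-hot) to a 4-bit code."""
    if one_hot == 0:
        return 0
    if one_hot & (one_hot - 1) or one_hot >= (1 << NUM_FLAGS):
        raise ValueError("expected at most one asserted flag among 12")
    return one_hot.bit_length()          # code 1..12, hypothetical assignment


def decode_update_flags(code: int) -> int:
    """Inverse mapping, as a decoder in front of the registers might do."""
    if code == 0:
        return 0
    if not 1 <= code <= NUM_FLAGS:
        raise ValueError("unused code")
    return 1 << (code - 1)


vertical_width = HORIZONTAL_WIDTH - NUM_FLAGS + ENCODED_WIDTH
round_trip_ok = all(
    decode_update_flags(encode_update_flags(1 << i)) == (1 << i)
    for i in range(NUM_FLAGS)
)
```

The encoding works because only one flag is asserted per microinstruction; the price is a small decoder between the microcode memory and the register update enables.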

Table 8

Vertical microcode for AVC/H.264 and MPEG-2.

| Address | Microcode (AVC/H.264) | Microcode (MPEG-2) |
|---|---|---|
| 0 | 11000100010001011000 | 11000100010001011000 |
| 1 | 10111110000010000001 | 10011111110010000000 |
| 2 | 00111110000000001100 | 11111111110011000010 |
| 3 | 10100010000011000101 | 11011111110100111100 |
| 4 | 11011110000100111100 | 11100010000101000110 |
| 5 | 00000011010101011100 | 01000000000110011100 |
| 6 | 00000011010110011100 | 00111111110111101000 |
| 7 | 10100011000111100101 | 01111111111000101100 |
| 8 | 10100010101000010001 | 00111111110000100000 |
| 9 | 10100110111001100001 | 11100011011001001110 |
| 10 | 10101001001010100110 | 11111111111010001010 |
| 11 | 10111110011011001001 | 10100001011011100100 |
| 12 | 00100100011100101000 | 11111111111100010010 |
| 13 | 10100100010000101110 | 00111111110000111100 |
| 14 | 11111110011101010100 | 11111111111101010110 |
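
The idea of reconfiguring the parser by reloading the microcode memory can be sketched in software as follows. The microcode words are copied from Table 8; the class `MicrocodeMemory`, its methods, and the mode names used as dictionary keys are hypothetical and only model the behavior described in the text, not the hardware in Fig. 6.

```python
from typing import Dict, List, Sequence

WORD_BITS = 20

# Microcode images from Table 8, addresses 0 to 14.
MICROCODE_IMAGES: Dict[str, Sequence[str]] = {
    "AVC/H.264": (
        "11000100010001011000", "10111110000010000001",
        "00111110000000001100", "10100010000011000101",
        "11011110000100111100", "00000011010101011100",
        "00000011010110011100", "10100011000111100101",
        "10100010101000010001", "10100110111001100001",
        "10101001001010100110", "10111110011011001001",
        "00100100011100101000", "10100100010000101110",
        "11111110011101010100",
    ),
    "MPEG-2": (
        "11000100010001011000", "10011111110010000000",
        "11111111110011000010", "11011111110100111100",
        "11100010000101000110", "01000000000110011100",
        "00111111110111101000", "01111111111000101100",
        "00111111110000100000", "11100011011001001110",
        "11111111111010001010", "10100001011011100100",
        "11111111111100010010", "00111111110000111100",
        "11111111111101010110",
    ),
}


class MicrocodeMemory:
    """Software model of a reloadable microcode memory (illustrative only)."""

    def __init__(self) -> None:
        self._words: List[int] = []

    def load(self, mode: str) -> None:
        """Replace the memory contents with the image for the given mode."""
        image = MICROCODE_IMAGES[mode]
        if any(len(w) != WORD_BITS or set(w) - {"0", "1"} for w in image):
            raise ValueError("every microinstruction must be 20 bits")
        self._words = [int(w, 2) for w in image]

    def fetch(self, address: int) -> int:
        """Return the microinstruction stored at the given address."""
        return self._words[address]


memory = MicrocodeMemory()
memory.load("AVC/H.264")
avc_word1 = memory.fetch(1)
memory.load("MPEG-2")  # switching standards only reloads the memory
mpeg2_word1 = memory.fetch(1)
```

In this model, the fetch logic is unchanged across modes; only the stored image differs, which mirrors the role of the shared microsequencer and reloadable memory described above.
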

4 Experimental Results

Parsers for AVC/H.264 and MPEG-2 are seldom discussed in the literature [20, 21], and where a parser is discussed, its design goal differs from ours [22]. Most works discuss the entropy decoder without addressing the parser [23, 24]. As mentioned in the Introduction, the next-state generator of an FSM can also be a microsequencer. Hence, we implemented two FSM-based controllers for comparison with the proposed microprogrammed controller in the reconfigurable parser. The output control logic of an FSM-based controller is hardcoded and implemented in combinational logic separately for each standard; the proposed microprogrammed controller, on the other hand, can be applied to different standards by changing the contents of the microcode memory. Figures 7 and 8 show the two reference implementations of the FSM-based controller. Type I uses the microsequencer in Fig. 6 for both AVC/H.264 and MPEG-2, but its output control logic is realized separately for each standard; Type II employs two individual microsequencers and output control logic blocks, each optimized for one standard. Table 9 lists the architectural costs in terms of critical-path timing and gate count. For the microprogrammed controller, the critical-path timing and gate count of the microcode memory, which serves as its output control logic, are extracted from the Artisan memory compiler for TSMC 0.18 μm CMOS technology. For the FSM-based controllers, the gate counts and critical-path timing are taken from synthesis results in the same TSMC 0.18 μm CMOS technology. The operating frequency of all three realizations is set to 108 MHz. Judging by the critical-path timing, the proposed reconfigurable parser with microprogrammed controller can sustain a higher throughput rate: its critical path is up to 1.7 ns shorter than that of the customized FSM-based controller (1.49 ns versus 3.19 ns for MPEG-2 in Table 9).
In addition, one important advantage is the predictable timing requirement, which offers designers a blueprint of the timing schedule in the early design phase.
Figure 7

FSM-based controller – type I.

Figure 8

FSM-based controller – type II.

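Table 9

Comparison of microprogrammed control with FSM-based control.

| Microarchitecture | Output control logic, critical path (ns): AVC/H.264 | Output control logic, critical path (ns): MPEG-2 | Output control logic, gate count: AVC/H.264 | Output control logic, gate count: MPEG-2 | Output control logic, gate count: Total | Microsequencer, gate count: AVC/H.264 | Microsequencer, gate count: MPEG-2 | Microsequencer, gate count: Total |
|---|---|---|---|---|---|---|---|---|
| FSM-based controller – type I | 2.61 | 3.19 | 842 | 819 | 1661 | | | 3078 |
| FSM-based controller – type II | | | | | | 1537 | 1774 | 3311 |
| Proposed microprogrammed controller | 1.49 | 1.49 | | | 1324 | | | 2992 |

A unit shared by both standards (the type I microsequencer, and the microcode memory and microsequencer of the proposed controller) has a single gate count, listed under Total; the single critical path of the proposed controller, 1.49 ns, applies to both standards.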

The gate count of the microprogrammed controller is also lower than that of the FSM-based controllers: the proposed architecture saves 20.29 % of the output control logic ((1661 - 1324) / 1661 = 20.29 %). For the whole controller, including output control logic and microsequencer, the proposed microprogrammed controller reduces the gate count by 8.93 % relative to the type I FSM-based controller (((1661 + 3078) - (1324 + 2992)) / (1661 + 3078) = 8.93 %). That is, considering throughput and gate count together, the proposed microprogrammed controller provides higher flexibility without overhead in either metric.
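
The percentages above follow directly from the entries of Table 9, as the short Python check below shows. The variable and function names are chosen for this example only; the numbers are those reported in Table 9 and in the text.

```python
# Gate counts and critical paths as reported in Table 9.
FSM_TYPE1_OUTPUT_GATES = 842 + 819  # AVC/H.264 + MPEG-2 output control logic
FSM_TYPE1_SEQUENCER_GATES = 3078
PROPOSED_OUTPUT_GATES = 1324  # microcode memory
PROPOSED_SEQUENCER_GATES = 2992

FSM_WORST_CRITICAL_PATH_NS = 3.19  # type I, MPEG-2
PROPOSED_CRITICAL_PATH_NS = 1.49


def saving_percent(baseline: float, proposed: float) -> float:
    """Relative reduction of `proposed` with respect to `baseline`, in %."""
    return 100.0 * (baseline - proposed) / baseline


# Saving in output control logic alone.
output_saving = saving_percent(FSM_TYPE1_OUTPUT_GATES, PROPOSED_OUTPUT_GATES)

# Saving for the whole controller (output control logic + microsequencer).
total_saving = saving_percent(
    FSM_TYPE1_OUTPUT_GATES + FSM_TYPE1_SEQUENCER_GATES,
    PROPOSED_OUTPUT_GATES + PROPOSED_SEQUENCER_GATES,
)

# Reduction of the critical path against the slowest FSM-based entry.
critical_path_reduction_ns = (
    FSM_WORST_CRITICAL_PATH_NS - PROPOSED_CRITICAL_PATH_NS
)

print(f"{output_saving:.2f} % {total_saving:.2f} % "
      f"{critical_path_reduction_ns:.2f} ns")
```
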

5 Conclusion

With the benefits of a reconfigurable architecture, we achieve better performance at lower architectural cost than individual implementations for multiple purposes. In our case study, we develop a reconfigurable parser that dynamically configures its functionality to support multiple purposes. Because of the feedback control inherent in the parser, we apply control-dataflow to co-explore the control unit and the datapath. To define the microinstructions, we specify the control signals in the datapath so that most control signals are shared among the different purposes. With the shared control signals, we remove the overhead of switching the datapath between distinct purposes; furthermore, we adopt vertical microcode to reduce the number of bits per microinstruction by grouping the control signals of the same field. Vertical microcode reduces the storage size for microinstructions and retains more reserved bits that can be used to extend the control signals when integrating more purposes. The experimental results show that the proposed architecture, while more flexible, outperforms the FSM-based controllers in both critical-path timing and gate count when multiple standards are considered. Given the high proportion of control lines shared across purposes, we expect the advantage of the reconfigurable parser with microprogrammed controller over the FSM-based controller to grow as more applications are supported.

Notes

Compliance with ethical standards

We declare that the research work in this manuscript did not involve any human participants and/or animals, and that no informed consent forms were required to conduct the work. Furthermore, there is no conflict of interest regarding the disclosure of this manuscript.

References

  1. Lee, G. G., Wang, M.-J., Chen, B.-H., Chen, J., Jao, P.-K., Hsiao, C.-J., et al. (2011). Reconfigurable Architecture for Deinterlacer based on Algorithm/Architecture Co-Design. Journal of Signal Processing Systems, 63(2), 181–189. doi:10.1007/s11265-009-0388-6.
  2. Lee, G.-G., Yang, W.-C., Wu, M.-S., & Lin, H.-Y. (2010). Reconfigurable architecture design of motion compensation for multi-standard video coding. In Circuits and Systems (ISCAS), Proceedings of 2010 IEEE International Symposium on, May 30 2010-June 2 2010 (pp. 2003–2006). doi:10.1109/ISCAS.2010.5537127.
  3. Hunag, T.-Y., Lin, H.-Y., Chen, C.-F., & Lee, G. G. (2011). Reconfigurable inverse transform architecture for multiple purpose video coding. In Circuits and Systems (ISCAS), 2011 IEEE International Symposium on, 15–18 May 2011 (pp. 1223–1226). doi:10.1109/ISCAS.2011.5937790.
  4. Patterson, D. A., & Hennessy, J. L. (2013). Computer organization and design: the hardware/software interface. Newnes.
  5. ISO/IEC 13818-2 (1996). Information Technology - Coding of moving pictures and associated audio.
  6. ITU-T Recommendation H.264 (2005). 'Advanced video coding for generic audiovisual services,' Draft.
  7. Sullivan, G. J., Ohm, J., Han, W.-J., & Wiegand, T. (2012). Overview of the High Efficiency Video Coding (HEVC) standard. Circuits and Systems for Video Technology, IEEE Transactions on, 22(12), 1649–1668. doi:10.1109/TCSVT.2012.2221191.
  8. ISO/IEC 23001-4:2014 (2014). Information technology - MPEG systems technologies - Part 4: Codec configuration representation.
  9. ISO/IEC 23002-4:2014 (2014). Information technology - MPEG video technologies - Part 4: Video tool library.
  10. Zarrineh, K., & Upadhyaya, S. J. (1999). On programmable memory built-in self test architectures. In Design, Automation and Test in Europe Conference and Exhibition 1999. Proceedings, 1999 (pp. 708–713). doi:10.1109/DATE.1999.761207.
  11. Barkalov, A., Titarenko, L., & Bieganowski, J. (2010). Microprogram control unit with code sharing and extended microinstruction format. In Design & Test Symposium (EWDTS), 2010 East-West, 17–20 Sept. 2010 (pp. 73–76). doi:10.1109/EWDTS.2010.5742041.
  12. Kannangara, C. S., Philp, J. M., Richardson, I. E., Bystrom, M., & de Frutos Lopez, M. (2010). A syntax for defining, communicating, and implementing video decoder function and structure. Circuits and Systems for Video Technology, IEEE Transactions on, 20(9), 1176–1186. doi:10.1109/TCSVT.2010.2051274.
  13. Lucarz, C., Piat, J., & Mattavelli, M. (2011). Automatic synthesis of parsers and validation of bitstreams within the MPEG reconfigurable video coding framework. Journal of Signal Processing Systems, 63(2), 215–225. doi:10.1007/s11265-009-0395-7.
  14. Kim, H., Kim, S., Lee, S., & Jang, E. S. (2013). Parser description-based bitstream parser generation for MPEG RMC framework. Signal Processing: Image Communication, 28(10), 1255–1277. doi:10.1016/j.image.2013.08.011.
  15. 'Text of ISO/IEC 23001-4:2014/PDAM1 Parser instantiation from BSD,' ISO/IEC JTC1/SC29/WG11 N14963, Strasbourg, FR, Oct. 2014.
  16. Kim, H., Dong, T., Lee, S., Choi, J., & Jang, E. S. (2014). 'RVC CE2: Technical Updates and Automated RVC-BSDL Translator for Generic Parser FU (GPFU)', ISO/IEC JTC1/SC29/WG11 M32447, San Jose, CA.
  17. Lee, G. G., Chen, C.-F., Xu, S.-M., & Hsiao, C.-J. 'High-Throughput Reconfigurable Variable Length Coding Decoder for MPEG-2 and AVC/H.264,' accepted for publication in Journal of Signal Processing Systems.
  18. Lee, G. G., Xu, S.-M., Chen, C.-F., & Hsiao, C.-J. (2012). Architecture of high-throughput context adaptive variable length coding decoder in AVC/H.264. In Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific, 3–6 Dec. 2012 (pp. 1–5).
  19. Lee, G. G., Chen, Y.-K., Mattavelli, M., & Jang, E. S. (2009). Algorithm/Architecture co-exploration of visual computing on emergent platforms: overview and future prospects. Circuits and Systems for Video Technology, IEEE Transactions on, 19(11), 1576–1587. doi:10.1109/TCSVT.2009.2031376.
  20. Wang, S.-H., Peng, W.-H., He, Y., Lin, G.-Y., Lin, C.-Y., Chang, S.-C., et al. (2005). A software-hardware co-implementation of MPEG-4 Advanced Video Coding (AVC) decoder with block level pipelining. Journal of VLSI signal processing systems for signal, image and video technology, 41(1), 93–110. doi:10.1007/s11265-005-6253-3.
  21. Dajiang, Z., Jinjia, Z., Xun, H., Jiayi, Z., Ji, K., Peilin, L., et al. (2011). A 530 Mpixels/s 4096x2160@60fps H.264/AVC High Profile Video Decoder Chip. Solid-State Circuits, IEEE Journal of, 46(4), 777–788. doi:10.1109/JSSC.2011.2109550.
  22. Ke, X., Chiu-Sing, C., Cheong-Fat, C., & Kong-Pong, P. (2006). Power-efficient VLSI implementation of bitstream parsing in H.264/AVC decoder. In Circuits and Systems, 2006. ISCAS 2006. Proceedings. 2006 IEEE International Symposium on, 21–24 May 2006 (4 pp.). doi:10.1109/ISCAS.2006.1693839.
  23. Chang, Y.-T., & Chung, W.-H. (2009). A high-performance entropy decoding system for H.264/AVC. In Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on, June 28 2009-July 3 2009 (pp. 1090–1093). doi:10.1109/ICME.2009.5202688.
  24. Jeonhak, M., & Seongsoo, L. (2008). Design of H.264/AVC entropy decoder without internal ROM/RAM memories. In Communications, Control and Signal Processing, 2008. ISCCSP 2008. 3rd International Symposium on, 12–14 March 2008 (pp. 1464–1467). doi:10.1109/ISCCSP.2008.4537458.

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Gwo Giun (Chris) Lee (1)
  • Chun-Fu (Richard) Chen (1)
  • Ching-Jui Hsiao (1)
  1. Department of Electrical Engineering, National Cheng Kung University, Tainan, Republic of China
